Stochastic nested variance reduction for nonconvex optimization

Date

2020-05-01

Abstract

We study nonconvex optimization problems, where the objective function is either an average of $n$ nonconvex functions or the expectation of some stochastic function. We propose a new stochastic gradient descent algorithm based on nested variance reduction, namely, Stochastic Nested Variance-Reduced Gradient descent (SNVRG). Compared with the conventional stochastic variance reduced gradient (SVRG) algorithm, which uses two reference points to construct a semi-stochastic gradient with diminishing variance in each iteration, our algorithm uses $K + 1$ nested reference points to build a semi-stochastic gradient to further reduce its variance in each iteration. For smooth nonconvex functions, SNVRG converges to an $\epsilon$-approximate first-order stationary point within $\tilde{O}(n \wedge \epsilon^{-2} + \epsilon^{-3} \wedge n^{1/2}\epsilon^{-2})$ stochastic gradient evaluations. This improves the best known gradient complexity of SVRG, $O(n + n^{2/3}\epsilon^{-2})$, and that of SCSG, $O(n \wedge \epsilon^{-2} + \epsilon^{-10/3} \wedge n^{2/3}\epsilon^{-2})$. For gradient dominated functions, SNVRG also achieves better gradient complexity than the state-of-the-art algorithms. Based on SNVRG, we further propose two algorithms that can find local minima faster than state-of-the-art algorithms in both finite-sum and general stochastic (online) nonconvex optimization. In particular, for finite-sum optimization problems, the proposed SNVRG + Neon2$^{\text{finite}}$ algorithm achieves $\tilde{O}(n^{1/2}\epsilon^{-2} + n\epsilon_H^{-3} + n^{3/4}\epsilon_H^{-7/2})$ gradient complexity to converge to an $(\epsilon, \epsilon_H)$-second-order stationary point, which outperforms SVRG + Neon2$^{\text{finite}}$ (Allen-Zhu and Li, 2018), the best existing algorithm, in a wide regime. For general stochastic optimization problems, the proposed SNVRG + Neon2$^{\text{online}}$ achieves $\tilde{O}(\epsilon^{-3} + \epsilon_H^{-5} + \epsilon^{-2}\epsilon_H^{-3})$ gradient complexity, which is better than both SVRG + Neon2$^{\text{online}}$ (Allen-Zhu and Li, 2018) and Natasha2 (Allen-Zhu, 2018a) in certain regimes. Thorough experimental results on different nonconvex optimization problems back up our theory.
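To get a feel for the first-order rates quoted in the abstract, the following sketch evaluates the three gradient-complexity expressions numerically. It assumes that ∧ denotes the minimum of its two operands and binds tighter than +, and it ignores the constants and logarithmic factors hidden by the O and Õ notation, so the numbers indicate scaling only and are not actual evaluation counts. The function names and the sample values of `n` and `eps` are illustrative choices, not taken from the paper.

```python
# Rough comparison of the first-order gradient-complexity expressions
# stated in the abstract. Assumptions: "a ∧ b" means min(a, b); constants
# and log factors are dropped, so only the scaling is meaningful.

def snvrg_bound(n: float, eps: float) -> float:
    """n ∧ eps^-2 + eps^-3 ∧ n^(1/2) eps^-2 (SNVRG, up to log factors)."""
    return min(n, eps ** -2) + min(eps ** -3, n ** 0.5 * eps ** -2)


def svrg_bound(n: float, eps: float) -> float:
    """n + n^(2/3) eps^-2 (SVRG)."""
    return n + n ** (2 / 3) * eps ** -2


def scsg_bound(n: float, eps: float) -> float:
    """n ∧ eps^-2 + eps^(-10/3) ∧ n^(2/3) eps^-2 (SCSG)."""
    return min(n, eps ** -2) + min(eps ** (-10 / 3), n ** (2 / 3) * eps ** -2)


if __name__ == "__main__":
    # Hypothetical problem sizes and target accuracies, for illustration only.
    for n, eps in [(10 ** 4, 1e-2), (10 ** 6, 1e-2), (10 ** 6, 1e-3)]:
        print(
            f"n={n:>8} eps={eps:g}  "
            f"SNVRG~{snvrg_bound(n, eps):.3g}  "
            f"SCSG~{scsg_bound(n, eps):.3g}  "
            f"SVRG~{svrg_bound(n, eps):.3g}"
        )
```

For example, with `n = 10**6` and `eps = 0.01` the SNVRG expression is dominated by the `eps**-3` term, while the SVRG expression is dominated by `n**(2/3) * eps**-2`, which is two orders of magnitude larger under these assumptions.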

Scholars@Duke

Pan Xu

Assistant Professor of Biostatistics & Bioinformatics

My research is centered around Machine Learning, with broad interests in the areas of Artificial Intelligence, Data Science, Optimization, Reinforcement Learning, High Dimensional Statistics, and their applications to real-world problems including Bioinformatics and Healthcare. My research goal is to develop computationally- and data-efficient machine learning algorithms with both strong empirical performance and theoretical guarantees.


Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.