SVRG Escapes Saddle Points
Abstract
In this paper, we consider a fundamental problem of non-convex optimization that has
wide applications and implications in machine learning. Previous works have shown
that stochastic gradient descent with the variance reduction technique (SVRG) converges
to a first-order stationary point in O(n^(2/3)/ε^2) iterations in non-convex settings.
However, many problems in non-convex optimization require an efficient way to find
second-order stationary points, which are a subset of first-order stationary points,
and therefore algorithms that converge only to a first-order stationary point are not
satisfactory.
This paper shows how SVRG converges to a second-order stationary point in the non-convex
setting when augmented with a uniformly random perturbation. To find an ε-second-order
stationary point, it takes O( n^(3/4)polylog(d/(nε))/ε^2 ) iterations. Compared with the
convergence rate of SVRG to a first-order stationary point, the loss in convergence
rate depends only polylogarithmically on the dimension d and involves a small polynomial
factor in n, O(n^(1/12)) (the ratio n^(3/4)/n^(2/3)). This is the best known result for
finding second-order stationary points.
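The perturbed-SVRG scheme described above can be sketched as follows. This is a minimal illustrative sketch only: the perturbation trigger, step size, epoch length, and all constants are assumptions for demonstration and are not the thesis's exact algorithm or parameters.

```python
import numpy as np

def perturbed_svrg(component_grads, x0, step=0.1, epochs=20, m=50,
                   grad_tol=1e-3, radius=1e-2, seed=0):
    """Sketch of SVRG with a uniform random perturbation near saddle points.

    component_grads: list of n functions, each returning the gradient of one
    component f_i; the objective is f(x) = (1/n) * sum_i f_i(x).
    All hyperparameters here are illustrative, not tuned.
    """
    rng = np.random.default_rng(seed)
    n = len(component_grads)
    x = np.asarray(x0, dtype=float)
    for _ in range(epochs):
        # SVRG snapshot: full gradient at the current anchor point.
        snapshot = x.copy()
        full_grad = sum(g(snapshot) for g in component_grads) / n
        # When the gradient is small we may be near a saddle point,
        # so add noise drawn uniformly from a small ball.
        if np.linalg.norm(full_grad) <= grad_tol:
            direction = rng.standard_normal(x.shape)
            direction /= np.linalg.norm(direction)
            x = x + radius * rng.uniform() * direction
        # Inner loop: variance-reduced stochastic gradient steps.
        for _ in range(m):
            i = rng.integers(n)
            v = component_grads[i](x) - component_grads[i](snapshot) + full_grad
            x = x - step * v
    return x
```

For instance, on the (convex, for simplicity) objective f(x) = (1/n) Σ (x − a_i)^2, the sketch converges to the mean of the a_i; the perturbation branch only matters once the gradient is already small.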
We also give some intuitions and proof sketches for a new framework of analysis based on
the stable manifold theorem. The analysis from this new framework may help eliminate
the n^(1/12) loss from our current analysis.
Our results can be directly applied to problems such as the training of neural networks
and tensor decomposition. The stable-manifold intuition may be of independent
interest to the non-convex optimization community.
Type: Honors thesis
Department: Computer Science
Permalink: https://hdl.handle.net/10161/16663
Citation: Wang, Weiyao (2018). SVRG Escapes Saddle Points. Honors thesis, Duke University. Retrieved from https://hdl.handle.net/10161/16663.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.