Browsing by Author "Wang, Weiyao"
Now showing 1 - 2 of 2
- Results Per Page
- Sort Options
Item Open Access SVRG Escapes Saddle Points(2018-04-25) Wang, WeiyaoIn this paper, we consider a fundamental problem of non-convex optimization that has wide applications and implications in machine learning. Previous works have shown that stochastic gradient descent with the variance reduction technique (SVRG) converges to a first-order stationary point in O(n^(2/3)/ε^2) iterations in non-convex settings. However, many problems in non-convex optimization requires an efficient way to find second- order stationary points that are a subset of first-order stationary points, and therefore algorithms that converges to a first-order stationary point are not satisfying. This paper shows how SVRG converges to a second-order stationary point in non-convex setting with a uniformly random perturbation. To find an ε-second-order stationary point, it takes O( n^(3/4)polylog(d/(nε))/ε^2 ) iterations. In comparison with the convergence rate of SVRG to a first-order stationary point, the loss in convergence rate only depends poly- logarithmically on the dimension d and involves a small polynomial of n, O(n^(1/12) ). This is the best known result in finding second-order stationary point. We also give some intuitions and proof sketches to a new framework of analysis using the stable manifold theorem. The analysis from the new framework may help to eliminate the n^(1/12) loss from our current analysis. Our results can be directly applied to problems such as the training of neural networks and tensor decomposition. The intuition of stable manifold may provide independent interest to the non-convex optimization community.Item Open Access Understanding Operator Reed-Muller Codes Through the Weyl Transform(2018-04-25) Wang, WeiyaoThis paper expands the framework on the multidimensional generalizations of binary Reed-Muller code, operator Reed-Muller codes, where the codewords are projection operators through the Weyl Transform. The Weyl Transform of these operator Reed- Muller codes maps the operators to vectors, and it is isometric. This nice property gives new proofs for some known results and produce a simpler decoding algorithm. In particular, the property provides a different framework to analyze the distance spectrum of second operator Reed-Muller codes without using the Dickson’s Theorem.