Approximate Inference for High-Dimensional Latent Variable Models
Dissertation
Mukherjee, Sayan; Tan, Zilong
2019-04-02; 2020-01-09; 2018
https://hdl.handle.net/10161/18280
Keywords: Computer science; approximate inference; latent variable models; Machine learning

Latent variable models are widely used in applications ranging from natural language processing to recommender systems. Exact inference using maximum likelihood for these models is generally NP-hard and computationally prohibitive on large and/or high-dimensional data. This has motivated the development of approximate inference methods that balance computational complexity against statistical efficiency. Understanding this computational-statistical tradeoff is important both for analyzing approximate inference approaches and for designing new ones. Toward this goal, this dissertation presents a study of new approximate inference algorithms with provable guarantees for three classes of inference tasks.

The first class is based on the method of moments. Inference in this setting typically reduces to a tensor decomposition problem that requires decomposing a $p$-by-$p$-by-$p$ estimator tensor for $p$ variables. We apply a divide-and-conquer strategy to the tensor method, instead decomposing $O\left(p/k\right)$ sub-tensors, each of size $O\left(k^3\right)$, which yields a significant reduction in computational complexity when the number of latent variables $k$ is small. Our approach can also enforce nonnegativity of the estimates for inferring nonnegative model parameters. Theoretical analysis gives sufficient conditions ensuring robustness of the divide-and-conquer method, as well as a proof of linear convergence for the nonnegative factorization.

In the second class, we further consider mixed-effects models in which the variance of the latent variables must also be inferred. We present approximate estimators that have closed-form analytical expressions. We also develop fast computational techniques based on the subsampled randomized Hadamard transform, achieving complexity sublinear in the dimension. This makes our approach useful for high-dimensional applications such as genome-wide association studies. Moreover, we provide theoretical analysis establishing provable error guarantees for the approximation.

The last class concerns more general inference in an infinite-dimensional function space specified by a Gaussian process (GP) prior. We provide a dual formulation of GPs using random functions in a reproducing kernel Hilbert space (RKHS), where the function representation is specified as latent variables. We show that the dual GP can realize an expanded class of functions and can also be well approximated by a low-dimensional sufficient dimension reduction subspace of the RKHS. We develop a fast learning algorithm for the dual GP that improves upon the state-of-the-art computational complexity of GPs.
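
To make the divide-and-conquer idea in the first class concrete, the following is a minimal sketch in Python/NumPy: it partitions the $p$ variables into blocks of size $O(k)$, extracts the corresponding $O(k^3)$ sub-tensors, and runs a plain rank-$k$ symmetric CP decomposition (alternating least squares) on each. The function names, the choice of block size, and the naive per-block assembly are illustrative assumptions, not the dissertation's actual algorithm, which additionally addresses cross-block factor alignment, nonnegativity, and robustness.

    # Illustrative divide-and-conquer CP decomposition of a symmetric p x p x p tensor.
    # The per-block decomposition mirrors the idea summarized above; the assembly of
    # block-wise factors below is a naive placeholder (columns are aligned only
    # within each block), not the dissertation's stitching procedure.
    import numpy as np

    def cp_als(T, k, iters=50, seed=0):
        """Rank-k CP decomposition of a symmetric 3-way tensor via alternating least squares."""
        rng = np.random.default_rng(seed)
        n = T.shape[0]
        A = rng.standard_normal((n, k))
        for _ in range(iters):
            # Khatri-Rao product of the (fixed) factor with itself: shape (n*n, k)
            KR = np.einsum('ir,jr->ijr', A, A).reshape(n * n, k)
            T1 = T.reshape(n, n * n)            # mode-1 unfolding of T
            # Standard symmetric-ALS heuristic: update A holding the other modes fixed
            A = T1 @ KR @ np.linalg.pinv((A.T @ A) * (A.T @ A))
        return A

    def divide_and_conquer_cp(T, k, block=None, seed=0):
        """Decompose O(p/k) sub-tensors of size O(k^3) instead of the full tensor."""
        p = T.shape[0]
        block = block or max(2 * k, k + 1)      # sub-tensor side length, O(k)
        factors = np.zeros((p, k))
        for start in range(0, p, block):
            idx = np.arange(start, min(start + block, p))
            sub = T[np.ix_(idx, idx, idx)]      # extract the k^3-sized sub-tensor
            factors[idx] = cp_als(sub, k, seed=seed)
        return factors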
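
The subsampled randomized Hadamard transform used in the second class can be sketched as follows. For clarity this version materializes the full Hadamard matrix instead of applying the fast Walsh-Hadamard transform, and the padding and scaling conventions are assumptions for illustration rather than the estimators' actual implementation.

    # A minimal SRHT sketch: compute S H D X for an (n x d) matrix X, where D is a
    # random sign diagonal, H a normalized Walsh-Hadamard matrix, and S a uniform
    # row subsampler. A practical implementation would use the fast transform.
    import numpy as np
    from scipy.linalg import hadamard

    def srht(X, sketch_size, seed=0):
        rng = np.random.default_rng(seed)
        n, d = X.shape
        m = 1 << (n - 1).bit_length()           # pad rows up to the next power of two
        Xp = np.zeros((m, d))
        Xp[:n] = X
        D = rng.choice([-1.0, 1.0], size=m)     # random sign flips
        H = hadamard(m) / np.sqrt(m)            # orthonormal Hadamard matrix
        rows = rng.choice(m, size=sketch_size, replace=False)
        # Subsample rows of H D X and rescale so norms are preserved in expectation
        return np.sqrt(m / sketch_size) * (H[rows] * D) @ Xp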
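
The dual GP construction in the third class is specific to the dissertation and is not reproduced here. As a loosely related illustration of representing a GP's latent function through a finite set of random basis functions, the sketch below uses standard random Fourier features for an RBF kernel; the feature count, lengthscale, and noise level are placeholder parameters.

    # Not the dissertation's dual GP: a standard random Fourier feature (RFF)
    # approximation of GP regression, shown only to illustrate a random-function
    # representation of the latent function.
    import numpy as np

    def rff_gp_regression(X, y, X_test, num_features=200, lengthscale=1.0, noise=0.1, seed=0):
        rng = np.random.default_rng(seed)
        d = X.shape[1]
        W = rng.standard_normal((d, num_features)) / lengthscale   # random frequencies
        b = rng.uniform(0.0, 2 * np.pi, num_features)              # random phases

        def phi(Z):
            # Random feature map whose inner products approximate the RBF kernel
            return np.sqrt(2.0 / num_features) * np.cos(Z @ W + b)

        Phi = phi(X)
        # Posterior-mean weights via ridge regression in feature space
        A = Phi.T @ Phi + noise**2 * np.eye(num_features)
        w = np.linalg.solve(A, Phi.T @ y)
        return phi(X_test) @ w      # approximate GP posterior mean at the test points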