Approximate Inference for High-Dimensional Latent Variable Models
Dissertation
Mukherjee, Sayan; Tan, Zilong
2019-04-02; 2020-01-09; 2018
https://hdl.handle.net/10161/18280
Keywords: Computer science; approximate inference; latent variable models; Machine learning

Latent variable models are widely used in applications ranging from natural language processing to recommender systems. Exact inference using maximum likelihood for these models is generally NP-hard and computationally prohibitive on large and/or high-dimensional data. This has motivated the development of approximate inference methods that balance computational complexity against statistical efficiency. Understanding this computational-statistical tradeoff is important both for analyzing approximate inference approaches and for designing new ones. Toward this goal, this dissertation presents a study of new approximate inference algorithms with provable guarantees for three classes of inference tasks.

The first class is based on the method of moments. Inference in this setting typically reduces to a tensor decomposition problem that requires decomposing a $p$-by-$p$-by-$p$ estimator tensor for $p$ variables. We apply a divide-and-conquer strategy to the tensor method, instead decomposing $O\left(p/k\right)$ sub-tensors, each of size $O\left(k^3\right)$, which yields a significant reduction in computational complexity when the number of latent variables $k$ is small. Our approach can also enforce nonnegativity of the estimates for inferring nonnegative model parameters. Theoretical analysis gives sufficient conditions ensuring robustness of the divide-and-conquer method, as well as a proof of linear convergence for the nonnegative factorization.

In the second class, we further consider mixed-effects models in which the variance of the latent variables must also be inferred. We present approximate estimators that have closed-form analytical expressions. We also develop fast computational techniques based on the subsampled randomized Hadamard transform, achieving complexity sublinear in the dimension. This makes our approach useful for high-dimensional applications such as genome-wide association studies. Moreover, we provide theoretical analysis establishing provable error guarantees for the approximation.

The last class concerns more general inference in an infinite-dimensional function space specified by a Gaussian process (GP) prior. We provide a dual formulation of GPs using random functions in a reproducing kernel Hilbert space (RKHS), where the function representation is specified as latent variables. We show that the dual GP can realize an expanded class of functions and can also be well approximated by a low-dimensional sufficient dimension reduction subspace of the RKHS. We develop a fast learning algorithm for the dual GP that improves upon the state-of-the-art computational complexity of GPs.
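
To make the divide-and-conquer idea in the first class concrete, the following is a minimal sketch in Python/NumPy: it partitions the $p$ variables into blocks of size $O(k)$, extracts the corresponding $O(k^3)$ sub-tensors, and runs a plain rank-$k$ symmetric CP decomposition (alternating least squares) on each. The function names, the choice of block size, and the naive per-block assembly are illustrative assumptions, not the dissertation's actual algorithm, which additionally addresses cross-block factor alignment, nonnegativity, and robustness.

    # Illustrative divide-and-conquer CP decomposition of a symmetric p x p x p tensor.
    # The per-block decomposition mirrors the idea summarized above; the assembly of
    # block-wise factors below is a naive placeholder (columns are aligned only
    # within each block), not the dissertation's stitching procedure.
    import numpy as np

    def cp_als(T, k, iters=50, seed=0):
        """Rank-k CP decomposition of a symmetric 3-way tensor via alternating least squares."""
        rng = np.random.default_rng(seed)
        n = T.shape[0]
        A = rng.standard_normal((n, k))
        for _ in range(iters):
            # Khatri-Rao product of the (fixed) factor with itself: shape (n*n, k)
            KR = np.einsum('ir,jr->ijr', A, A).reshape(n * n, k)
            T1 = T.reshape(n, n * n)            # mode-1 unfolding of T
            # Standard symmetric-ALS heuristic: update A holding the other modes fixed
            A = T1 @ KR @ np.linalg.pinv((A.T @ A) * (A.T @ A))
        return A

    def divide_and_conquer_cp(T, k, block=None, seed=0):
        """Decompose O(p/k) sub-tensors of size O(k^3) instead of the full tensor."""
        p = T.shape[0]
        block = block or max(2 * k, k + 1)      # sub-tensor side length, O(k)
        factors = np.zeros((p, k))
        for start in range(0, p, block):
            idx = np.arange(start, min(start + block, p))
            sub = T[np.ix_(idx, idx, idx)]      # extract the k^3-sized sub-tensor
            factors[idx] = cp_als(sub, k, seed=seed)
        return factors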
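
The subsampled randomized Hadamard transform used in the second class can be sketched as follows. For clarity this version materializes the full Hadamard matrix instead of applying the fast Walsh-Hadamard transform, and the padding and scaling conventions are assumptions for illustration rather than the estimators' actual implementation.

    # A minimal SRHT sketch: compute S H D X for an (n x d) matrix X, where D is a
    # random sign diagonal, H a normalized Walsh-Hadamard matrix, and S a uniform
    # row subsampler. A practical implementation would use the fast transform.
    import numpy as np
    from scipy.linalg import hadamard

    def srht(X, sketch_size, seed=0):
        rng = np.random.default_rng(seed)
        n, d = X.shape
        m = 1 << (n - 1).bit_length()           # pad rows up to the next power of two
        Xp = np.zeros((m, d))
        Xp[:n] = X
        D = rng.choice([-1.0, 1.0], size=m)     # random sign flips
        H = hadamard(m) / np.sqrt(m)            # orthonormal Hadamard matrix
        rows = rng.choice(m, size=sketch_size, replace=False)
        # Subsample rows of H D X and rescale so norms are preserved in expectation
        return np.sqrt(m / sketch_size) * (H[rows] * D) @ Xp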
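
The dual GP construction in the third class is specific to the dissertation and is not reproduced here. As a loosely related illustration of representing a GP's latent function through a finite set of random basis functions, the sketch below uses standard random Fourier features for an RBF kernel; the feature count, lengthscale, and noise level are placeholder parameters.

    # Not the dissertation's dual GP: a standard random Fourier feature (RFF)
    # approximation of GP regression, shown only to illustrate a random-function
    # representation of the latent function.
    import numpy as np

    def rff_gp_regression(X, y, X_test, num_features=200, lengthscale=1.0, noise=0.1, seed=0):
        rng = np.random.default_rng(seed)
        d = X.shape[1]
        W = rng.standard_normal((d, num_features)) / lengthscale   # random frequencies
        b = rng.uniform(0.0, 2 * np.pi, num_features)              # random phases

        def phi(Z):
            # Random feature map whose inner products approximate the RBF kernel
            return np.sqrt(2.0 / num_features) * np.cos(Z @ W + b)

        Phi = phi(X)
        # Posterior-mean weights via ridge regression in feature space
        A = Phi.T @ Phi + noise**2 * np.eye(num_features)
        w = np.linalg.solve(A, Phi.T @ y)
        return phi(X_test) @ w      # approximate GP posterior mean at the test points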