Approximate Inference for High-Dimensional Latent Variable Models
dc.contributor.advisor | Mukherjee, Sayan | |
dc.contributor.author | Tan, Zilong | |
dc.date.accessioned | 2019-04-02T16:27:47Z | |
dc.date.available | 2020-01-09T09:17:09Z | |
dc.date.issued | 2018 | |
dc.department | Computer Science | |
dc.description.abstract | Latent variable models are widely used in applications ranging from natural language processing to recommender systems. Exact maximum-likelihood inference for these models is generally NP-hard and computationally prohibitive on big and/or high-dimensional data. This has motivated the development of approximate inference methods that trade off computational complexity against statistical efficiency. Understanding this computational-statistical tradeoff is important both for analyzing existing approximate inference approaches and for designing new ones. Toward this goal, this dissertation presents new approximate inference algorithms with provable guarantees for three classes of inference tasks. The first class is based on the method of moments, where inference is typically reduced to a tensor decomposition problem: decomposing a $p$-by-$p$-by-$p$ estimator tensor for $p$ variables. We apply a divide-and-conquer strategy to the tensor method, instead decomposing $O\left(p/k\right)$ sub-tensors of size $O\left(k^3\right)$ each, which yields a significant reduction in computational complexity when the number of latent variables $k$ is small. Our approach can also enforce nonnegativity of the estimates when inferring nonnegative model parameters. Theoretical analysis provides sufficient conditions for the robustness of the divide-and-conquer method, as well as a proof of linear convergence for the nonnegative factorization. The second class concerns mixed-effects models, in which the variances of the latent variables must also be inferred. We present approximate estimators that admit closed-form expressions, and develop fast computational techniques based on the subsampled randomized Hadamard transform (SRHT) that achieve complexity sublinear in the dimension (an illustrative SRHT sketch follows this record). This makes our approach useful for high-dimensional applications such as genome-wide association studies. We also provide provable error guarantees for the approximation. The last class addresses more general inference in an infinite-dimensional function space specified by a Gaussian process (GP) prior. We give a dual formulation of GPs using random functions in a reproducing kernel Hilbert space (RKHS), in which the function representation is specified by latent variables. We show that the dual GP realizes an expanded class of functions and is well-approximated by a low-dimensional sufficient dimension reduction subspace of the RKHS. We develop a fast learning algorithm for the dual GP that improves upon the state-of-the-art computational complexity of GPs. | |
dc.identifier.uri | ||
dc.subject | Computer science | |
dc.subject | approximate inference | |
dc.subject | latent variable models | |
dc.subject | Machine learning | |
dc.title | Approximate Inference for High-Dimensional Latent Variable Models | |
dc.type | Dissertation | |
duke.embargo.months | 9 | |
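
The SRHT mentioned in the abstract is a standard randomized sketching transform; the dissertation's estimators build on it but are not reproduced here. Below is a minimal Python/NumPy sketch of the transform itself, assuming a dense design matrix; the helper name srht_sketch and its parameters are hypothetical illustrations, not the author's code.

import numpy as np
from scipy.linalg import hadamard

def srht_sketch(X, sketch_size, seed=None):
    """Compress the n rows of X to `sketch_size` rows with a subsampled
    randomized Hadamard transform (S H D X), approximately preserving
    inner products and Gram matrices. Hypothetical helper, for illustration."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Pad the row count to the next power of two, as the Walsh-Hadamard
    # transform requires.
    m = 1 << max(n - 1, 0).bit_length()
    X_pad = np.zeros((m, p))
    X_pad[:n] = X
    # D: random sign flips on the rows.
    signs = rng.choice([-1.0, 1.0], size=m)
    # H: normalized Hadamard matrix (materialized here for clarity; a fast
    # Walsh-Hadamard transform would avoid the O(m^2) memory cost).
    H = hadamard(m) / np.sqrt(m)
    # S: uniform row subsampling, rescaled so the sketch is unbiased.
    rows = rng.choice(m, size=sketch_size, replace=False)
    return np.sqrt(m / sketch_size) * (H @ (signs[:, None] * X_pad))[rows]

# Example: the sketched Gram matrix approximates the full one.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 20))
X_sketch = srht_sketch(X, sketch_size=256, seed=0)
# X.T @ X and X_sketch.T @ X_sketch should be close in Frobenius norm.

Note that this naive version costs O(m^2 p) time because it materializes H; fast implementations apply the Walsh-Hadamard transform in O(m log m) per column, and the sublinear-in-dimension complexity claimed in the abstract additionally exploits the structure of the mixed-effects estimators, which this sketch does not attempt to reproduce.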