
Deep Latent-Variable Models for Natural Language Understanding and Generation

dc.contributor.advisor Carin, Lawrence
dc.contributor.author Shen, Dinghan
dc.date.accessioned 2020-06-09T17:58:15Z
dc.date.available 2020-06-09T17:58:15Z
dc.date.issued 2020
dc.description.abstract <p>Deep latent-variable models have been widely adopted to model various types of data, due to their ability to: 1) infer rich high-level information from the input data (especially in low-resource settings); and 2) yield a generative network that can synthesize samples unseen during training. In this dissertation, I present my contributions toward applying the general latent-variable framework to a range of natural language processing problems, a setting that is especially challenging given the discrete nature of text sequences. The dissertation is divided into two parts.</p><p>In the first part, I present two recent explorations of deep latent-variable models for natural language understanding. The goal is to learn meaningful text representations that help downstream tasks such as sentence classification, natural language inference, and question answering. First, I propose a variational autoencoder for textual data that digests unlabeled information. To alleviate the observed posterior-collapse issue, a specially designed deconvolutional decoder is employed as the generative network. The resulting sentence embeddings substantially boost downstream task performance. I then present a model that learns compressed/binary sentence embeddings, which are storage-efficient and applicable to on-device applications.</p><p>In the second part, I introduce a multi-level variational autoencoder (VAE) to model long-form text sequences (with as many as 60 words). A multi-level generative network is leveraged to capture word-level and sentence-level coherence, respectively. Moreover, with a hierarchical design of the latent space, long-form and coherent texts can be produced more reliably than with baseline text VAE models. Semantically rich latent representations are also obtained in this unsupervised manner. Human evaluation further demonstrates the superiority of the proposed method.</p>
dc.subject Artificial intelligence
dc.subject Computer engineering
dc.subject Deep Learning
dc.subject Machine Learning
dc.subject Natural Language Processing
dc.subject Representation Learning
dc.subject Text Generation
dc.title Deep Latent-Variable Models for Natural Language Understanding and Generation
dc.type Dissertation
dc.department Electrical and Computer Engineering
