dc.description.abstract |
<p>Deep latent-variable models have been widely adopted to model various types of
data, owing to their ability to: 1) infer rich high-level information from the input
data (especially in low-resource settings); and 2) yield a generative network that can
synthesize samples unseen during training. In this dissertation, I will present the
contributions I have made in leveraging the general framework of latent-variable models
for various natural language processing problems, a setting that is especially
challenging given the discrete nature of text sequences. Specifically, the dissertation
is divided into two parts.</p><p>In the first part, I will present two of my recent
explorations on
leveraging deep latent-variable models for natural language understanding. The goal
here is to learn meaningful text representations that can be helpful for tasks such
as sentence classification, natural language inference, and question answering. First,
I will propose a variational autoencoder for textual data that digests unlabeled
information. To alleviate the posterior collapse issue observed in training, a
specially designed deconvolutional decoder is employed as the generative network. The
resulting sentence embeddings greatly boost downstream task performance. Then I will
present a model that learns compressed (binary) sentence embeddings, which are
storage-efficient and suitable for on-device applications.</p><p>In the second part, I
will introduce
a multi-level Variational Autoencoder (VAE) to model long-form text sequences (with
as many as 60 words). A multi-level generative network is leveraged to capture
word-level and sentence-level coherence. Moreover, with a hierarchical design of the
latent space, long-form and coherent texts can be produced more reliably than with
baseline text VAE models. Semantically rich latent representations are also obtained
in an unsupervised manner. Human evaluation further demonstrates
the superiority of the proposed method.</p>
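
To make the first contribution concrete, the following is a minimal illustrative sketch (not the dissertation's implementation) of a text variational autoencoder whose generative network is a deconvolutional (transposed-convolution) decoder, written in PyTorch. The vocabulary size, embedding width, filter sizes, and sequence length are assumptions chosen only for illustration.

    # Sketch of a text VAE with a deconvolutional decoder; hyperparameters are illustrative.
    import torch
    import torch.nn as nn

    class DeconvTextVAE(nn.Module):
        def __init__(self, vocab_size=10000, emb_dim=128, latent_dim=64, seq_len=32):
            super().__init__()
            self.seq_len = seq_len
            self.embed = nn.Embedding(vocab_size, emb_dim)
            # Convolutional encoder: token embeddings -> latent Gaussian parameters.
            self.encoder = nn.Sequential(
                nn.Conv1d(emb_dim, 256, kernel_size=4, stride=2, padding=1),  # length seq_len/2
                nn.ReLU(),
                nn.Conv1d(256, 512, kernel_size=4, stride=2, padding=1),      # length seq_len/4
                nn.ReLU(),
            )
            enc_out = 512 * (seq_len // 4)
            self.to_mu = nn.Linear(enc_out, latent_dim)
            self.to_logvar = nn.Linear(enc_out, latent_dim)
            # Deconvolutional decoder: latent code -> per-position vocabulary logits,
            # without conditioning on ground-truth previous tokens.
            self.from_z = nn.Linear(latent_dim, 512 * (seq_len // 4))
            self.decoder = nn.Sequential(
                nn.ConvTranspose1d(512, 256, kernel_size=4, stride=2, padding=1),
                nn.ReLU(),
                nn.ConvTranspose1d(256, emb_dim, kernel_size=4, stride=2, padding=1),
                nn.ReLU(),
            )
            self.to_logits = nn.Linear(emb_dim, vocab_size)

        def forward(self, tokens):
            # tokens: (batch, seq_len) integer ids.
            x = self.embed(tokens).transpose(1, 2)                  # (batch, emb_dim, seq_len)
            h = self.encoder(x).flatten(1)
            mu, logvar = self.to_mu(h), self.to_logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()    # reparameterization trick
            h = self.from_z(z).view(-1, 512, self.seq_len // 4)
            dec = self.decoder(h).transpose(1, 2)                   # (batch, seq_len, emb_dim)
            return self.to_logits(dec), mu, logvar

    def vae_loss(logits, tokens, mu, logvar):
        # Token-level reconstruction cross-entropy plus KL divergence to N(0, I).
        rec = nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), tokens.reshape(-1), reduction="mean")
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return rec + kl

In this sketch the decoder is purely feed-forward: because it never conditions on ground-truth previous tokens, it cannot ignore the latent code the way a strong autoregressive decoder can, which is the general intuition behind using a deconvolutional decoder to mitigate posterior collapse.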