dc.description.abstract |
<p>Recently there has been increasing interest in developing generative models of
data, offering the promise of learning based on the often vast quantity of unlabeled
data. With such learning, one typically seeks to build rich, hierarchical probabilistic
models that are able to fit to the distribution of complex real data, and are
also capable of realistic data synthesis. In this dissertation, novel models and learning
algorithms are proposed for deep generative models.</p><p>This dissertation consists
of three main parts.</p><p>The first part developed a deep generative model for joint
analysis of images and associated labels or captions. The model is efficiently learned
using a variational autoencoder. A multilayered (deep) convolutional dictionary representation
is employed as a decoder of the latent image features. Stochastic unpooling
is employed to link consecutive layers in the image model, yielding top-down image
generation. A deep Convolutional Neural Network (CNN) is used as an image encoder;
the CNN is used to approximate a distribution for the latent deep generative deconvolutional network (DGDN) features/code. The
latent code is also linked to generative models for labels (Bayesian support vector
machine) or captions (recurrent neural network). When predicting a label/caption for
a new image at test time, averaging is performed across the distribution of latent codes;
this is computationally efficient as a consequence of the learned CNN-based encoder.
Since the framework is capable of modeling the image in the presence/absence of associated
labels/captions, a new semi-supervised setting is manifested for CNN learning with
images; the framework even allows unsupervised CNN learning, based on images alone.
Excellent results are obtained on several benchmark datasets, including ImageNet,
demonstrating that the proposed model achieves results that are highly competitive
with similarly sized convolutional neural networks.</p><p>The second part developed
a new method for learning variational autoencoders (VAEs), based on Stein variational
gradient descent. A key advantage of this approach is that one need not make parametric
assumptions about the form of the encoder distribution. Performance is further enhanced
by integrating the proposed encoder with importance sampling. Excellent performance
is demonstrated across multiple unsupervised and semi-supervised problems, including
semi-supervised analysis of the ImageNet data, demonstrating the scalability of the
model to large datasets.</p><p>The third part developed a new form of variational
autoencoder, in which the joint distribution of data and codes is considered in two
(symmetric) forms: (i) from observed data fed through the encoder to yield codes,
and (ii) from latent codes drawn from a simple prior and propagated through
the decoder to manifest data. Lower bounds are learned for the marginal log-likelihood
fits of observed data and latent codes. When learning with the variational bound, one
seeks to minimize the symmetric Kullback-Leibler divergence of the joint density
functions from (i) and (ii), while simultaneously seeking to maximize the two marginal
log-likelihoods. To facilitate learning, a new form of adversarial training is developed.
An extensive set of experiments is performed, in which we demonstrate state-of-the-art
data reconstruction and generation on several image benchmark datasets.</p>
|
|