Deep Generative Models for Vision, Languages and Graphs

dc.contributor.advisor

Carin, Lawrence

dc.contributor.author

Wang, Wenlin

dc.date.accessioned

2020-01-27T16:52:41Z

dc.date.available

2020-03-13T08:17:14Z

dc.date.issued

2019

dc.department

Electrical and Computer Engineering

dc.description.abstract

Deep generative models have achieved remarkable success in modeling various types of data, ranging from vision and language to graphs. They offer flexible and complementary representations for both labeled and unlabeled data, and they are naturally capable of generating realistic data. In this thesis, novel variations of generative models are proposed for a range of learning tasks, organized into three parts.

In the first part, generative models are designed to learn generalized image representations in the Zero-Shot Learning (ZSL) setting. An attribute-conditioned variational autoencoder is introduced that represents each class as a latent-space distribution, enabling highly discriminative and robust feature representations. The model gains discriminative power by assigning a test example to the class that maximizes the variational lower bound, as sketched below. I further show that the model generalizes naturally to the transductive and few-shot settings.
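A minimal PyTorch sketch of the classification-by-ELBO idea above (not the thesis code; the module and layer names, the diagonal-Gaussian attribute-conditioned prior, and all dimensions are illustrative assumptions):

# Hypothetical sketch: each class contributes a prior p(z | a_c) built from its
# attribute vector; a test image is assigned to the class whose ELBO is largest.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttributeCVAE(nn.Module):
    def __init__(self, x_dim, a_dim, z_dim=64, h_dim=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.enc_mu, self.enc_logvar = nn.Linear(h_dim, z_dim), nn.Linear(h_dim, z_dim)
        # attribute-conditioned prior p(z | a): maps class attributes to a Gaussian
        self.prior_mu, self.prior_logvar = nn.Linear(a_dim, z_dim), nn.Linear(a_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim))

    def elbo(self, x, a):
        h = self.enc(x)
        mu, logvar = self.enc_mu(h), self.enc_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()            # reparameterization
        recon = -F.mse_loss(self.dec(z), x, reduction='none').sum(-1)   # log p(x|z), Gaussian with unit variance
        pm, plv = self.prior_mu(a), self.prior_logvar(a)
        # KL( q(z|x) || p(z|a) ) between two diagonal Gaussians
        kl = 0.5 * ((logvar.exp() + (mu - pm) ** 2) / plv.exp() + plv - logvar - 1).sum(-1)
        return recon - kl

    def classify(self, x, class_attrs):                                 # class_attrs: [C, a_dim]
        scores = torch.stack([self.elbo(x, a.expand(x.size(0), -1)) for a in class_attrs], dim=1)
        return scores.argmax(dim=1)                                     # class with the largest lower bound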

In the second part, generative models are proposed for controllable language generation. Specifically, two language generation models that incorporate topic information are introduced. The first is a topic compositional neural language model for controllable and interpretable generation built on a mixture-of-experts design; the second addresses the problem with a VAE framework using a topic-conditioned GMM design. Both models improve on existing language generation systems while adding controllable properties; a mixture-of-experts sketch follows.
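A minimal PyTorch sketch of the mixture-of-experts idea (illustrative only: the thesis model may compose experts at the parameter level rather than mixing output distributions, and all names and sizes here are assumed):

# Hypothetical sketch: document topic proportions weight the next-word
# distributions produced by K topic-specific output "experts".
import torch
import torch.nn as nn

class TopicMoELM(nn.Module):
    def __init__(self, vocab, n_topics=10, emb=128, hid=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.rnn = nn.GRU(emb, hid, batch_first=True)
        # one output expert per topic; the topic mixture composes them
        self.experts = nn.ModuleList([nn.Linear(hid, vocab) for _ in range(n_topics)])

    def forward(self, tokens, topic_props):                           # tokens: [B, T], topic_props: [B, K]
        h, _ = self.rnn(self.embed(tokens))                           # [B, T, hid]
        logits = torch.stack([e(h) for e in self.experts], dim=-1)    # [B, T, vocab, K]
        probs = torch.softmax(logits, dim=2)                          # per-expert word distributions
        mixed = (probs * topic_props[:, None, None, :]).sum(-1)       # topic-weighted mixture over experts
        return mixed.clamp_min(1e-9).log()                            # log-probabilities of the next word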

In the third part, generative models are introduced for graph data. First, a variational homophilic embedding (VHE) model is proposed: a fully generative model that learns network embeddings by modeling textual semantic information with a variational autoencoder, while accounting for graph structure through a homophilic prior design. Second, for heterogeneous multi-task learning, a novel graph-driven generative model is developed to unify the tasks within a single framework. It combines a graph convolutional network (GCN) with multiple VAEs, embedding the nodes of the graph in a uniform manner while specializing their organization and usage to different tasks; a GCN-plus-VAE encoder sketch follows.
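A minimal PyTorch sketch of coupling a GCN with a VAE-style encoder, as in the second model (illustrative assumptions throughout: the dense normalized adjacency, layer names, and dimensions are not from the thesis):

# Hypothetical sketch: a GCN propagates node features over the adjacency, and the
# resulting node embeddings parameterize per-node Gaussians that task-specific
# decoders can draw from.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):                        # adj: dense normalized adjacency [N, N]
        return torch.relu(adj @ self.lin(x))          # transform then propagate over neighbors

class GraphVAEEncoder(nn.Module):
    def __init__(self, in_dim, hid=128, z_dim=64):
        super().__init__()
        self.gcn1, self.gcn2 = GCNLayer(in_dim, hid), GCNLayer(hid, hid)
        self.mu, self.logvar = nn.Linear(hid, z_dim), nn.Linear(hid, z_dim)

    def forward(self, x, adj):
        h = self.gcn2(self.gcn1(x, adj), adj)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # shared node embeddings
        return z, mu, logvar                                    # passed to task-specific VAE decoders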

dc.identifier.uri

https://hdl.handle.net/10161/19880

dc.subject

Computer engineering

dc.subject

Computer vision

dc.subject

Deep learning

dc.subject

Generative models

dc.subject

Machine learning

dc.subject

Natural language processing

dc.title

Deep Generative Models for Vision, Languages and Graphs

dc.type

Dissertation

duke.embargo.months

1.4794520547945205

Files

Original bundle

Name: Wang_duke_0066D_15360.pdf
Size: 8.59 MB
Format: Adobe Portable Document Format