Deep Generative Models for Vision, Languages and Graphs

dc.contributor.advisor

Carin, Lawrence

dc.contributor.author

Wang, Wenlin

dc.date.accessioned

2020-01-27T16:52:41Z

dc.date.available

2020-03-13T08:17:14Z

dc.date.issued

2019

dc.department

Electrical and Computer Engineering

dc.description.abstract

Deep generative models have achieved remarkable success in modeling various types of data, ranging from vision and language to graphs. They offer flexible and complementary representations for both labeled and unlabeled data, and they are naturally capable of generating realistic data. In this thesis, novel variations of generative models are proposed for a range of learning tasks, organized into three parts.

In the first part, generative models are designed to learn generalized image representations in the Zero-Shot Learning (ZSL) setting. An attribute-conditioned variational autoencoder is introduced that represents each class as a latent-space distribution, enabling highly discriminative and robust feature representations. The model gains discriminative power by assigning a test example to the class that maximizes the variational lower bound, as sketched below. I further show that the model generalizes naturally to the transductive and few-shot settings.
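A minimal PyTorch sketch of the classification-by-ELBO idea above (not the thesis code; the module and layer names, the diagonal-Gaussian attribute-conditioned prior, and all dimensions are illustrative assumptions):

# Hypothetical sketch: each class contributes a prior p(z | a_c) built from its
# attribute vector; a test image is assigned to the class whose ELBO is largest.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttributeCVAE(nn.Module):
    def __init__(self, x_dim, a_dim, z_dim=64, h_dim=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.enc_mu, self.enc_logvar = nn.Linear(h_dim, z_dim), nn.Linear(h_dim, z_dim)
        # attribute-conditioned prior p(z | a): maps class attributes to a Gaussian
        self.prior_mu, self.prior_logvar = nn.Linear(a_dim, z_dim), nn.Linear(a_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim))

    def elbo(self, x, a):
        h = self.enc(x)
        mu, logvar = self.enc_mu(h), self.enc_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()            # reparameterization
        recon = -F.mse_loss(self.dec(z), x, reduction='none').sum(-1)   # log p(x|z), Gaussian with unit variance
        pm, plv = self.prior_mu(a), self.prior_logvar(a)
        # KL( q(z|x) || p(z|a) ) between two diagonal Gaussians
        kl = 0.5 * ((logvar.exp() + (mu - pm) ** 2) / plv.exp() + plv - logvar - 1).sum(-1)
        return recon - kl

    def classify(self, x, class_attrs):                                 # class_attrs: [C, a_dim]
        scores = torch.stack([self.elbo(x, a.expand(x.size(0), -1)) for a in class_attrs], dim=1)
        return scores.argmax(dim=1)                                     # class with the largest lower bound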

In the second part, generative models are proposed for controllable language generation. Specifically, two language generation models that incorporate topic information are introduced. The first is a topic compositional neural language model for controllable and interpretable generation built on a mixture-of-experts design; the second addresses the problem with a VAE framework using a topic-conditioned GMM design. Both models improve on existing language generation systems while adding controllable properties; a mixture-of-experts sketch follows.
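A minimal PyTorch sketch of the mixture-of-experts idea (illustrative only: the thesis model may compose experts at the parameter level rather than mixing output distributions, and all names and sizes here are assumed):

# Hypothetical sketch: document topic proportions weight the next-word
# distributions produced by K topic-specific output "experts".
import torch
import torch.nn as nn

class TopicMoELM(nn.Module):
    def __init__(self, vocab, n_topics=10, emb=128, hid=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.rnn = nn.GRU(emb, hid, batch_first=True)
        # one output expert per topic; the topic mixture composes them
        self.experts = nn.ModuleList([nn.Linear(hid, vocab) for _ in range(n_topics)])

    def forward(self, tokens, topic_props):                           # tokens: [B, T], topic_props: [B, K]
        h, _ = self.rnn(self.embed(tokens))                           # [B, T, hid]
        logits = torch.stack([e(h) for e in self.experts], dim=-1)    # [B, T, vocab, K]
        probs = torch.softmax(logits, dim=2)                          # per-expert word distributions
        mixed = (probs * topic_props[:, None, None, :]).sum(-1)       # topic-weighted mixture over experts
        return mixed.clamp_min(1e-9).log()                            # log-probabilities of the next word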

In the third part, generative models are introduced for graph data. First, a variational homophilic embedding (VHE) model is proposed: a fully generative model that learns network embeddings by modeling textual semantic information with a variational autoencoder, while accounting for graph structure through a homophilic prior design. Second, for heterogeneous multi-task learning, a novel graph-driven generative model is developed to unify the tasks within a single framework. It combines a graph convolutional network (GCN) with multiple VAEs, embedding the nodes of the graph in a uniform manner while specializing their organization and usage to different tasks; a GCN-plus-VAE encoder sketch follows.
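A minimal PyTorch sketch of coupling a GCN with a VAE-style encoder, as in the second model (illustrative assumptions throughout: the dense normalized adjacency, layer names, and dimensions are not from the thesis):

# Hypothetical sketch: a GCN propagates node features over the adjacency, and the
# resulting node embeddings parameterize per-node Gaussians that task-specific
# decoders can draw from.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):                        # adj: dense normalized adjacency [N, N]
        return torch.relu(adj @ self.lin(x))          # transform then propagate over neighbors

class GraphVAEEncoder(nn.Module):
    def __init__(self, in_dim, hid=128, z_dim=64):
        super().__init__()
        self.gcn1, self.gcn2 = GCNLayer(in_dim, hid), GCNLayer(hid, hid)
        self.mu, self.logvar = nn.Linear(hid, z_dim), nn.Linear(hid, z_dim)

    def forward(self, x, adj):
        h = self.gcn2(self.gcn1(x, adj), adj)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # shared node embeddings
        return z, mu, logvar                                    # passed to task-specific VAE decoders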

dc.identifier.uri

https://hdl.handle.net/10161/19880

dc.subject

Computer engineering

dc.subject

Computer vision

dc.subject

Deep learning

dc.subject

Generative models

dc.subject

Machine learning

dc.subject

Natural language processing

dc.title

Deep Generative Models for Vision, Languages and Graphs

dc.type

Dissertation

duke.embargo.months

1.4794520547945205

Files

Original bundle

Name: Wang_duke_0066D_15360.pdf
Size: 8.59 MB
Format: Adobe Portable Document Format