Exploring Deep Representation Learning on Vision and Language Intelligence

Deep neural networks have achieved tremendous success in recent years, with applications spanning computer vision and natural language processing. Representation learning is often adopted to extract useful latent features for these tasks. In this dissertation, I will discuss my contributions in applying representation learning methodologies to deep generative models, as well as to unsupervised domain adaptation.

The first part of the dissertation will mainly focus on deep generative models for vision and language intelligence. I will present the Symmetric Variational Autoencoder, which unifies the variational Bayesian and adversarial training frameworks. Then, I will show the application of such generative models in the natural language domain, and present a VAE framework with a hyperbolic latent space.
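To make the variational Bayesian side of this framework concrete, the sketch below computes the standard VAE evidence lower bound (ELBO) with the closed-form KL divergence between a diagonal-Gaussian posterior and a standard-normal prior. This is the generic objective, not the dissertation's symmetric or hyperbolic variants, and the function names are illustrative.

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def elbo(recon_log_lik, mu, log_var):
    """ELBO = reconstruction log-likelihood minus the KL regularizer."""
    return recon_log_lik - gaussian_kl(mu, log_var)

# A posterior that matches the prior (mu = 0, log_var = 0) pays no KL penalty,
# so the ELBO reduces to the reconstruction term alone.
bound = elbo(recon_log_lik=-1.0, mu=np.zeros(4), log_var=np.zeros(4))
```

The symmetric variant described in the dissertation additionally trains this objective adversarially; that machinery is omitted here.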

For the second part, I will mainly focus on representation learning for unsupervised domain adaptation (UDA). In this problem setup, we want to extract representative features that contain mostly task-oriented information but little domain-related information. I will first present an approach that learns such features in a contrastive manner: pulling data of the same class together while pushing data of different classes away from each other. Next, I will focus on UDA where large domain gaps exist. To tackle such a UDA problem, I propose to use unlabeled domain bridges, transforming the original problem into several intermediate ones with smaller gaps.
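The pull/push idea above can be sketched with the classic pairwise contrastive loss (a margin-based formulation in the style of Hadsell et al.); this is a generic illustration under assumed embeddings, not the exact objective used in the dissertation.

```python
import numpy as np

def contrastive_loss(z1, z2, same_class, margin=1.0):
    """Pairwise contrastive loss on two embedding vectors:
    same-class pairs are pulled together (squared distance),
    different-class pairs are pushed at least `margin` apart."""
    d = np.linalg.norm(z1 - z2)
    if same_class:
        return 0.5 * d**2                      # pull: penalize any distance
    return 0.5 * max(0.0, margin - d)**2       # push: penalize only within margin

# Different-class embeddings already separated by more than the margin
# contribute zero loss; identical ones pay the full margin penalty.
loss_far = contrastive_loss(np.array([0.0, 0.0]), np.array([3.0, 4.0]), same_class=False)
```

In the UDA setting described above, such pairs would mix source and target samples so the learned features discard domain identity while preserving class structure.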





Dai, Shuyang (2021). Exploring Deep Representation Learning on Vision and Language Intelligence. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/23764.


Duke's student scholarship is made available to the public using a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.