EFFICIENT LOW-RESOURCE TRAINING WITH PRE-TRAINED DEEP NEURAL NETWORKS

Date

2023

Repository Usage Stats

13 views, 55 downloads

Abstract

The performance of machine learning systems has improved dramatically in recent years thanks to the advent of pre-trained deep neural networks. Such models are generally adopted as the foundation for feature extraction, yielding state-of-the-art results when further trained (or fine-tuned) on downstream tasks. Nonetheless, pre-trained deep neural networks are generally huge, e.g., with millions or billions of parameters, and thus demand abundant data and computation resources during fine-tuning. It is therefore of practical merit to investigate efficient training approaches with such pre-trained deep neural networks for low-resource scenarios, i.e., when the computation budget or the annotated data for the downstream tasks is limited. In this dissertation, we consider low-resource scenarios for efficient training with the following tasks: i) natural language understanding, ii) fair text generation, and iii) compositional image retrieval.

We first study natural language understanding through its sub-tasks of sequence classification and sequence labeling. We propose an attention-based architecture and a data-free distillation framework, respectively, both designed for the scenario where data annotations are limited. These approaches improve the data efficiency of fine-tuning pre-trained deep neural networks for better understanding of natural language. We then explore strategies for fine-tuning pre-trained language models toward demographic fairness in text generation. Specifically, we propose to minimize the mutual information between the semantics of the generated text sentences and their demographic polarity, i.e., the demographic group to which a sentence refers. We develop a computationally efficient approach that estimates an upper bound of this mutual information via importance sampling, reducing the number of model inferences required during training. Finally, for compositional image retrieval, we propose a visual prompt tuning mechanism and a self-supervised auxiliary task that adapt a pre-trained vision-language model to image retrieval with only a few annotations, while improving computation efficiency by allowing the pre-trained parameters to remain frozen during training.
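To make the fairness objective above concrete, one well-known sample-based construction of such an upper bound (shown here purely as illustration, not necessarily the bound derived in the dissertation) is I(S; D) <= E_{p(s,d)}[log q(d|s)] - E_{p(s)} E_{p(d)}[log q(d|s)], where q(d|s) is an auxiliary classifier predicting demographic polarity d from sentence semantics s; sampling-based estimation of the second expectation reuses representations already computed for the batch instead of requiring additional model passes.

The following is likewise a minimal sketch, under assumed interfaces, of the frozen-backbone idea mentioned for compositional image retrieval: learnable prompt tokens are prepended to the input of a frozen pre-trained encoder so that only the small prompt tensor is updated during fine-tuning. The class and parameter names (PromptTunedEncoder, num_prompts, embed_dim) are hypothetical, and the encoder is assumed to accept token embeddings directly; this is not the dissertation's actual implementation.

```python
import torch
import torch.nn as nn

class PromptTunedEncoder(nn.Module):
    """Illustrative sketch: learnable prompt tokens prepended to a frozen encoder."""

    def __init__(self, frozen_encoder: nn.Module, embed_dim: int = 512, num_prompts: int = 8):
        super().__init__()
        self.encoder = frozen_encoder
        for p in self.encoder.parameters():
            p.requires_grad = False  # pre-trained parameters stay frozen
        # Only this small tensor is trained, keeping gradient and optimizer state cheap.
        self.prompts = nn.Parameter(0.02 * torch.randn(num_prompts, embed_dim))

    def forward(self, token_embeddings: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, embed_dim)
        batch_size = token_embeddings.size(0)
        prompts = self.prompts.unsqueeze(0).expand(batch_size, -1, -1)
        # Concatenate prompts in front of the input tokens and run the frozen encoder.
        return self.encoder(torch.cat([prompts, token_embeddings], dim=1))
```

Because only the prompt parameters receive gradients, the adaptation cost scales with num_prompts x embed_dim rather than with the size of the pre-trained backbone.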

Overall, my research work draws attention to the data and computational efficiency of the current large pre-trained deep neural networks, improving the flexibility in deploying such models in the considered low-resource scenarios.

Citation

Wang, Rui (2023). EFFICIENT LOW-RESOURCE TRAINING WITH PRE-TRAINED DEEP NEURAL NETWORKS. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/29180.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.