On Impact of Network Architecture for Deep Learning

Date

2023

Authors

Fu, Hao

Repository Usage Stats

80 views, 94 downloads

Abstract

The architecture of neural networks is a crucial factor in the success of deep learning models across a range of fields, including computer vision and natural language processing (NLP). Specific architectures are tailored to address particular tasks, and the selection of architecture can significantly affect the training process, model performance, and robustness.

In the field of NLP, we address the training deficiency of text variational autoencoders (VAEs) with autoregressive decoders, in which the decoder tends to ignore the latent code, through two approaches. First, we introduce a cyclical annealing schedule that enables progressive learning of meaningful latent codes by leveraging informative representations from previous cycles as warm restarts. Second, we propose semi-implicit (SI) representations for the latent distributions of natural language, which extend the commonly used Gaussian distribution family by mixing the variational parameters with a flexible implicit distribution. Our proposed methods prove effective on text generation tasks such as dialog response generation, yielding significant performance improvements over other training techniques.
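
A minimal sketch of such a cyclical KL-weight schedule is given below. It assumes the usual VAE objective of reconstruction loss plus beta times the KL term; the function name, the number of cycles, and the ramp proportion are illustrative choices, not the dissertation's exact settings.

def cyclical_kl_weight(step, total_steps, num_cycles=4, ramp_proportion=0.5):
    """Cyclical annealing for the KL weight (beta): beta restarts at 0 at the
    start of each cycle, ramps linearly to 1 over the first `ramp_proportion`
    of the cycle, and stays at 1 for the remainder of the cycle."""
    cycle_length = total_steps / num_cycles
    position = (step % cycle_length) / cycle_length  # progress within the current cycle
    return min(1.0, position / ramp_proportion)

# Illustrative use in a VAE training loop:
#   beta = cyclical_kl_weight(step, total_steps)
#   loss = reconstruction_loss + beta * kl_divergence

Restarting beta at zero lets the decoder re-engage with latent codes that earlier cycles have already made informative, which is the warm-restart effect described above.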

In the field of computer vision, we investigate the intrinsic influence of network structure on a model’s robustness to data distribution shifts. We propose a novel paradigm, Dense Connectivity Search of Outlier Detector (DCSOD), which uses Neural Architecture Search (NAS) to automatically explore the dense connectivity of CNN architectures for Out-of-Distribution (OOD) detection. To improve the quality of candidate evaluation on OOD detection during the search, we propose evolving distillation, based on our multi-view feature learning explanation. Experimental results show that DCSOD achieves remarkable performance gains over widely used architectures and previous NAS baselines.
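
As a rough illustration of what searching over dense connectivity can look like (a sketch under assumed names such as DenselyConnectedCell and a simple sigmoid-gated edge scheme, not the dissertation's actual DCSOD search space or its OOD scoring), each node in a cell can draw on every earlier node through a learnable gate, so the search decides which dense connections to keep:

import torch
import torch.nn as nn

class DenselyConnectedCell(nn.Module):
    """Hypothetical searchable cell: node i may take input from any earlier
    node j < i, and a learnable gate per candidate edge lets an architecture
    search strengthen or prune that connection."""
    def __init__(self, num_nodes=4, channels=16):
        super().__init__()
        self.num_nodes = num_nodes
        self.ops = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            for _ in range(num_nodes)
        )
        # One architecture parameter per candidate edge (earlier node -> node i).
        self.alpha = nn.ParameterList(
            nn.Parameter(torch.zeros(i + 1)) for i in range(num_nodes)
        )

    def forward(self, x):
        states = [x]
        for i in range(self.num_nodes):
            gates = torch.sigmoid(self.alpha[i])               # one gate per incoming edge
            mixed = sum(g * s for g, s in zip(gates, states))  # gated dense aggregation
            states.append(self.ops[i](mixed))
        return states[-1]

In a setup like this, edges with small gate values would typically be pruned after the search; the dissertation's evolving distillation step for evaluating candidates is not reflected in this sketch.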

Citation

Fu, Hao (2023). On Impact of Network Architecture for Deep Learning. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/27760.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.