On Impact of Network Architecture for Deep Learning

dc.contributor.advisor

Chen, Yiran

dc.contributor.author

Fu, Hao

dc.date.accessioned

2023-06-08T18:25:18Z

dc.date.available

2023-06-08T18:25:18Z

dc.date.issued

2023

dc.department

Electrical and Computer Engineering

dc.description.abstract

The architecture of neural networks is a crucial factor in the success of deep learning models across a range of fields, including computer vision and natural language processing (NLP). Specific architectures are tailored to address particular tasks, and the selection of architecture can significantly affect the training process, model performance, and robustness.

In the field of NLP, we address the training deficiency of text VAEs with autoregressive decoders through two approaches. First, we introduce a cyclical annealing schedule that enables progressive learning of meaningful latent codes by leveraging informative representations from previous cycles as warm restarts. Second, we propose semi-implicit (SI) representations for the latent distributions of natural languages, which extend the commonly used Gaussian distribution family by mixing the variational parameter with a flexible implicit distribution. Our proposed methods are demonstrated to be effective in text generation tasks such as dialog response generation, with significant performance improvements compared to other training techniques.
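As an illustration of the cyclical annealing idea described above, the following is a minimal sketch and not the dissertation's implementation. It assumes the schedule controls a weight on the KL term of the VAE objective, that the weight ramps linearly within each cycle, and that it is then held at 1 until the next cycle restarts from 0. The function name `cyclical_beta` and the parameters `n_cycles` and `ramp_ratio` are hypothetical, chosen only for this example.

```python
import math


def cyclical_beta(step, total_steps, n_cycles=4, ramp_ratio=0.5):
    """Return an assumed KL weight in [0, 1] for a 0-indexed training step.

    Training is split into n_cycles equal cycles. Within each cycle the
    weight rises linearly from 0 to 1 over the first ramp_ratio fraction
    of the cycle, then stays at 1 until the cycle ends.
    """
    cycle_len = math.ceil(total_steps / n_cycles)
    # Position inside the current cycle, as a fraction in [0, 1).
    tau = (step % cycle_len) / cycle_len
    # Linear ramp (an assumption), capped at 1 for the rest of the cycle.
    return min(1.0, tau / ramp_ratio)


# Hypothetical usage: 40 training steps split into 4 cycles of 10 steps.
schedule = [cyclical_beta(t, total_steps=40) for t in range(40)]
```

In a training loop, the returned value would multiply the KL term of the loss at each step. Each return of the weight to 0 begins a new cycle, which then starts from the latent representations learned in the previous one.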

In the field of computer vision, we investigate the intrinsic influence of network structure on a model’s robustness in addressing data distribution shifts. We propose a novel paradigm, Dense Connectivity Search of Outlier Detector (DCSOD), that automatically explores the dense connectivity of CNN architectures on Out-of-Distribution (OOD) detection tasks using Neural Architecture Search (NAS). To improve the quality of evaluation on OOD detection during the search, we propose evolving distillation based on our multi-view feature learning explanation. Experimental results show that DCSOD achieves remarkable performance over widely used architectures and previous NAS baselines.

dc.identifier.uri

https://hdl.handle.net/10161/27760

dc.subject

Electrical engineering

dc.subject

Computer science

dc.subject

Computer engineering

dc.subject

Deep Generative Model

dc.subject

Deep learning

dc.subject

Natural Language Processing (NLP)

dc.subject

Neural Architecture Search (NAS)

dc.subject

Out-of-Distribution (OOD) detection

dc.title

On Impact of Network Architecture for Deep Learning

dc.type

Dissertation

Files

Original bundle

Name:
Fu_duke_0066D_17424.pdf
Size:
3.25 MB
Format:
Adobe Portable Document Format

Collections