Chen, Yiran
Fu, Hao
2023-06-08
2023-06-08
2023
https://hdl.handle.net/10161/27760
<p>The architecture of neural networks is a crucial factor in the success of deep learning models across a range of fields, including computer vision and natural language processing (NLP). Specific architectures are tailored to address particular tasks, and the selection of architecture can significantly affect the training process, model performance, and robustness.</p>
<p>In the field of NLP, we address the training deficiency of text VAEs with autoregressive decoders through two approaches. First, we introduce a cyclical annealing schedule that enables progressive learning of meaningful latent codes by leveraging informative representations from previous cycles as warm restarts. Second, we propose semi-implicit (SI) representations for the latent distributions of natural languages, which extend the commonly used Gaussian distribution family by mixing the variational parameter with a flexible implicit distribution. Our proposed methods are demonstrated to be effective in text generation tasks such as dialog response generation, with significant performance improvements compared to other training techniques.</p>
<p>In the field of computer vision, we investigate the intrinsic influence of network structure on a model’s robustness in addressing data distribution shifts. We propose a novel paradigm, Dense Connectivity Search of Outlier Detector (DCSOD), that automatically explores the dense connectivity of CNN architectures on Out-of-Distribution (OOD) detection tasks using Neural Architecture Search (NAS). To improve the quality of evaluation on OOD detection during the search, we propose evolving distillation based on our multi-view feature learning explanation.
Experimental results show that DCSOD achieves remarkable performance over widely used architectures and previous NAS baselines.</p>
Electrical engineering
Computer science
Computer engineering
Deep Generative Model
Deep learning
Natural Language Processing (NLP)
Neural Architecture Search (NAS)
Out-of-Distribution (OOD) detection
On Impact of Network Architecture for Deep Learning
Dissertation