Optimizing Deep Neural Networks: Leveraging Structural Characteristics for Enhanced NAS and Adversarial Robustness

Date

2025

Abstract

Deep Neural Networks (DNNs) have revolutionized artificial intelligence, driving advancements in computer vision, natural language processing, and autonomous systems. However, the predominant focus on training techniques, data scaling, and computational optimizations often neglects the structural design of models, which is fundamental to both generalization and adversarial accuracy, particularly under efficiency constraints. Moreover, standard model evaluations assume controlled conditions, overlooking real-world challenges such as limited computational resources and adversarial vulnerabilities. In practical deployment settings, architectures must not only achieve high accuracy but also maintain robustness and efficiency, minimizing overhead while adapting to dynamic constraints. This thesis explores how structural properties can be leveraged to simultaneously enhance model discovery, generalization, and robustness, ensuring that deep learning models remain performant and resilient even under adversarial and resource-constrained conditions.

First, we investigate Neural Architecture Search (NAS), a technique for automating model design. While NAS aims to improve performance, most approaches overlook the role of the search space itself in enhancing generalization and search efficiency. Even with one-shot and supernet-based NAS, search pipelines remain computationally intensive due to redundant architecture evaluations, limiting scalability. To address this, we introduce LISSNAS, a search space reduction framework that exploits structural similarities to prune architecture spaces, significantly improving search efficiency while retaining high-performance architectures. LISSNAS demonstrates state-of-the-art performance, achieving 77.6% Top-1 accuracy on ImageNet under mobile constraints while preserving architectural diversity and accelerating NAS.
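To make the search-space reduction idea concrete, the sketch below is a toy illustration rather than the LISSNAS algorithm itself: it groups candidate architectures by an assumed structural signature (here, depth plus the multiset of layer operations) and keeps one representative per group, shrinking the space while preserving structural diversity. All identifiers are hypothetical.

from collections import Counter

def structural_signature(arch):
    # arch: list of per-layer operation names, e.g. ["conv3x3", "mbconv", ...]
    # Assumed signature: network depth plus the multiset of operations.
    return (len(arch), tuple(sorted(Counter(arch).items())))

def prune_search_space(architectures):
    # Keep one representative architecture per structural signature.
    representatives = {}
    for arch in architectures:
        representatives.setdefault(structural_signature(arch), arch)
    return list(representatives.values())

# Example: three candidates, two of which share the same structural signature.
space = [
    ["conv3x3", "conv3x3", "mbconv"],
    ["conv3x3", "mbconv", "conv3x3"],   # same depth and operation multiset as above
    ["mbconv", "mbconv", "conv5x5"],
]
print(len(prune_search_space(space)))   # -> 2

In a real NAS pipeline the signature would be far richer (connectivity, channel widths, accuracy predictors), but the pruning principle, collapsing structurally similar candidates before evaluation, is the same.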

Beyond efficiency and generalization, structural considerations are also critical for robustness. Adversarial attacks can mislead state-of-the-art models with high confidence, posing serious risks in safety-critical applications. While adversarial training improves robustness, it often leads to overfitting and deteriorates clean accuracy. To mitigate this trade-off, we introduce CLAT (Critical Layer Adversarial Training), a fine-tuning framework for CNNs that selectively optimizes robustness-critical layers while freezing the rest of the model. CLAT achieves a 95% reduction in trainable parameters and improves adversarial robustness by over 2% compared to baseline methods.
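As a rough PyTorch-style sketch of the mechanism, selective adversarial fine-tuning with the rest of the network frozen, the snippet below freezes all parameters, unfreezes an assumed list of critical layer names, and fine-tunes them on single-step (FGSM) adversarial examples. The layer list and all identifiers are illustrative assumptions, not CLAT's actual criticality criterion.

import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, eps=8/255):
    # Single-step adversarial example used during fine-tuning.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def adversarial_finetune_critical(model, critical_layers, loader, epochs=1, lr=1e-3):
    # Freeze everything, then unfreeze only the chosen "critical" layers.
    for p in model.parameters():
        p.requires_grad_(False)
    trainable = []
    modules = dict(model.named_modules())
    for name in critical_layers:            # e.g. ["layer3.1.conv2", "layer4.0.conv1"]
        for p in modules[name].parameters():
            p.requires_grad_(True)
            trainable.append(p)
    opt = torch.optim.SGD(trainable, lr=lr, momentum=0.9)
    for _ in range(epochs):
        for x, y in loader:
            x_adv = fgsm_example(model, x, y)
            opt.zero_grad()
            F.cross_entropy(model(x_adv), y).backward()
            opt.step()

In practice the critical layers would be selected by a robustness-sensitivity measure rather than hard-coded, and a multi-step attack such as PGD would typically replace FGSM.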

However, we find that CLAT does not generalize effectively to vision transformers (ViTs), whose self-attention mechanisms differ fundamentally from CNNs in layer composition and computation. ViTs are particularly prone to adversarial overfitting, exacerbating the trade-off between clean and adversarial accuracy. To address this, we introduce SAFER (Structure-Aware Fine-tuning for Enhanced Robustness), a fine-tuning strategy that mitigates adversarial overfitting in ViTs. SAFER selectively fine-tunes a small subset of layers using sharpness-aware minimization, significantly enhancing both clean and adversarial accuracy. Across multiple ViT architectures and datasets, our approach typically improves accuracy by up to 5%, with gains as high as 20% in some settings, setting a new benchmark for adversarial robustness in transformer-based models.
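The central ingredient, sharpness-aware minimization (SAM) applied to a small set of unfrozen transformer blocks, can be sketched as follows. The two-step ascent/descent update follows the standard SAM formulation; the block selection shown in the commented usage is a placeholder, not SAFER's actual selection rule.

import torch

def sam_step(params, loss_fn, base_opt, rho=0.05):
    # One SAM update over the trainable params:
    # (1) ascend to a nearby worst-case weight perturbation,
    # (2) take the gradient there, (3) restore weights and descend.
    base_opt.zero_grad()
    loss_fn().backward()
    grads = [p.grad for p in params if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads])) + 1e-12
    perturb = []
    with torch.no_grad():
        for p in params:
            e = rho * p.grad / grad_norm if p.grad is not None else None
            if e is not None:
                p.add_(e)            # ascent step into the sharp neighbourhood
            perturb.append(e)
    base_opt.zero_grad()
    loss_fn().backward()             # gradient at the perturbed weights
    with torch.no_grad():
        for p, e in zip(params, perturb):
            if e is not None:
                p.sub_(e)            # restore the original weights
    base_opt.step()                  # descend using the SAM gradient

# Sketch of selective fine-tuning (the block choice here is a placeholder):
# for p in vit.parameters(): p.requires_grad_(False)
# for p in vit.blocks[-1].parameters(): p.requires_grad_(True)
# params = [p for p in vit.parameters() if p.requires_grad]
# opt = torch.optim.SGD(params, lr=1e-3)
# for x, y in loader:
#     sam_step(params, lambda: torch.nn.functional.cross_entropy(vit(x), y), opt)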

This thesis demonstrates that structural optimization serves as a unifying principle across efficiency, generalization, and robustness. By systematically leveraging architecture—from NAS search space design to adversarial robustness techniques—this work establishes that structure is not merely a passive framework for learning but a fundamental tool for developing deep learning models that are resilient, efficient, and deployable in real-world settings.

Subjects

Computer engineering, Computer science, Artificial intelligence, Adversarial Attack and Defense, Adversarial Robustness, Computer Vision, Deep Learning, Machine Learning, Neural Architecture Search

Citation

Gopal, Bhavna (2025). Optimizing Deep Neural Networks: Leveraging Structural Characteristics for Enhanced NAS and Adversarial Robustness. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/32741.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.