Measuring the Impact of Architecture and Training for Model Performance and Robustness in Image Detection
Date
2025
Abstract
This research investigates how model components affect object detection performance, with a particular focus on robustness to variations in object scale. The primary goal is to understand how key changes in model architecture and training strategy influence a model's performance in computer vision tasks. This work introduces the first controlled evaluation of two state-of-the-art segmentation models, TransUnet and SwinUnet, on overhead imagery, and explores the specific benefits of transformer-based architectures for vision tasks in this domain. We also propose a novel improvement on the current standard method for stratifying objects by scale and introduce $\eta$, the first theoretical metric for measuring a model's scale robustness, together with a practical estimator for $\eta$. Finally, we conduct a comprehensive comparison of scale robustness across many state-of-the-art models and perform rigorous ablation studies to identify the architectural and training factors that most significantly impact scale robustness. Our findings offer new insights into the role of model design in improving the performance of object detection systems in real-world, scale-variable environments.
Citation
Luzi, Francesco (2025). Measuring the Impact of Architecture and Training for Model Performance and Robustness in Image Detection. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/32718.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.
