Measuring the Impact of Architecture and Training for Model Performance and Robustness in Image Detection

Date

2025

Abstract

This research investigates the impact of model components on the performance of object detection, with a particular focus on robustness to variations in object scale. The primary goal is to understand how key changes in model architecture and training strategies influence a model's performance in computer vision tasks. This work introduces the first controlled evaluation of two state-of-the-art segmentation models, TransUnet and SwinUnet, in the context of processing overhead imagery, and explores the specific benefits of transformer-based architectures for vision tasks in this domain. We also propose a novel improvement on the current standard method for stratifying objects by scale and introduce $\eta$, the first theoretical metric for measuring a model's scale robustness, together with a practical estimator for $\eta$. In addition, we conduct a comprehensive comparison of scale robustness across many state-of-the-art models and perform rigorous ablation studies to identify the architectural and training factors that most significantly impact scale robustness. Our findings offer new insights into the role of model design in improving the performance of object detection systems in real-world, scale-variable environments.

Subjects

Computer engineering, Computer science, Statistics, Computer Vision, Machine Learning, Object Detection, Segmentation

Citation

Luzi, Francesco (2025). Measuring the Impact of Architecture and Training for Model Performance and Robustness in Image Detection. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/32718.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.