The Effects of Metric Alignment on Computer Vision Models

Date

2024

Abstract

The widespread adoption of deep learning techniques has significantly improved performance across various tasks in computer vision. However, many questions remain about how these models will perform under diverse real-world conditions. In this dissertation, we focus on improving the reliability of models by aligning the training and evaluation metrics with the objectives of the user. We identify two cases where the current literature relies on oversimplified proxies and propose new objectives for each. In the first case, we propose a new training objective and find that it improves model performance. In the second case, we propose new evaluation metrics and use them to gain insights into the biases and robustness of current models.

First, we consider binary classification models trained under severe imbalance. Recent work has proposed techniques that mitigate the ill effects of imbalance on training by modifying the loss functions or optimization methods. These techniques each have hyperparameters that can be tuned to optimize performance on some metric (usually accuracy on a balanced test set). We find that different hyperparameters perform best at different recalls on the Precision-Recall curve. This implies a disconnect between how these models are trained (to optimize one scalar metric) and how they are evaluated (over a range of classification thresholds). We propose to align these objectives by training a single model over a distribution of loss-function hyperparameters via Loss Conditional Training (LCT), and find that this training regimen improves the performance of a variety of methods designed to address imbalance.
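The core LCT idea, training one model over a distribution of loss hyperparameters by feeding the sampled hyperparameter to the model as an input, can be sketched in toy form. Everything below is an illustrative assumption, not the dissertation's implementation: the focal-loss family with focusing parameter gamma, the logistic model, the conditioning-by-extra-feature mechanism, and all names are chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced binary data: roughly 5% positives (illustrative).
X = rng.normal(size=(1000, 2))
y = (rng.random(1000) < 0.05).astype(float)
X[y == 1] += 1.5  # shift positives so the task is learnable

# Model: logistic regression that also receives the sampled
# hyperparameter gamma as an input feature (the LCT conditioning).
w = np.zeros(4)  # 2 features + gamma + bias

def forward(Xb, gamma, w):
    feats = np.column_stack([Xb, np.full(len(Xb), gamma), np.ones(len(Xb))])
    z = np.clip(feats @ w, -30, 30)  # clip logits for numerical stability
    return 1.0 / (1.0 + np.exp(-z)), feats

def focal_grad(p, yb, feats, gamma):
    # Focal-loss-style gradient: weight the cross-entropy gradient by the
    # focusing term (1 - p_t)^gamma (a simplified, assumed variant).
    p_t = np.where(yb == 1, p, 1 - p)
    weight = (1 - p_t) ** gamma
    return feats.T @ (weight * (p - yb)) / len(yb)

lr = 0.5
for step in range(500):
    gamma = rng.uniform(0.0, 5.0)      # sample a hyperparameter each step
    idx = rng.integers(0, len(X), 64)  # minibatch
    p, feats = forward(X[idx], gamma, w)
    w -= lr * focal_grad(p, y[idx], feats, gamma)

# At test time, the single trained model can be queried at any gamma,
# tracing out different operating points without retraining.
for gamma in (0.0, 2.0, 5.0):
    p, _ = forward(X, gamma, w)
    print(f"gamma={gamma}: mean predicted positive prob = {p.mean():.3f}")
```

The key design point is that the hyperparameter is no longer fixed before training: one model covers the whole family, so the threshold/hyperparameter trade-off can be explored at evaluation time.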

Next, we consider neural image compression (NIC) models. Recent advances have produced NIC models that can outperform classical codecs, such as JPEG, on in-distribution data. However, in practice, these models may be deployed in settings with unseen distribution shifts. To bridge this gap, we propose novel datasets and metrics to evaluate the out-of-distribution performance of image compression methods. We then carry out a detailed performance comparison of several classical codecs and NIC variants, revealing intriguing findings that challenge our current understanding of NIC.
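The kind of rate-distortion comparison described above can be sketched with standard quantities: distortion measured by PSNR and rate measured in bits per pixel, computed for a codec on in-distribution versus shifted inputs. The "codec" below is a stand-in uniform quantizer with an entropy-based rate estimate, and the additive-noise shift is illustrative; neither corresponds to the dissertation's actual datasets, codecs, or proposed metrics.

```python
import numpy as np

def psnr(x, x_hat, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means less distortion."""
    mse = np.mean((x.astype(float) - x_hat.astype(float)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def toy_codec(img, step=16):
    """Stand-in 'codec': uniform quantization. Returns the reconstruction
    and a crude bits-per-pixel estimate from the empirical entropy of the
    quantized symbols (an idealized rate, not a real bitstream length)."""
    q = np.round(img / step).astype(int)
    recon = np.clip(q * step, 0, 255)
    _, counts = np.unique(q, return_counts=True)
    probs = counts / counts.sum()
    bpp = -np.sum(probs * np.log2(probs))  # entropy per symbol
    return recon, bpp

rng = np.random.default_rng(0)
in_dist = rng.integers(0, 256, size=(64, 64)).astype(float)
# One simple, assumed form of distribution shift: additive Gaussian noise.
shifted = np.clip(in_dist + rng.normal(0, 40, in_dist.shape), 0, 255)

for name, img in [("in-distribution", in_dist), ("shifted", shifted)]:
    recon, bpp = toy_codec(img)
    print(f"{name}: PSNR = {psnr(img, recon):.2f} dB at ~{bpp:.2f} bpp")
```

Comparing how each codec's rate-distortion operating point moves between the in-distribution and shifted inputs is one simple way to quantify out-of-distribution robustness.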

Subjects

Computer science, class imbalance, computer vision, image compression, machine learning, metric alignment

Citation

Lieberman, Kelsey (2024). The Effects of Metric Alignment on Computer Vision Models. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/32606.

Collections


Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.