Robustness and Generalization Under Distribution Shifts
Date
2022
Authors
Bertran Lopez, Martin Andres
Abstract
Machine learning algorithms are applied in a wide variety of fields such as finance, healthcare, and entertainment. The objectives of these algorithms are varied, with two of the most common use cases being inference of a target variable from observations, and sequential decision-making to maximize a reward in the reinforcement learning setting. Regardless of the objective, machine learning algorithms are commonly trained on a finite dataset where each sample is collected independently from some data distribution emulating the real world, or, in the case of reinforcement learning, over a finite set of interactions with an environment simulating the real world.
One major concern is how to characterize the generalization of these objectives beyond their training data, measured as the discrepancy between performance on the training dataset or environment and performance in the real world. This concern is exacerbated by the fact that many applications suffer from distribution shift, a phenomenon in which the training distribution does not match the real-world environment. Algorithms that are not robust to distribution shifts are liable to exhibit unintended behaviours during deployment. In this work, we develop tools to minimize the risks posed by distribution shifts in a variety of settings. In the first part of this work, we propose and analyze techniques to deal with distribution shifts in the supervised learning setting, making the model's decisions either independent of or robust to certain factors of the input distribution, and show the efficacy of these techniques in dealing with distribution shift. We later examine the setting of sequential decision making, where we discuss how to reinterpret the reinforcement learning scenario in a way that allows generalization bounds from standard supervised learning to be applied to reinforcement learning. We then analyze how to learn representations that are invariant to task-irrelevant factors of the distribution, and demonstrate how this can improve performance in the presence of distribution shifts.
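The dissertation's own methods are detailed in the full text; as a point of reference, the following is a minimal, self-contained sketch of one standard way to make a learned representation independent of a nuisance factor, in the spirit the abstract describes: adversarial training with a gradient-reversal layer (in the style of domain-adversarial training, Ganin et al., 2016). The toy data, architecture, and hyperparameters are illustrative assumptions, not the author's actual method.

```python
# Illustrative sketch (not the dissertation's method): learning a representation
# that is predictive of the task label y while uninformative about a nuisance
# factor z (e.g., a domain identifier), via a gradient-reversal layer.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

encoder = nn.Sequential(nn.Linear(20, 64), nn.ReLU())  # shared representation
task_head = nn.Linear(64, 2)       # predicts the target label y
nuisance_head = nn.Linear(64, 2)   # adversary: tries to predict the nuisance z

opt = torch.optim.Adam(
    list(encoder.parameters())
    + list(task_head.parameters())
    + list(nuisance_head.parameters()),
    lr=1e-3,
)
loss_fn = nn.CrossEntropyLoss()

# Synthetic stand-in data: inputs x, task labels y, nuisance labels z.
x = torch.randn(128, 20)
y = torch.randint(0, 2, (128,))
z = torch.randint(0, 2, (128,))

for step in range(200):
    h = encoder(x)
    task_loss = loss_fn(task_head(h), y)
    # The nuisance head is trained to predict z, but gradient reversal sends the
    # *negated* gradient into the encoder, pushing h to carry no information
    # about z (here with an implicit reversal weight of 1).
    nuisance_loss = loss_fn(nuisance_head(GradReverse.apply(h)), z)
    loss = task_loss + nuisance_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this setup the adversary and the encoder play a minimax game: the representation is useful for the task exactly to the extent that it can drop the nuisance factor, which is one concrete route to the invariance-based robustness the abstract refers to.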
Citation
Bertran Lopez, Martin Andres (2022). Robustness and Generalization Under Distribution Shifts. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/25297.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.