Stable Variable Selection for Sparse Linear Regression in a Non-uniqueness Regime

Date

2023

Abstract

This thesis investigates the LASSO in a non-uniqueness regime and its capacity for stable variable selection. We give a necessary and sufficient condition under which the LASSO has non-unique solutions, and we describe how these solutions behave geometrically: all solutions lie in the same simplex within the same orthant, and together they form a polytope in which each corner represents a most parsimonious collection of features, one that does not contain the collection of any other corner. Leveraging this geometric structure, we then address how to obtain these estimators in practice by proposing an efficient sampling algorithm that returns points uniformly distributed on the polytope.
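The geometry described above can be seen in the smallest possible case. The following is a minimal sketch, not taken from the thesis: it assumes a design matrix with one feature duplicated, a penalty level `lam = 0.1`, and the objective scaling 1/(2n), all chosen here for illustration. With a duplicated column, every split of the combined coefficient `c` between the two columns gives the same fit and the same l_1 norm, so the solution set is a line segment whose two corners each use a single feature.

```python
import numpy as np


def lasso_objective(X, y, beta, lam):
    """LASSO objective with the (assumed) 1/(2n) scaling of the squared loss."""
    n = X.shape[0]
    resid = y - X @ beta
    return resid @ resid / (2 * n) + lam * np.abs(beta).sum()


rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
X = np.column_stack([x, x])          # second feature duplicates the first
y = 2.0 * x + 0.1 * rng.normal(size=n)
lam = 0.1                            # illustrative penalty level

# Only the sum c = beta_1 + beta_2 affects the fit, so solve the
# one-feature LASSO for c in closed form by soft-thresholding.
rho = x @ y / n
z = x @ x / n
c = np.sign(rho) * max(abs(rho) - lam, 0.0) / z

# Points on the segment between the two corners (c, 0) and (0, c).
ts = np.linspace(0.0, 1.0, 5)
solutions = [np.array([t * c, (1.0 - t) * c]) for t in ts]
objectives = np.array([lasso_objective(X, y, b, lam) for b in solutions])
l1_norms = np.array([np.abs(b).sum() for b in solutions])

# A point with the same fit but mixed signs leaves the orthant and pays
# a larger l_1 penalty, so it is not a solution.
off_orthant = np.array([c + 1.0, -1.0])
off_objective = lasso_objective(X, y, off_orthant, lam)

print("objective spread on segment:", objectives.max() - objectives.min())
print("off-orthant objective exceeds optimum:", off_objective > objectives[0])
```

Every point on the segment shares the same objective value and the same l_1 norm, which is the one-dimensional version of "same simplex, same orthant"; the two endpoints are the corners, each selecting one of the two interchangeable features.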

We present a non-asymptotic analysis of the l_2 coefficient error and feature selection consistency for the non-unique LASSO, obtained by restricting the required conditions (the eigenvalue condition and the mutual incoherence condition) to a certain direction or to a subset of features. Our theoretical results show that, under strong assumptions, the non-unique LASSO is as theoretically efficient as the original LASSO. Moreover, when a dataset contains linearly combined features, numerical experiments show that the proposed non-unique LASSO selects variables more stably than other existing algorithms, particularly when proper tuning parameters can be selected.
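For orientation, the two conditions named above are commonly written as follows for the ordinary (unique) LASSO. This is the standard textbook form only, given as an assumption about notation: the symbols S (true support), c_min, and gamma are introduced here for illustration, and the restricted versions used in the thesis are not reproduced.

```latex
% LASSO estimator (1/(2n) scaling assumed)
\hat{\beta} \in \arg\min_{\beta \in \mathbb{R}^p}
  \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1

% Minimum eigenvalue condition on the support S
\lambda_{\min}\!\left( \frac{X_S^\top X_S}{n} \right) \ge c_{\min} > 0

% Mutual incoherence condition
\max_{j \in S^c} \left\lVert (X_S^\top X_S)^{-1} X_S^\top x_j \right\rVert_1
  \le 1 - \gamma, \qquad \gamma \in (0, 1]
```

When features are exact linear combinations of one another, conditions of this kind can fail as stated over all features, which is consistent with the abstract's approach of imposing them only along a certain direction or on a subset of features.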

Citation

Zhang, Xiaozhu (2023). Stable Variable Selection for Sparse Linear Regression in a Non-uniqueness Regime. Master's thesis, Duke University. Retrieved from https://hdl.handle.net/10161/27885.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.