Improved Genomic Selection using Vowpal Wabbit with Random Fourier Features
Abstract
Nonlinear regression models are often used in statistics and machine learning because
they can be more accurate than linear models. In this work, we present a novel modeling
framework that is computationally efficient for high-dimensional datasets and
predicts more accurately than most established state-of-the-art predictive models.
We couple a nonlinear random Fourier feature transformation of the data with an intrinsically
fast learning algorithm called Vowpal Wabbit (VW). The key idea we develop is that,
by introducing nonlinear structure into an otherwise linear framework, we are able to
account for all possible higher-order interactions between entries in a string. We examine
the utility of this nonlinear VW extension, in some detail, on an important problem
in statistical genetics: genomic selection (i.e., the prediction of phenotype from
genotype). We illustrate the benefits of our method and its robustness to the underlying
genetic architecture on a real dataset comprising 129 quantitative traits of heterogeneous
stock mice from the Wellcome Trust Centre for Human Genetics.
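
To make the random Fourier feature idea concrete, the following is a minimal sketch of the standard construction for a Gaussian kernel, not the thesis's actual pipeline. The abstract does not state which kernel, bandwidth, or number of features was used, so `bandwidth`, `n_features`, the helper names, and the two toy genotype vectors (coded as allele counts 0, 1, 2) are all assumptions made for illustration. The sketch shows that the inner product of the transformed vectors approximates the kernel value, which is what lets a purely linear learner capture nonlinear structure.

```python
import math
import random


def make_rff(input_dim, n_features, bandwidth, seed=0):
    """Return a function mapping x to random Fourier features.

    Standard construction for a Gaussian kernel (an assumption here;
    the abstract does not name the kernel used in the thesis).
    """
    rng = random.Random(seed)
    # Frequencies drawn from N(0, 1/bandwidth^2); phases uniform on [0, 2*pi].
    weights = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(input_dim)]
               for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def transform(x):
        return [scale * math.cos(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(weights, phases)]

    return transform


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gaussian_kernel(x, y, bandwidth):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


# Two hypothetical genotype vectors coded as allele counts (0, 1, 2).
x = [0, 1, 2, 1, 0]
y = [0, 2, 2, 1, 1]

transform = make_rff(input_dim=5, n_features=2000, bandwidth=2.0, seed=1)
approx = dot(transform(x), transform(y))       # linear inner product
exact = gaussian_kernel(x, y, bandwidth=2.0)   # nonlinear kernel value
```

Here `approx` should be close to `exact`, with the gap shrinking as `n_features` grows. In a setup like the one the abstract describes, the transformed vectors would then be passed to a fast linear learner such as VW; that step is omitted here because the abstract does not specify how it was configured.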
Type
Honors thesis
Department
Statistical Science
Permalink
https://hdl.handle.net/10161/14308
Citation
Zhang, Jaslyn (2017). Improved Genomic Selection using Vowpal Wabbit with Random Fourier Features. Honors thesis, Duke University. Retrieved from https://hdl.handle.net/10161/14308.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Undergraduate Honors Theses and Student papers
Works are deposited here by their authors and represent their research and opinions, not those of Duke University. Some materials and descriptions may include offensive content.