Wavelet Regression using MapReduce and Analysis of Multiple Sclerosis Clinical Data
Abstract
This thesis addresses two problems: one concerning scalable methods, the other concerning the application of statistical methods to clinical data. In the first chapter, motivated by the growing number of "large $p$" datasets, we present a novel MapReduce framework for multivariate wavelet regression. We compare the time complexity of the proposed and conventional methods and show that the novel framework scales linearly in the dimension $p$ of the response matrix. Empirical results are consistent with our complexity analysis. This work has potential applications in the analysis of image or genomic data, where the dimensions are huge.
In the second chapter, we explore a clinical dataset on Multiple Sclerosis (MS) provided by Biogen, which comprises 579 actively managed MS patients enrolled at a single center for up to 5 years. Since no cure for MS is known, we and Biogen are developing statistical models that predict the progression of disability level as a therapeutic guide. Such disability can be roughly quantified by the EDSS (Expanded Disability Status Scale), and we therefore conduct predictive modelling of EDSS. Before arriving at these models, we perform exploratory data analysis and conduct predictive modelling of current EDSS based on measurements taken in the same year.
Citation
Song, Hanyu (2017). Wavelet Regression using MapReduce and Analysis of Multiple Sclerosis Clinical Data. Master's thesis, Duke University. Retrieved from https://hdl.handle.net/10161/15265.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.