Mukherjee, Sayan
Zheng, Lingling
2013-12-16
2015-12-06
2013
https://hdl.handle.net/10161/8209
<p>The accumulation of high-throughput data from many sources has drawn considerable attention to methods for extracting meaningful information from massive data sets. More interesting questions concern how to combine disparate information, which goes beyond modeling sparsity and dimension reduction. This dissertation focuses on innovations in heterogeneous data integration.</p>
<p>Chapter 1 contextualizes this dissertation by introducing different aspects of meta-analysis and model frameworks for high-dimensional genomic data.</p>
<p>Chapter 2 introduces a novel technique, a joint Bayesian sparse factor analysis model, to vertically integrate multi-dimensional genomic data from different platforms.</p>
<p>Chapter 3 extends this model to a nonparametric Bayes formulation, which infers the number of factors directly through a model-based approach.</p>
<p>Chapter 4, by contrast, deals with horizontal integration of diverse gene expression data; the model infers pathway activities across various experimental conditions.</p>
<p>All of these methods are demonstrated in both simulation studies and real data applications in Chapters 2-4.</p>
<p>Finally, Chapter 5 summarizes the dissertation and discusses future directions.</p>
Bioinformatics; Statistics; Bayesian statistics; Biomarker; Cancer genomics; Epigenomics; Factor analysis model; integrated analysis
Bayesian meta-analysis models for heterogeneous genomics data
Dissertation