Integrative Genomic Modeling of Complex Traits using Pathway Analysis

dc.contributor.advisor

Mukherjee, Sayan

dc.contributor.author

Bennett, Brian D

dc.date.accessioned

2013-01-16T20:29:06Z

dc.date.available

2013-01-16T20:29:06Z

dc.date.issued

2012

dc.department

Computational Biology and Bioinformatics

dc.description.abstract

Understanding the root molecular causes driving complex traits is a fundamental challenge in genomics and genetics. Numerous studies have used variation in gene expression to understand complex traits, but the underlying genomic variation that contributes to these expression changes is not well understood. The overall goal of this work is to develop an integrative framework to better understand the genetic and molecular causes of complex traits, including complex diseases. In this work, I present a computational framework that I developed to integrate gene expression and other genomic data to identify biological differences between samples from opposing complex trait classes that are driven by expression changes and genomic variation. This framework combines analysis on the multi-gene biological pathway level with multi-task learning to build predictive models that also uncover pathways potentially relevant to the complex trait of interest. To validate this framework, I first performed a simulation study to test its predictive ability and to measure how well it uncovered pathways that contain genes that are both differentially expressed and genetically associated with a complex trait. The predictive performance of the multi-task model was found to be comparable to other similar methods. Also, multi-task learning, along with other methods that jointly considered pathway scores from both data sets, was able to better identify pathways with both genetic and expression differences related to the phenotype. I applied this framework to gene expression and genotype data from estrogen receptor (ER) positive and ER negative breast cancer samples. The top 15 predictive pathways from the multi-task model were all related to estrogen, steroids, cell signaling, or the cell cycle. 
The results from both the simulation studies and the breast cancer analysis suggest that this multi-task framework is useful for identifying biologically relevant pathways associated with a phenotype across multiple data types while retaining predictive performance comparable to that of similar methods.

dc.identifier.uri

https://hdl.handle.net/10161/6167

dc.subject

Bioinformatics

dc.title

Integrative Genomic Modeling of Complex Traits using Pathway Analysis

dc.type

Dissertation

Files

Original bundle

Name:
Bennett_duke_0066D_11705.pdf
Size:
1.44 MB
Format:
Adobe Portable Document Format
Name:
Bennett_duke_0066D_17/TableS1.xlsx
Size:
67.06 KB
Format:
Microsoft Excel
Name:
Bennett_duke_0066D_17/TableS2.xlsx
Size:
112.03 KB
Format:
Microsoft Excel
Name:
Bennett_duke_0066D_17/TableS3.xlsx
Size:
652.87 KB
Format:
Microsoft Excel
Name:
Bennett_duke_0066D_17/TableS4.xlsx
Size:
672.4 KB
Format:
Microsoft Excel
