Statistical Modeling of Genetic and Epigenetic Factors in Gene Structures and Transcriptional Enhancers

Date

2017

Abstract

Predicting the phenotypic effects of genetic variants is a major goal in modern genetics, with direct applicability in both the study of diseases in humans and animals, and the breeding of agriculturally important plants. Computational methods for interpreting genetic variants are still in their infancy, and rely heavily on annotations of functional genomic elements. Importantly, functional annotations inform the interpretation of genetic variants, but the locations and boundaries of such annotations can be altered by the presence of specific alleles, either singly or in combination, so that variant interpretation and genomic annotation should ideally be performed jointly. Such joint interpretation would enable predictions to account for the influence that one or more variants may have on the phenotypic impacts of other variants.

In this dissertation I describe computational methods for variant interpretation in both gene bodies and, separately, in transcriptional enhancers that regulate the expression of genes. In the case of gene bodies, I describe novel methods for joint modeling of multiple variants and gene structures. Whereas gene structure prediction methods have to date focused exclusively on annotation of reference genomes, I introduce the novel problem of annotating personal genomes of individuals or strains, and I describe and evaluate novel methods for addressing that problem. I show that these methods are able to predict complex changes in gene structures that result from genetic variants, that they are able to jointly interpret multiple variants that are not independent in their effects, and that predictions are supported by both RNA-seq data and patterns of intolerance to mutation across human populations.
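To make the idea of non-independent variants concrete, the following is a minimal sketch, not the dissertation's method. It assumes a toy coding sequence with no introns, and all names (`apply_variants`, `orf_is_intact`, the example sequence and variants) are hypothetical. It shows a deletion and an insertion that each break the reading frame when interpreted alone, but leave it intact when applied together to form the personal sequence.

```python
from typing import List, Tuple

# Hypothetical variant representation: (0-based position, ref allele, alt allele).
Variant = Tuple[int, str, str]

STOP_CODONS = {"TAA", "TAG", "TGA"}


def apply_variants(reference: str, variants: List[Variant]) -> str:
    """Build a personal sequence by applying non-overlapping variants to a reference."""
    pieces = []
    cursor = 0
    for pos, ref, alt in sorted(variants):
        if pos < cursor:
            raise ValueError("overlapping variants are not supported in this sketch")
        if reference[pos:pos + len(ref)] != ref:
            raise ValueError(f"reference allele mismatch at position {pos}")
        pieces.append(reference[cursor:pos])  # unchanged sequence before the variant
        pieces.append(alt)                    # substitute the alternate allele
        cursor = pos + len(ref)
    pieces.append(reference[cursor:])
    return "".join(pieces)


def orf_is_intact(cds: str) -> bool:
    """True if cds starts with ATG, has length divisible by 3, ends with a stop
    codon, and contains no premature in-frame stop codon."""
    if len(cds) % 3 != 0 or not cds.startswith("ATG"):
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return codons[-1] in STOP_CODONS and not any(c in STOP_CODONS for c in codons[:-1])


# Toy reference coding sequence (an assumption for illustration only).
reference = "ATGGCCAAAGGTCTGTAA"
deletion = (3, "GC", "G")    # 1-bp deletion
insertion = (9, "G", "GT")   # 1-bp insertion downstream

alone_del = orf_is_intact(apply_variants(reference, [deletion]))
alone_ins = orf_is_intact(apply_variants(reference, [insertion]))
together = orf_is_intact(apply_variants(reference, [deletion, insertion]))
print(alone_del, alone_ins, together)
```

Interpreting each variant against the reference alone would flag both as frameshifts, whereas evaluating the combined personal sequence shows that the second variant restores the frame disrupted by the first. Real gene structures also involve splice sites and multiple exons, which this sketch deliberately ignores.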

In the case of transcriptional enhancers, I describe experimental and associated computational methods for assessing the impacts of genetic variants on the ability of an enhancer to drive gene expression in an episomal reporter assay. I show that these methods are able to identify variants impacting enhancer function, and I show that the functional score assigned by these methods can be used to fine-map gene expression associations.
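As an illustration of what a per-variant functional score from a reporter assay could look like, here is a minimal sketch under stated assumptions. It is not the scoring method of the dissertation: it assumes that each allele has a count of reporter RNA reads and a count of input DNA reads, and it scores a variant as the difference in log2 RNA/DNA ratio between the alternate and reference alleles. The function names, the pseudocount, and the example counts are all hypothetical.

```python
import math
from typing import Dict


def log2_activity(rna_count: int, dna_count: int, pseudocount: float = 1.0) -> float:
    """Log2 ratio of reporter RNA to input DNA; the pseudocount avoids division by zero."""
    return math.log2((rna_count + pseudocount) / (dna_count + pseudocount))


def allelic_effect(ref: Dict[str, int], alt: Dict[str, int]) -> float:
    """Alternate-minus-reference difference in log2 activity (assumed scoring scheme)."""
    return log2_activity(alt["rna"], alt["dna"]) - log2_activity(ref["rna"], ref["dna"])


# Made-up counts for illustration only.
ref_counts = {"rna": 399, "dna": 99}
alt_counts = {"rna": 99, "dna": 99}
effect = allelic_effect(ref_counts, alt_counts)
print(effect)
```

A negative value under this scheme would suggest that the alternate allele reduces the enhancer's ability to drive reporter expression. A real analysis would also need replicates and a statistical model of count noise, which are omitted here.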

I also describe a statistical pattern recognition method for efficiently identifying stimulus-responsive regulatory elements genome-wide and parsing those elements into functional sub-components. I show that this model identifies stimulus-responsive enhancers with high accuracy, and that the sub-components it identifies are enriched for distinct sets of binding motifs for transcription factors known to mediate the response to glucocorticoid treatment. Applying this model to time-course data, I cluster predicted enhancers into sets with distinct trajectories of activity over time in response to glucocorticoid treatment. Using experimental chromatin conformation data, I show that these trajectories are associated with distinct patterns of expression for genes in physical association with these enhancers.

Citation

Majoros, William H (2017). Statistical Modeling of Genetic and Epigenetic Factors in Gene Structures and Transcriptional Enhancers. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/16348.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.