Predicting Genome-wide DNA Methylation in Humans


Date

2014


Abstract

DNA methylation is one of the most studied and important epigenetic modifications in cells, playing a role in DNA transcription, splicing, and imprinting. Recently, advanced genome-wide DNA methylation profiling technologies have been developed, making it possible to conduct methylome-wide association studies. A problem with large-scale DNA methylation studies is that current technologies either target only a limited number of CpG sites in the genome or, in the case of whole-genome sequencing, are expensive and time-consuming for most laboratories. Computational prediction of CpG site-specific methylation levels is a cost-saving and time-saving alternative.

In this work, we found striking patterns of DNA methylation across the genome. We show that correlation among CpG sites decays rapidly within several hundred base pairs, in contrast to the linkage disequilibrium (LD) structure of genotypes, which holds for up to several kilobases. Using genomic features including neighboring CpG site methylation and genomic distance, genomic context such as CpG island regions, and genomic regulatory elements, we built random forest classifiers to predict CpG site methylation levels. Our approach achieves 92% prediction accuracy at single CpG sites in different genome-wide methylation datasets. The highest accuracy, 98%, is achieved for prediction within CpG island regions. Moreover, our method identifies genomic features that interact with DNA methylation, which improves our understanding of the mechanisms involved in DNA methylation modification and regulation.
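
To make the feature description above more concrete, the following is a minimal sketch of how neighbor-based features for one CpG site might be assembled before being passed to a classifier. It is an illustration under stated assumptions, not the thesis's implementation: the `CpGSite` record, the `neighbor_features` and `methylation_label` helpers, the use of only the single nearest upstream and downstream neighbor, and the 0.5 threshold for turning a methylation level into a class label are all hypothetical choices made here.

```python
from typing import NamedTuple, Optional


class CpGSite(NamedTuple):
    """Hypothetical record for one CpG site on a single chromosome."""
    position: int     # genomic coordinate in base pairs
    beta: float       # methylation level in [0, 1]
    in_island: bool   # True if the site lies inside a CpG island


def neighbor_features(sites: list, index: int) -> dict:
    """Build a feature dict for sites[index] from its nearest neighbors.

    `sites` must be sorted by position. Missing neighbors (at either
    end of the list) are reported as None.
    """
    site = sites[index]
    up: Optional[CpGSite] = sites[index - 1] if index > 0 else None
    down: Optional[CpGSite] = sites[index + 1] if index + 1 < len(sites) else None
    return {
        "upstream_beta": up.beta if up else None,
        "upstream_distance": site.position - up.position if up else None,
        "downstream_beta": down.beta if down else None,
        "downstream_distance": down.position - site.position if down else None,
        "in_island": site.in_island,
    }


def methylation_label(site: CpGSite, threshold: float = 0.5) -> int:
    """Binarize a methylation level; the 0.5 cutoff is an assumption."""
    return 1 if site.beta >= threshold else 0


# Toy data (made up for illustration, not taken from the thesis).
sites = [
    CpGSite(position=100, beta=0.9, in_island=True),
    CpGSite(position=250, beta=0.8, in_island=True),
    CpGSite(position=5000, beta=0.1, in_island=False),
]
features = neighbor_features(sites, 1)
label = methylation_label(sites[1])
```

Feature dicts of this kind, one per site, could then be converted to a numeric matrix and fitted with any random forest implementation. Regulatory-element annotations would be added as further columns in the same way as the `in_island` flag.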

Citation

Zhang, Weiwei (2014). Predicting Genome-wide DNA Methylation in Humans. Master's thesis, Duke University. Retrieved from https://hdl.handle.net/10161/9123.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.