Regulatory Elements and Gene Expression in Primates and Diverse Human Cell-types
Date
2013
Abstract
After finishing a human genome reference sequence in 2002, the genomics community has
turned to the task of interpreting it. A primary focus is to identify and characterize not only
protein-coding genes, but all functional elements in the genome. The effort has identified
millions of regulatory elements across species and in hundreds of human cell-types. Nearly
all identified regulatory elements are found in non-coding DNA, suggesting a function
for previously unannotated sequence. The ability to identify regulatory DNA genome-wide
provides a new opportunity to understand gene regulation and to ask fundamental questions
in diverse areas of biology.
One such area is the aim to understand the molecular basis for phenotypic differences
between humans and other primates. These phenotypic differences are thought to be driven in part
by mutations in non-coding regulatory DNA that alter gene expression. This hypothesis
has been supported by differential gene expression analyses in general, but we have not
yet identified specific regulatory variants responsible for differences in transcription and
phenotype. I have worked to identify regulatory differences in the same cell-type isolated
from human, chimpanzee, and macaque. Most regulatory elements were conserved among
all three species, as expected based on their central role in regulating transcription. However,
several hundred regulatory elements were gained or lost on the lineages leading to
modern human and chimpanzee. Species-specific regulatory elements are enriched near
differentially expressed genes, are positively correlated with increased transcription, show
evidence of branch-specific positive selection, and overlap with active chromatin marks.
Species-specific sequence differences in transcription factor motifs found within this regulatory
DNA are linked with species-specific changes in chromatin accessibility. Together,
these results indicate that species-specific regulatory elements contribute to transcriptional and
phenotypic differences among primate species.
Another fundamental function of regulatory elements is to define different cell-types in
multicellular organisms. Regulatory elements recruit transcription factors that modulate
gene expression distinctly across cell-types. In a study of 112 human cell-types, I classified
regulatory elements into clusters based on the tissue specificity of their regulatory signal. I then used
these to uncover distinct associations between regulatory elements and promoters, CpG-
islands, conserved elements, and transcription factor motif enrichment. Motif analysis
identified known and novel transcription factor binding motifs in cell-type-specific and
ubiquitous regulatory elements. I also developed a classifier that accurately predicts cell-
type lineage based on only 43 regulatory elements and evaluated the tissue of origin for
cancer cell-types. By correlating regulatory signal and gene expression, I predicted target
genes for more than 500,000 regulatory elements. Finally, I introduced a web resource to
enable researchers to explore these regulatory patterns and better understand how expression
is modulated within and across human cell-types.
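The target-gene prediction step described above, correlating regulatory signal with gene expression across cell-types, can be illustrated with a minimal sketch. This is not the dissertation's implementation: the abstract does not state which correlation measure or cutoff was used, so the Pearson correlation, the `min_r` threshold, the function names, and the toy data below are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation information
    return sxy / sqrt(sxx * syy)


def predict_targets(element_signal, gene_expression, min_r=0.7):
    """Return (gene, r) pairs whose expression tracks the element's signal.

    element_signal: signal for one regulatory element, one value per cell-type.
    gene_expression: dict of gene -> expression values, same cell-type order.
    min_r: hypothetical cutoff, not taken from the dissertation.
    """
    hits = []
    for gene, expr in gene_expression.items():
        r = pearson(element_signal, expr)
        if r >= min_r:
            hits.append((gene, r))
    # Strongest correlations first.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy data invented for illustration: four cell-types, three candidate genes.
signal = [0.1, 0.9, 0.8, 0.2]
candidates = {
    "GENE_A": [1.0, 9.0, 8.0, 2.0],  # rises and falls with the element
    "GENE_B": [5.0, 5.1, 4.9, 5.0],  # roughly flat across cell-types
    "GENE_C": [9.0, 1.0, 2.0, 8.0],  # anti-correlated with the element
}
targets = predict_targets(signal, candidates)
```

In this toy example only `GENE_A` passes the cutoff. A real analysis would also need to restrict candidates to genes near each element and account for multiple testing, neither of which is shown here.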
Regulation of gene expression is fundamental to life. This dissertation uses identified
regulatory DNA to better understand regulatory systems. In the context of either evolutionary
or developmental biology, understanding how differences in regulatory DNA contribute
to phenotype will be central to completely understanding human biology.
Citation
Sheffield, Nathan (2013). Regulatory Elements and Gene Expression in Primates and Diverse Human Cell-types. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/7157.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.