Regulatory Elements and Gene Expression in Primates and Diverse Human Cell-types
After finishing a human genome reference sequence in 2003, the genomics community has
turned to the task of interpreting it. A primary focus is to identify and characterize not only
protein-coding genes, but all functional elements in the genome. The effort has identified
millions of regulatory elements across species and in hundreds of human cell-types. Nearly
all identified regulatory elements are found in non-coding DNA, suggesting a function
for previously unannotated sequence. The ability to identify regulatory DNA genome-wide
provides a new opportunity to understand gene regulation and to ask fundamental questions
in diverse areas of biology.
One such area is the effort to understand the molecular basis for phenotypic differences
between humans and other primates. These phenotypic differences are thought to be partially driven
by mutations in non-coding regulatory DNA that alter gene expression. This hypothesis
has been supported by differential gene expression analyses in general, but we have not
yet identified specific regulatory variants responsible for differences in transcription and
phenotype. I have worked to identify regulatory differences in the same cell-type isolated
from human, chimpanzee, and macaque. Most regulatory elements were conserved among
all three species, as expected based on their central role in regulating transcription.
However, several hundred regulatory elements were gained or lost on the lineages leading to
modern human and chimpanzee. Species-specific regulatory elements are enriched near
differentially expressed genes, are positively correlated with increased transcription, show
evidence of branch-specific positive selection, and overlap with active chromatin marks.
Species-specific sequence differences in transcription factor motifs found within this
regulatory DNA are linked with species-specific changes in chromatin accessibility. Together,
these findings indicate that species-specific regulatory elements contribute to transcriptional and
phenotypic differences among primate species.
Another fundamental function of regulatory elements is to define different cell-types in
multicellular organisms. Regulatory elements recruit transcription factors that modulate
gene expression distinctly across cell-types. In a study of 112 human cell-types, I classified
regulatory elements into clusters based on the tissue specificity of their regulatory signal. I then used
these clusters to uncover distinct associations between regulatory elements and promoters, CpG
islands, conserved elements, and transcription factor motif enrichment. Motif analysis
identified known and novel transcription factor binding motifs in cell-type-specific and
ubiquitous regulatory elements. I also developed a classifier that accurately predicts cell-
type lineage based on only 43 regulatory elements and evaluated the tissue of origin for
cancer cell-types. By correlating regulatory signal and gene expression, I predicted target
genes for more than 500,000 regulatory elements. Finally, I introduced a web resource to
enable researchers to explore these regulatory patterns and better understand how expression
is modulated within and across human cell-types.
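The idea of predicting target genes by correlating regulatory signal with gene expression can be illustrated with a minimal sketch. It assumes two matrices measured over the same set of cell-types, one of regulatory signal per element and one of expression per gene, and assigns each element the gene whose expression profile has the highest Pearson correlation with it. The function name `predict_target_genes`, the threshold `min_r`, and the toy data are hypothetical and are not taken from the dissertation; a real analysis would likely also restrict candidates by genomic distance, which this sketch omits.

```python
import numpy as np


def zscore_rows(matrix):
    """Standardize each row to mean 0 and (population) standard deviation 1.

    Assumes no row is constant; a constant row would divide by zero.
    """
    m = np.asarray(matrix, dtype=float)
    return (m - m.mean(axis=1, keepdims=True)) / m.std(axis=1, keepdims=True)


def predict_target_genes(element_signal, gene_expression, gene_names, min_r=0.7):
    """Pair each regulatory element with its best-correlated gene.

    element_signal:  array of shape (n_elements, n_cell_types)
    gene_expression: array of shape (n_genes, n_cell_types)
    gene_names:      list of n_genes labels
    min_r:           hypothetical cutoff; weaker correlations yield no prediction

    Returns one (gene_name_or_None, correlation) tuple per element.
    """
    e = zscore_rows(element_signal)
    g = zscore_rows(gene_expression)
    # The mean product of z-scores over cell-types is the Pearson correlation.
    r = e @ g.T / e.shape[1]
    best = r.argmax(axis=1)
    predictions = []
    for i, j in enumerate(best):
        name = gene_names[j] if r[i, j] >= min_r else None
        predictions.append((name, float(r[i, j])))
    return predictions


# Toy data: 2 elements and 3 genes measured across 4 cell-types.
signal = [[1, 2, 3, 4],
          [4, 3, 2, 1]]
expression = [[2, 4, 6, 8],
              [8, 6, 4, 2],
              [1, 0, 1, 0]]
names = ["GENE_A", "GENE_B", "GENE_C"]

predictions = predict_target_genes(signal, expression, names)
```

In this toy example the first element tracks GENE_A and the second tracks GENE_B. Correlation alone does not establish a regulatory link, so pairings of this kind are best read as candidates for follow-up.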
Regulation of gene expression is fundamental to life. This dissertation uses identified
regulatory DNA to better understand regulatory systems. In the context of either evolutionary
or developmental biology, understanding how differences in regulatory DNA contribute
to phenotype will be central to completely understanding human biology.
Transcription factor binding
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Duke Dissertations