Regulatory Elements and Gene Expression in Primates and Diverse Human Cell-types

After finishing a human genome reference sequence in 2002, the genomics community has

turned to the task of interpreting it. A primary focus is to identify and characterize not only

protein-coding genes, but all functional elements in the genome. The effort has identified

millions of regulatory elements across species and in hundreds of human cell-types. Nearly

all identified regulatory elements are found in non-coding DNA, suggesting a function

for previously unannotated sequence. The ability to identify regulatory DNA genome-wide

provides a new opportunity to understand gene regulation and to ask fundamental questions

in diverse areas of biology.

One such area is the aim to understand the molecular basis for phenotypic differences

between humans and other primates. These phenotypic differences are partially driven

by mutations in non-coding regulatory DNA that alter gene expression. This hypothesis

has been supported by differential gene expression analyses in general, but we have not

yet identified specific regulatory variants responsible for differences in transcription and

phenotype. I have worked to identify regulatory differences in the same cell-type isolated

from human, chimpanzee, and macaque. Most regulatory elements were conserved among

all three species, as expected based on their central role in regulating transcription. However,

several hundred regulatory elements were gained or lost on the lineages leading to

modern human and chimpanzee. Species-specific regulatory elements are enriched near

differentially expressed genes, are positively correlated with increased transcription, show

evidence of branch-specific positive selection, and overlap with active chromatin marks.

Species-specific sequence differences in transcription factor motifs found within this regulatory

DNA are linked with species-specific changes in chromatin accessibility. Together,

these findings indicate that species-specific regulatory elements contribute to transcriptional and

phenotypic differences among primate species.

Another fundamental function of regulatory elements is to define different cell-types in

multicellular organisms. Regulatory elements recruit transcription factors that modulate

gene expression distinctly across cell-types. In a study of 112 human cell-types, I classified

regulatory elements into clusters based on regulatory signal tissue specificity. I then used

these to uncover distinct associations between regulatory elements and promoters, CpG

islands, conserved elements, and transcription factor motif enrichment. Motif analysis

identified known and novel transcription factor binding motifs in cell-type-specific and

ubiquitous regulatory elements. I also developed a classifier that accurately predicts cell-type

lineage based on only 43 regulatory elements and evaluated the tissue of origin for

cancer cell-types. By correlating regulatory signal and gene expression, I predicted target

genes for more than 500k regulatory elements. Finally, I introduced a web resource to

enable researchers to explore these regulatory patterns and better understand how expression

is modulated within and across human cell-types.
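
As a minimal illustration of the correlation idea described above (a sketch only, not the dissertation's actual pipeline), the example below scores candidate genes by how well their expression across cell-types tracks the signal of one regulatory element. The gene names, the signal values, the use of Pearson correlation, and the helper names `pearson` and `predict_target` are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def predict_target(element_signal, candidate_expression):
    """Return (gene, r) for the candidate whose expression best tracks the signal."""
    scores = {
        gene: pearson(element_signal, expr)
        for gene, expr in candidate_expression.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data: regulatory signal and expression in four hypothetical cell-types.
element_signal = [0.1, 0.9, 0.8, 0.2]
candidate_expression = {
    "GENE_A": [1.0, 9.0, 8.0, 2.0],  # rises and falls with the element
    "GENE_B": [5.0, 5.1, 4.9, 5.0],  # roughly constant
    "GENE_C": [8.0, 1.0, 2.0, 9.0],  # anti-correlated with the element
}

gene, r = predict_target(element_signal, candidate_expression)
print(gene, round(r, 3))
```

In practice the candidate set would likely be restricted, for example to genes within some genomic distance of the element, and a correlation threshold would be needed before calling a target; both choices are left open here.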

Regulation of gene expression is fundamental to life. This dissertation uses identified

regulatory DNA to better understand regulatory systems. In the context of either evolutionary

or developmental biology, understanding how differences in regulatory DNA contribute

to phenotype will be central to completely understanding human biology.





Sheffield, Nathan (2013). Regulatory Elements and Gene Expression in Primates and Diverse Human Cell-types. Dissertation, Duke University. Retrieved from


Duke's student scholarship is made available to the public using a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.