Finding regulatory DNA motifs using alignment-free evolutionary conservation information.
Abstract
As an increasing number of eukaryotic genomes are being sequenced, comparative studies
aimed at detecting regulatory elements in intergenic sequences are becoming more prevalent.
Most comparative methods for transcription factor (TF) binding site discovery make
use of global or local alignments of orthologous regulatory regions to assess whether
a particular DNA site is conserved across related organisms, and thus more likely
to be functional. Since binding sites are usually short, sometimes degenerate, and
often independent of orientation, alignment algorithms may not align them correctly.
Here, we present a novel, alignment-free approach for using conservation information
for TF binding site discovery. We relax the definition of conserved sites: we consider
a DNA site within a regulatory region to be conserved in an orthologous sequence if
it occurs anywhere in that sequence, irrespective of orientation. We use this definition
to derive informative priors over DNA sequence positions, and incorporate these priors
into a Gibbs sampling algorithm for motif discovery. Our approach is simple and fast.
It requires neither sequence alignments nor the phylogenetic relationships between
the orthologous sequences, yet it is more effective on real biological data than methods
that do.
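The relaxed definition of conservation described above can be illustrated with a short Python sketch. It is not the authors' implementation: the abstract does not say how the priors are derived, so the exact-match test, the fraction-of-orthologs score, the pseudocount, and all function names here (`is_conserved`, `conservation_scores`, `positional_prior`) are assumptions made only for illustration.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(site: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return site.translate(_COMPLEMENT)[::-1]


def is_conserved(site: str, ortholog: str) -> bool:
    """Relaxed conservation: the site occurs anywhere in the orthologous
    sequence, on either strand. Exact matching is an assumption here."""
    site, ortholog = site.upper(), ortholog.upper()
    return site in ortholog or reverse_complement(site) in ortholog


def conservation_scores(sequence: str, orthologs: List[str], width: int) -> List[float]:
    """For each start position, the fraction of orthologs in which the
    width-long site starting there is conserved (hypothetical scoring)."""
    scores = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        hits = sum(is_conserved(site, o) for o in orthologs)
        scores.append(hits / len(orthologs) if orthologs else 0.0)
    return scores


def positional_prior(scores: List[float], pseudocount: float = 0.1) -> List[float]:
    """Normalize scores into a distribution over start positions.
    The pseudocount keeps unconserved positions from getting zero mass."""
    weights = [s + pseudocount for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    scores = conservation_scores("AACGT", ["TTAACGTT", "CCCGTTCC"], width=4)
    print(scores)
    print(positional_prior(scores))
```

A Gibbs sampler for motif discovery could then weight candidate start positions by such a prior instead of a uniform one. No alignment or phylogenetic tree is needed to compute it, which is the point the abstract makes.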
Type
Journal article
Subject
Base Sequence
Binding Sites
Conserved Sequence
Molecular Sequence Data
Promoter Regions, Genetic
Sequence Alignment
Sequence Analysis, DNA
Transcription Factors
Permalink
https://hdl.handle.net/10161/15158
Published Version (Please cite this version)
10.1093/nar/gkp1166
Publication Info
Gordân, Raluca; Narlikar, Leelavati; & Hartemink, Alexander J (2010). Finding regulatory DNA motifs using alignment-free evolutionary conservation information. Nucleic Acids Res, 38(6). pp. e90. 10.1093/nar/gkp1166. Retrieved from https://hdl.handle.net/10161/15158.
This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.
Scholars@Duke
Alexander J. Hartemink
Professor of Computer Science
Computational biology, machine learning, Bayesian statistics, transcriptional regulation,
genomics and epigenomics, graphical models, Bayesian networks, hidden Markov models, systems
biology, computational neurobiology, classification, feature selection
