The Spatial and Temporal Regulatory Code of Transcription Initiation in Drosophila melanogaster
Transcription initiation is a key component in the regulation of gene expression. Recent high-throughput sequencing techniques have enhanced our understanding of mammalian transcription by revealing narrow and broad patterns of transcription start sites (TSSs). Transcription initiation is central to determining condition specificity, because distinct repertoires of transcription factors (TFs), which assist in recruiting RNA polymerase II to the DNA, are present under different conditions. However, our understanding of the presence and spatiotemporal architecture of promoter patterns in the fruit fly remains in its infancy. Nucleosome organization and transcription initiation are both considered hallmarks of gene expression, but their cooperative regulation is not yet understood.
In this work, we applied a hierarchical clustering strategy to available 5' expressed sequence tags (ESTs) and developed an improved paired-end sequencing strategy to explore the transcription initiation landscape of the D. melanogaster genome. We distinguished three initiation patterns: 'Peaked or Narrow Peak (NP) TSSs', 'Broad Peak (BP) TSSs', and 'Broad TSS cluster groups or Weak Peak (WP) TSSs'. The promoters of peaked TSSs contained location-specific sequence elements and were bound by TATA Binding Protein (TBP), while the promoters of broad TSS cluster groups were associated with non-location-specific elements and were bound by the TATA-box related Factor 2 (TRF2).
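As an illustration only, the following minimal Python sketch shows one way 5' tag positions could be grouped into TSS clusters and labeled as NP, BP, or WP. It is not the procedure used in this work: the gap-based grouping (equivalent to cutting a single-linkage hierarchy at a fixed distance in one dimension), the function names, and every threshold (`max_gap`, `window`, `peaked_frac`, `broad_frac`) are assumptions chosen for the example.

```python
from collections import Counter


def cluster_positions(positions, max_gap=20):
    """Group 5' tag positions into clusters.

    A new cluster starts whenever the gap to the previous tag exceeds
    max_gap (hypothetical threshold). In one dimension this matches
    cutting a single-linkage hierarchical clustering at max_gap.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


def classify_cluster(cluster, window=2, peaked_frac=0.5, broad_frac=0.2):
    """Label a cluster by how concentrated its tags are around the mode.

    All cutoffs are illustrative assumptions, not values from the study.
    """
    counts = Counter(cluster)
    # Most frequent position; ties resolved toward the smallest coordinate.
    mode_pos, _ = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    near = sum(n for p, n in counts.items() if abs(p - mode_pos) <= window)
    frac = near / len(cluster)
    if frac >= peaked_frac:
        return "NP"  # peaked / narrow peak
    if frac >= broad_frac:
        return "BP"  # broad peak
    return "WP"      # broad TSS cluster group / weak peak


# Toy tag positions (made up for illustration).
tags = (
    [100] * 8 + [101, 103]                  # tightly focused initiation
    + list(range(200, 250, 5))              # evenly dispersed initiation
    + [300] * 3 + list(range(304, 332, 4))  # dispersed with a modest peak
)
labels = [classify_cluster(c) for c in cluster_positions(tags)]
```

On this toy input the three clusters are labeled NP, WP, and BP respectively; real data would require cutoffs tuned to tag depth.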
Available ESTs and a tiling array time series enabled us to show that TSSs had distinct associations with conditions, and that temporal patterns of embryonic activity differed across the majority of alternative promoters. Peaked promoters were associated with maternally inherited transcripts, whereas broad TSS cluster group promoters were more strongly associated with zygotic utilization. The paired-end sequencing strategy identified a large number of 5' capped transcripts originating from coding exons that were unlikely to be the result of alternative TSSs, but rather the product of post-transcriptional modifications.
We applied an innovative search program called FREE to embryo-, head-, and testes-specific core promoter sequences and identified 123 motifs: 16 novel and 107 supported by other motif sources. Motifs in the embryo-specific core promoters were found at positional hotspots relative to the TSS. We also discovered a family of oligos matching the Pause Button motif, which is associated with RNA pol II stalling.
Lastly, we analyzed nucleosome organization, chromatin structure, and insulators across the three promoter patterns in the fruit fly and human genomes. The weak peak (WP) promoters showed higher associations with H2A.Z, DNase Hypersensitivity Sites (DHS), H3K4 methylation, and the Class I insulators CTCF/BEAF32/CP190. Conversely, narrow peak (NP) promoters had higher associations with RNA pol II and GAF binding. Broad peak (BP) promoters exhibited a combination of features from both promoter patterns. Our study provides a comprehensive map of initiation sites and the conditions under which they are utilized in D. melanogaster. The presence of promoter-specific histone replacements, chromatin modifications, and insulator elements supports the existence of two divergent strategies of transcriptional regulation in higher eukaryotes. Together, these data illustrate the complex regulatory code of transcription initiation.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Duke Dissertations