K-mer Based Methods for Measuring and Predicting DNA-Binding Specificity of Transcription Factors
dc.contributor.advisor | Gordân, Raluca | |
dc.contributor.author | Mielko, Zachery | |
dc.date.accessioned | 2023-06-08T18:23:33Z | |
dc.date.issued | 2023 | |
dc.department | Genetics and Genomics | |
dc.description.abstract | Transcription factors (TFs) are proteins that bind DNA based on its sequence and structure to regulate gene expression. They are fundamental components of genomic function, present in all known forms of life. Thus, understanding the conditions required for TF-DNA interactions is a longstanding and active field of study. With the advent of comprehensive k-mer based measurements using protein binding microarrays (PBMs), the binding profiles of hundreds of TFs have been measured. This dissertation addresses two major problems. First, the information from these comprehensive measurements is used to create simplistic models of binding that capture only the high-affinity range. In a biological context, weak binding sites are often the most important in developmental and regulatory processes and can be missed by models targeting high-affinity binding sites. Second, the vast majority of measurements are made on structurally unmodified DNA, whereas TF binding occurs in complex and dynamic systems where the DNA structure can be significantly altered by sources such as DNA damage. We first examine how DNA shape influences binding through the study of UV-induced photoproducts, DNA adducts formed by UV light exposure that distort the shape of pyrimidine dinucleotides. We developed a new k-mer based method for measuring TF binding to UV-irradiated DNA, UV-Bind. Using this technology, we find that the UV-induced changes in DNA structure from pyrimidine dinucleotide photoproducts can change the specificity of TFs. Using high-throughput k-mer measurements, we also found non-canonical sequences that show an increase in binding signal after UV irradiation. We then introduce a new algorithm for calling TF binding sites using k-mers, CtrlF-TF. CtrlF-TF takes high-throughput k-mer measurements from PBMs and outputs aligned, ranked consensus sites that can be searched in a genome. These sites compare favorably to traditional position weight matrix defined sites in both in vivo and in vitro benchmarks. | |
dc.identifier.uri | ||
dc.subject | Bioinformatics | |
dc.subject | Biochemistry | |
dc.subject | Genetics | |
dc.subject | K-mer | |
dc.subject | Photoproduct | |
dc.subject | Protein-DNA | |
dc.subject | Transcription factor | |
dc.subject | UV Damage | |
dc.title | K-mer Based Methods for Measuring and Predicting DNA-Binding Specificity of Transcription Factors | |
dc.type | Dissertation | |
duke.embargo.months | 24 | |
duke.embargo.release | 2025-05-24T00:00:00Z |