K-mer Based Methods for Measuring and Predicting DNA-Binding Specificity of Transcription Factors

Limited Access
This item is unavailable until:
2025-05-24

Date

2023

Abstract

Transcription factors (TFs) are proteins that bind DNA in a sequence- and structure-dependent manner to regulate gene expression. They are fundamental components of genomic function and are present in all known forms of life. Understanding the conditions required for TF-DNA interactions is therefore a longstanding and active field of study. With the advent of comprehensive k-mer based measurements using protein binding microarrays (PBMs), the binding profiles of hundreds of TFs have been measured. This dissertation addresses two major problems. First, the information from these comprehensive measurements is used to create simplistic models of binding that capture only the high-affinity range. In a biological context, weak binding sites are often the most important in developmental and regulatory processes, and they can be missed by models that target high-affinity binding sites. Second, the vast majority of measurements are made on structurally unmodified DNA, whereas TF binding occurs in complex and dynamic systems where the DNA structure can be significantly altered by sources such as DNA damage. We first examine how DNA shape influences binding by studying UV-induced photoproducts, DNA adducts formed by UV light exposure that distort the shape of pyrimidine dinucleotides. We developed UV-Bind, a new k-mer based method for measuring TF binding to UV-irradiated DNA. Using this technology, we find that the UV-induced changes in DNA structure from pyrimidine dinucleotide photoproducts can change the specificity of TFs. Using high-throughput k-mer measurements, we also find non-canonical sequences that show an increase in binding signal after UV irradiation. We then introduce CtrlF-TF, a new algorithm for calling TF binding sites using k-mers. CtrlF-TF takes high-throughput k-mer measurements from PBMs and outputs aligned, ranked consensus sites that can be searched in a genome. These sites compare favorably to traditional position weight matrix defined sites in in vivo and in vitro benchmarks.
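
To illustrate what searching a genome for ranked consensus sites could look like in practice, here is a minimal Python sketch. It is not the CtrlF-TF algorithm, whose details are not given in this abstract. The function names, the exact-match strategy, and the example sites and ranks are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(
    sequence: str, ranked_sites: Dict[str, int]
) -> List[Tuple[int, str, str, int]]:
    """Scan a sequence for exact matches to ranked sites on either strand.

    Returns (start, strand, site, rank) tuples sorted by rank, then position.
    Assumption: sites are plain strings and a lower rank means a stronger site.
    """
    sequence = sequence.upper()
    hits = []
    for site, rank in ranked_sites.items():
        width = len(site)
        rc = reverse_complement(site)
        for start in range(len(sequence) - width + 1):
            window = sequence[start:start + width]
            if window == site:
                hits.append((start, "+", site, rank))
            elif window == rc:
                # Palindromic sites are reported once, on the "+" strand.
                hits.append((start, "-", site, rank))
    hits.sort(key=lambda h: (h[3], h[0]))
    return hits


# Hypothetical ranked sites; not taken from the dissertation.
example_sites = {"CACGTG": 1, "CACGCG": 2}
example_hits = find_sites("ttCACGTGaaCGCGTGcc", example_sites)
```

Because the output is a ranked list of concrete sequences rather than a score threshold on a matrix, lower-ranked (weaker) sites stay explicitly visible in the results. A real genome-scale search would likely use an indexed or hashed lookup instead of this quadratic scan.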

Citation

Mielko, Zachery (2023). K-mer Based Methods for Measuring and Predicting DNA-Binding Specificity of Transcription Factors. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/27723.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.