Uncovering causal noncoding variants among nervous system disease linked variants with evolutionary analysis, epigenomic annotation, and in vivo scSTARR-seq
Integrating information from multiple sources of genomic data, including inferred evolutionary history, epigenomic annotation, and high-throughput in vivo assays, can extend the conclusions drawn from Genome-Wide Association Studies (GWAS) and provide causal insight into human genomic variation. I collected all nervous system disease-associated variation from the human GWAS catalog and identified variants linked to human ancestor quickly evolved regions (HAQERs), regions that rapidly evolved under positive selection and are enriched in neurodevelopmental functional elements. I identified variants located in putative enhancer elements based on open chromatin and three-dimensional chromosome contacts with nearby promoters. To facilitate in vivo testing of these variants, I implemented haplotypeGenerator, an open-source program that infers haplotype sequences from individual variation data. I evaluated the efficiency of multiple methods of input library preparation and identified bulk transformation with maxiprep DNA isolation, combined with content validation by sequencing, as the most efficient method. The methods outlined in this work provide a framework for deeper interpretation of disease-linked variation and facilitate a better understanding of the genetic determinants of human disease.
Simpson, Shae (2023). Uncovering causal noncoding variants among nervous system disease linked variants with evolutionary analysis, epigenomic annotation, and in vivo scSTARR-seq. Honors thesis, Duke University. Retrieved from https://hdl.handle.net/10161/27105.
Duke's student scholarship is made available to the public under a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.