Population Sequencing for Studying Natural and Artificial Variation in C. elegans
dc.contributor.advisor | Baugh, L Ryan | |
dc.contributor.author | Moore, Brad T. | |
dc.date.accessioned | 2017-05-16T17:29:02Z | |
dc.date.available | 2019-04-24T08:17:07Z | |
dc.date.issued | 2017 | |
dc.department | Computational Biology and Bioinformatics | |
dc.description.abstract | The advent of high-coverage, low-cost sequencing technologies has enabled newer and more powerful approaches in molecular and population genetics. Transposon sequencing, in which allele frequencies in a genome-saturated mutant population are measured before and after selection, functionally characterizes every gene in the genome in a single experiment. The approach has been successfully applied to a variety of phenotypes in a variety of unicellular systems: growth and motility in E. coli, synthetic genetic interactions in yeast, and in vitro pathogen resistance in mammalian cell lines. However, transposon insertion typically produces null alleles, which are valuable for identifying gene function, whereas evolutionary insight relies on identification of naturally occurring polymorphisms affecting the trait of interest. Genome-wide association studies (GWAS) can be used to study the effect of natural genetic variation on a trait, but they grow prohibitively expensive as the number of individuals to genotype and phenotype becomes large. Here I describe the application of transposon sequencing and pooled-sequencing GWAS in the whole metazoan model, Caenorhabditis elegans. Transposon sequencing has not previously been implemented in an animal model. I have sequenced a control library using our method, C. elegans transposon sequencing (Ce-TnSeq). We have constructed a new Mos1 transposon mutator strain that is more convenient to use than the existing strain and allows for extra-chromosomal insertions to be degraded by restriction digest. My preliminary results show that our method is qualitatively effective at identifying transposon insertion sites, but suffers from PCR duplication error. I propose to optimize the number of PCR cycles used to prepare the library and to include unique molecular identifiers (UMIs) in the library adaptor. I also show that the restriction digest is effective at removing extra-chromosomal array insertions from the library. 
I constructed simulation models to help design optimal Ce-TnSeq experiments with respect to statistical power for a proposed starvation survival assay. I considered many parameters affecting the design, including culture size, number of generations, expected effect size, sequencing coverage, and sample size. I show that the number of homozygous mutant animals in the screen is a critical factor in the design of experiments. I also saw diminishing returns with respect to increasing sample size and sequencing depth. These simulations will be invaluable in designing future Ce-TnSeq experiments and in identifying critical aspects of the protocol to optimize. We performed pooled sequencing (using restriction-site associated DNA sequencing) on a population of 95 wild isolates subjected to starvation. I identified strains that were resistant and sensitive to starvation, and we verified these results using traditional methods. We used our population sequencing data to perform an association study of starvation survival across the 95 strains, and identified two statistically significant quantitative trait loci. | |
dc.identifier.uri | ||
dc.subject | Genetics | |
dc.subject | Bioinformatics | |
dc.subject | Molecular biology | |
dc.subject | Caenorhabditis elegans | |
dc.subject | Genetics | |
dc.subject | mos1 | |
dc.subject | Population sequencing | |
dc.subject | Quantitative genetics | |
dc.subject | transposon sequencing | |
dc.title | Population Sequencing for Studying Natural and Artificial Variation in C. elegans | |
dc.type | Dissertation | |
duke.embargo.months | 23 |