Browsing by Subject "Human genetics"
Item Open Access Functional Evaluation of Causal Mutations Identified in Human Genetic Studies (2016) Lu, Yi-Fan
Human genetics has been experiencing a wave of discoveries thanks to the development of several technologies, such as genome-wide association studies (GWAS), whole-exome sequencing, and whole-genome sequencing. Despite the many newly discovered variants associated with human diseases, several key challenges emerge following genetic discovery. GWAS is good at identifying the locus associated with the patient phenotype. However, the actual causal variants responsible for the phenotype are often elusive. Another challenge in human genetics is that even when the causal mutations are already known, the underlying biological effect might remain largely ambiguous. Functional evaluation plays a key role in addressing these challenges, both in identifying the causal variants responsible for the phenotype and in developing biological insights from the disease-causing mutations.
We adopted various methods to characterize the effects of variants identified in human genetic studies, including patient genetic and phenotypic data, RNA chemistry, molecular biology, virology, and multi-electrode array and primary neuronal culture systems. Chapter 1 is a broad introduction to the motivation for and challenges of functional evaluation in human genetic studies, and to the background of several genetic discoveries, such as hepatitis C treatment response, for which we performed functional characterization.
Chapter 2 focuses on the characterization of causal variants following the GWAS of hepatitis C treatment response. We characterized a non-coding SNP (rs4803217) of IL28B (IFNL3) in high linkage disequilibrium (LD) with the discovery SNP identified in the GWAS. In this chapter, we used interdisciplinary approaches to characterize the effects of rs4803217 on RNA structure, disease association, and protein translation.
Chapter 3 describes another avenue of functional characterization following GWAS, focusing on the novel transcripts and proteins identified near the IL28B (IFNL3) locus. It has recently been speculated that this novel protein, named IFNL4, may affect HCV treatment response and clearance. In this chapter, we used molecular biology, virology, and patient genetic and phenotypic data to further characterize and understand the biology of IFNL4. The efforts in Chapters 2 and 3 provided new insights into the candidate causal variant(s) underlying the GWAS signal for HCV treatment response; however, more evidence is still required to establish the exact causal roles of these variants in the GWAS association.
Chapter 4 aims to characterize a mutation already known to cause a disease (seizure) in a mouse model. We demonstrate the potential use of a multi-electrode array (MEA) system for functional characterization and drug testing of mutations found in neurological diseases, such as seizure. Functional characterization in neurological diseases is relatively challenging, and the available systematic tools are limited. This chapter presents exploratory research and an example of establishing a system for broader use in functional characterization and translational opportunities for mutations found in neurological diseases.
Overall, this dissertation spans a range of challenges in functional evaluation in human genetics. Functional characterization of human mutations is expected to become more central in human genetics, because many biological questions remain to be answered after the explosion of human genetic discoveries. Recent advances in several technologies, including genome editing and pluripotent stem cells, are also expected to make new tools available for functional studies in human diseases.
Item Open Access Genetic and Environmental Contributions to Baseline Cognitive Ability and Cognitive Response to Topiramate (2010) Cirulli, Elizabeth Trilby
Although much research has focused on cognitive ability and the genetic and environmental factors that might influence it, this aspect of human nature is still far from being well understood. It has been well-established that certain factors such as age and education have significant impacts on performance on most cognitive tests, but the effects of variables such as cognitive pastimes and strategies used during testing have generally not been assessed. Additionally, no genetic variant has yet been unequivocally shown to influence the normal variation in cognitive ability of healthy individuals. Candidate gene studies of cognition have produced conflicting results that have not been replicable, and genome-wide association studies have not found common variants with large influences on this trait.
Here, we have recruited a large cohort of healthy volunteers (n=1,887) and administered a brief cognitive battery utilizing diverse, common, and well-known tests. In addition to providing standard demographic information, the subjects also filled out a questionnaire that was designed to assess novel factors such as whether they had seen the test before, in what cognitive pastimes they participated, and what strategies they had used during testing. Linear regression models were built to assess the effects of these variables on the test scores. I found that the addition of novel covariates to standard ones increased the percent of the variation in test score that was explained for all tests; for some tests, the increase was as high as 70%.
Next, I examined the effects of genetic variants on test scores. I first performed a genome-wide association study using the Illumina HumanHap 550 and 610 chips. These chips are designed to directly genotype or tag the vast majority of the common variants in the genome. Despite having 80% power to detect a common variant explaining at least 3-6% (depending on the test) of the variation in the trait, I did not find any genetic variants that were significantly associated after correction for multiple testing. This is in line with the general findings from GWA studies that single common variants have a limited impact on complex traits.
Because of the recent technological advances in next-generation sequencing and the apparently limited role of very common variants, many human geneticists are making a transition from genome-wide association study to whole-genome and whole-exome sequencing, which allow for the identification of rarer variants. Because these methods are currently costly, it is important to utilize study designs that have the best chance of finding causal variants in a small sample size. One such method is the extreme-trait design, where individuals from one or both ends of a trait distribution are sequenced and variants that are enriched in the group(s) are identified. Here, I have sequenced the exomes of 20 young individuals of European ethnicity: 10 that performed at the top of the distribution for the cognitive battery and 10 that performed at the bottom. I identified rare genetic variants that were enriched in one extreme group as compared to the other and performed follow-up genotyping of the best candidate variant that emerged from this analysis. Unfortunately, this variant was not found to be associated in a larger sample of individuals. This pilot study indicates that a larger sample size will be needed to identify variants enriched in cognition extremes.
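The extreme-trait design described above can be illustrated with a minimal Python sketch. It is not the author's pipeline: the function names, the sample and variant identifiers, and the simple carrier-count difference used as an enrichment score are all hypothetical choices made for illustration.

```python
def extreme_groups(scores, n):
    """Return (bottom, top): sets of the n lowest- and n highest-scoring samples.

    scores maps a hypothetical sample id to its cognitive battery score.
    Assumes 2 * n <= len(scores) so the two groups do not overlap.
    """
    ranked = sorted(scores, key=scores.get)
    return set(ranked[:n]), set(ranked[-n:])


def enrichment(carriers, bottom, top):
    """Map each variant to (carriers in top group) minus (carriers in bottom group).

    carriers maps a hypothetical variant id to the set of samples carrying it.
    A large positive or negative value flags a candidate for follow-up genotyping.
    """
    return {v: len(s & top) - len(s & bottom) for v, s in carriers.items()}


# Toy data: six samples, two variants (all values invented for illustration).
scores = {"s1": 1, "s2": 2, "s3": 3, "s4": 4, "s5": 5, "s6": 6}
carriers = {"var1": {"s3", "s5", "s6"}, "var2": {"s1"}}
bottom, top = extreme_groups(scores, 2)
diffs = enrichment(carriers, bottom, top)
```

With only 10 individuals per group, as in the pilot study, such differences are noisy, which is consistent with the need for follow-up genotyping in a larger sample.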
Finally, I assessed the effect of topiramate, an antiepileptic drug that causes marked side effects in certain cognitive areas in certain individuals, on some of the healthy volunteers (n=158) by giving them a 100 mg dose and then administering the cognitive test two hours later. I compared their scores at this testing session to those at the previous session and calculated the overall level to which they were affected by topiramate. I found that the topiramate blood levels, which were highly dependent on weight and the time from dosing to testing, varied widely between individuals after this acute dose, and that this variation explained 35% of the variability in topiramate response. A genome-wide association study of the remaining variability in topiramate response did not identify a genome-wide significant association.
In sum, I studied the contributions of both environmental and genetic variables to cognitive ability and cognitive response to topiramate. I found that I could identify environmental variables explaining large proportions of the variation in these traits, but that I could not identify genetic variants that influenced the traits. My analysis of genetic variants was for the most part restricted to the very common ones found on genotyping chips, and this and other studies have generally found that single common genetic variants do not have large effects on complex traits. As we move forward into studies that involve the sequencing of whole exomes and genomes, genetic variants with large effects on these complex traits may finally be found.
Item Open Access Genetic Dissection of Chiari Type I Malformation (2013) Markunas, Christina Ann
Chiari Type I Malformation (CMI) is a developmental disorder characterized by displacement of the cerebellar tonsils below the base of the skull, resulting in significant neurologic morbidity. While there are multiple proposed mechanisms for tonsillar herniation, "classical" CMI is thought to occur due to a compromised posterior cranial fossa (PF). As CMI patients display a high degree of clinical variability, it is hypothesized that this heterogeneous disorder has a complex etiology influenced by multiple genetic and environmental factors. Despite the fact that multiple lines of evidence support a genetic contribution to disease, no genes have been identified to date. Thus, the primary goal of this dissertation is to begin to dissect the genetic etiology of this important disorder and gain a better understanding of what factors contribute to the observed disease heterogeneity.
In order to address these goals, two studies and three distinct analytic approaches were carried out. In the first study, 367 individuals from 66 nonsyndromic, CMI multiplex families provided the basis for a whole genome linkage screen to identify genomic regions likely to harbor CMI susceptibility genes. Results from the linkage screen using the complete collection of families yielded limited evidence for linkage, likely due to genetic heterogeneity. Thus, two separate analytic approaches were applied to the data to reduce phenotypic and hopefully genetic heterogeneity, thereby increasing power to identify disease genes. In the first approach, families were stratified based on the presence or absence of connective tissue disorder (CTD) related conditions as hereditary CTDs are commonly associated with CMI and the presumed mechanism for tonsillar herniation differs between CMI patients with CTDs and "classical" CMI patients. Stratified analyses resulted in increased evidence for linkage to multiple genomic regions. Of particular interest were two regions located on chromosomes 8 and 12, both of which harbor growth differentiation factors, GDF6 and GDF3, which have been implicated in Klippel-Feil syndrome (KFS). In the second approach, a comprehensive evaluation of the genetic contribution to the PF was performed, followed by ordered subset analysis (OSA) using heritable, disease-relevant PF traits to identify increased evidence for linkage within subsets of families that were similar with respect to cranial base morphological traits. Much of the PF was found to be heritable and results from OSA identified multiple genomic regions showing increased evidence for linkage, including regions on chromosomes 1 and 22 which implicated several strong biological candidates for disease.
In the second study, 44 pediatric, surgical CMI patients were ascertained in order to establish disease subtypes using whole genome expression profiles generated from patient blood and dura mater samples and radiological data consisting of PF morphometrics. Sparse k-means clustering as well as a modified version were used to cluster patients using the biological and radiological data both separately and collectively. The most significant patient classes were identified from the pure biological clustering analyses. Further characterization of these classes implicated strong biological candidates involved in endochondral ossification from the dura analysis and a blood gene expression profile exhibiting a global down-regulation in protein synthesis and related pathways that may be associated with comorbid conditions.
Collectively, these studies established several strong biological disease candidates, as well as emphasized the need to better understand and account for disease heterogeneity, re-evaluate the current diagnostic criteria for CMI, and continue to investigate the use of endophenotypes, such as cranial base morphometrics, when conducting genetic studies.
Item Embargo Novel Methods and Mechanisms of Human Genetic Susceptibility to Infectious Disease (2023) Schott, Benjamin
Understanding the complex interactions between humans and their pathogens is key to the development of effective therapeutic strategies for infectious diseases. One approach to gain insight into host-pathogen interactions is to leverage natural human genetic variation. Traditionally, researchers have employed clinical GWAS (genome-wide association studies) of infected individuals to identify genetic variants that confer susceptibility to infection phenotypes. However, standard clinical GWAS approaches are hampered by issues with sampling, variation in exposure, and difficulty obtaining appropriately matched controls. In this thesis, I have leveraged cellular and molecular GWAS of lymphoblastoid cell lines (LCLs) to uncover mechanisms of immune suppression by Chlamydia trachomatis and identify novel regulators of influenza infection.
Previously, our lab developed Hi-throughput Human in vitrO Susceptibility Testing (Hi-HOST) to connect human genetic variation to infectious disease phenotypes measured by flow cytometry and immunoassays from cellular infection of LCLs. Applying Hi-HOST to Chlamydia trachomatis, an obligate intracellular bacterium, revealed a genome-wide significant association between rs2869462 and levels of a pro-inflammatory chemokine, CXCL10, measured in assay supernatants before and after infection. Curiously, we noticed wide variation in induction of CXCL10 that was not associated with rs2869462. Leveraging flow cytometric measurements of infected cells in a multivariate linear model revealed that the most highly infected LCLs showed a high degree of suppression of CXCL10. This indicated to us that C. trachomatis may be actively suppressing CXCL10 induction. We experimentally identified chlamydial protease-like activity factor (CPAF) as responsible for suppression of CXCL10.
Applying our multivariate modeling to a panel of 17 other cytokines revealed a similar signature of suppression for RANTES. However, this phenotype was not mediated by CPAF, indicating some degree of specificity of CPAF activity.
To further refine Hi-HOST with higher resolution phenotypes and integrated eQTL (expression quantitative trait loci) analyses all in a single infection, we developed single-cell Hi-HOST (scHi-HOST). scHi-HOST leverages single-cell RNA-sequencing of pooled LCLs to simultaneously identify alleles associated with gene expression and susceptibility to influenza A virus (IAV). scHi-HOST identified a common missense variant in ERAP1, rs27895, as associated with viral burden in LCLs. I confirmed this association experimentally using RNAi, overexpression, and small molecule inhibition of ERAP1 in vitro. We then analyzed a human flu challenge and found that volunteers with the risk allele of rs27895 had increased viral burden and worse symptoms over the course of their infection, indicating that our cellular findings may translate to human flu susceptibility as well.
Finally, to identify strain-specific susceptibility alleles, I applied scHi-HOST to six diverse strains of IAV. Analyses of these data suggested that infection with CA09 (the strain responsible for the 2009 “Swine Flu” pandemic, A/California/04/2009) produced distinct infection phenotypes and a distinct set of associated genetic variants relative to other strains. I identified rs7144228, an eQTL for HSP90AA1, as significantly associated with CA09 infection, but not with any other IAV strain. rs7144228 is specific to populations with African ancestry and contributes more broadly to population differences observed during IAV infection of LCLs. I also identified rs113816500, a SNP intronic to CTSH, as associated with all six strains of IAV, indicating a conserved host factor that influenza exploits to increase viral burden.
This study suggests that susceptibility to infection is not only dependent on the genotype of the affected individual but is also dependent on the genetic background of the virus.
Item Open Access States of Allelic Imbalance on the X Chromosomes in Human Females (2011) Kucera, Katerina S
Allelic imbalance, in which two alleles at a given locus exhibit differences in gene expression, chromatin composition and/or protein binding, is a widespread phenomenon in the human and other complex genomes. Most examples concern individual loci located more or less randomly around the genome and thus imply local and gene-specific mechanisms. However, a genomic or chromosomal basis for allelic imbalance is supported by multi-locus examples such as domains of imprinted genes, spanning ~1-2 Mb, or X chromosome inactivation, involving much of an entire chromosome. Recent studies have shown that genes on the two female X chromosomes exhibit a breadth of expression patterns ranging from complete silencing of one allele to fully balanced biallelic expression. Although evidence for heritability of allele-specific chromatin and expression patterns exists at individual loci, it is unknown whether heritability is also reflected in the chromosome-wide patterns of X inactivation.
The aim of this thesis is to elucidate the extent to which the widespread variable patterns of allelic imbalance on the human X chromosome in females are under genetic control and how access of the transcription machinery to the human inactive X chromosome in females is determined at a genomic level. For the set of variable genes examined in this study, the absence or presence of expression appears to be stochastic with respect to the population rather than abiding by strict genetic rules. Furthermore, variable gene expression that I have detected even among multiple clonal cell lines derived from a single individual suggests fluctuation in transcriptional machinery engagement. I find that, although expression at most genes on the human inactive X chromosome is repressed as a result of X inactivation, a number of loci are accessible to the transcriptional machinery. It appears that RNA Polymerase II is present at alleles on the inactive X even at the promoters of several silenced genes, indicating a potential for expression.
This thesis embodies a transition in the field of human X chromosome inactivation from gene by gene approaches used in the past to utilizing high-throughput technologies and applying follow-up analytic techniques to draw upon the vast data publicly available from large consortia projects.
Item Open Access Whole-exome Sequencing in Rare Diseases and Complex Traits: Analysis and Interpretation (2017) Zhu, Xiaolin
Next-generation sequencing (NGS), including whole-exome sequencing (WES) and whole-genome sequencing (WGS), has dramatically empowered the human genetic analysis of disease. This is clearly demonstrated by the exponential increase in the number of newly identified disease-causing genes since the early applications of WES to study human diseases [1]. In particular, WES has been extremely successful in determining the causal mutations of sporadic rare disorders that are intractable to genetic mapping due to lack of informative pedigrees, a large proportion of which are later shown to be caused by de novo mutations (DNMs). In the meantime, WES and WGS studies are increasingly being performed at population scale to understand the genetic basis of complex diseases and traits. The discovery power of WES and WGS is expected to increase further as more and more patients and healthy individuals are sequenced with the development of more powerful and less expensive sequencing technologies. Meanwhile, clinical WES and WGS are playing an increasingly important role in guiding physicians toward accurate diagnosis and treatment informed by genetic findings.
WES and WGS generate a massive amount of sequence data that includes both signal and noise, posing challenges that scientists and clinicians must overcome to fully realize the discovery and diagnostic power of NGS. Despite the many new disease-associated genes identified to date, methodologies used to analyze sequence data vary across research groups, and there seems to be no standardized analysis method like those used in genome-wide association studies. Indeed, the sequence data generated allow different ways to perform genetic analysis, and the approach taken can depend on the hypothesis about the underlying genetic architecture of the phenotype, which is often difficult to know precisely. Herein we explored and evaluated two primary approaches to analyzing and interpreting WES data in different contexts. The first is a trio interpretation framework based on variant/genotype filtering and bioinformatic signatures of genes and variants, which can be used to identify the highly penetrant causal mutations responsible for rare genetic conditions given known genotype-phenotype correlations and, at the same time, indicate the presence of novel disease-causing genes. The second is a formal statistical genetics framework comparing the burden of rare, high impact variants gene by gene in case and control subjects, which can be used to identify novel disease-causing genes for both Mendelian and complex traits. We introduced each of these two approaches and provided examples of their implementations.
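The gene-by-gene burden comparison described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the dissertation's implementation: it assumes each subject is collapsed to a carrier or non-carrier of at least one qualifying variant per gene, and it assumes a two-sided Fisher's exact test as the per-gene statistic. The function names and the gene and sample identifiers are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def collapse_by_gene(carriers, cases, controls):
    """Return gene -> p-value comparing carrier counts in cases versus controls.

    carriers maps a gene to the set of sample ids with >= 1 qualifying variant
    (what counts as "qualifying" is an assumption left to the analyst).
    """
    results = {}
    for gene, samples in carriers.items():
        a = len(samples & cases)     # case carriers
        c = len(samples & controls)  # control carriers
        results[gene] = fisher_exact_two_sided(a, len(cases) - a, c, len(controls) - c)
    return results


# Toy data, invented for illustration only.
cases = {"c1", "c2", "c3", "c4"}
controls = {"k1", "k2", "k3", "k4"}
carriers = {"GENE_A": {"c1", "c2", "c3", "k1"}, "GENE_B": set()}
pvals = collapse_by_gene(carriers, cases, controls)
```

In a real analysis the per-gene p-values would still need correction for the number of genes tested.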
In Chapter 2, we presented our trio interpretation framework and sought to determine the genetic diagnoses of undiagnosed genetic conditions using WES. By further developing a trio interpretation framework reported in our pilot study [2] and applying it to interpreting 119 trios (affected children and their unaffected biological parents), we achieved a diagnostic rate of 24% based on known genotype-phenotype correlations when the study was performed. Furthermore, we sought to extract bioinformatic signatures of causal variants and genes and showed that these bioinformatic signatures clearly pointed toward the pathogenic variants in novel disease-causing genes, some of which have been confirmed later by larger, independent studies. Our study highlights the importance of regularly reanalyzing clinical exomes to achieve the best diagnostic rate of WES by leveraging the latest databases of population-scale sequence (e.g., EVS and ExAC) and genotype-phenotype correlation (e.g., OMIM and ClinVar) as well as advancements in analytical methods. In our documentation of individual cases resolved by trio WES, we also discussed technical details relevant to how the genetic diagnosis was achieved in individual cases.
In Chapter 3, we presented our case-control gene-based collapsing framework focusing on ultra-rare variants predicted to have a high functional impact and sought to identify disease-causing genes responsible for two neurodevelopmental disorders: epileptic encephalopathy (EE) and periventricular nodular heterotopia (PVNH), both rare disorders presumably caused by highly penetrant mutations and characterized by genetic heterogeneity. Remarkably, both conditions had been subjected to DNM analyses by us or other groups, which were very successful and implicated novel causal genes for EE. We showed that our genetic association analysis framework successfully identified EE genes originally implicated by DNM analyses, and a novel PVNH gene that would not be identified by a DNM analysis, empirically highlighting the power of case-control gene-based collapsing analysis in identifying disease genes in rare neurodevelopmental disorders, in particular when we have an educated guess of the genetic architecture of the phenotype. In the third project, we extended our gene-based collapsing analysis to quantitative traits and sought to understand the genetic basis of a newly defined quantitative bone microstructure phenotype that was shown to be strongly associated with ethnicity in White and Asian women. We have a special interest in this presumably complex trait for two main reasons: it can be a potential “endophenotype” for important bone diseases including osteoporosis, and its strong association with ethnicity indicates genetic control. In this pilot study, we generated WES data on a small number of White and Asian women and applied our gene-based collapsing analysis framework in order to identify the genes associated with this quantitative trait.
A larger sample size would likely yield more definitive genetic discoveries, but our preliminary findings suggest that using WES to study the genetics of such “endophenotypes” and other novel clinically relevant phenotypes can identify causal variants that have a high functional impact and deliver novel biological insights.
In conclusion, we introduced a trio WES interpretation framework to identify the genetic diagnoses of unresolved rare genetic diseases and a gene-based collapsing analysis framework focusing on rare, functionally impactful variants to identify genes associated with dichotomous and quantitative traits, and evaluated these methods in multiple human genetic studies. The power of these methods will be boosted by their application to novel phenotypes and by the accruing WES and WGS data generated from both patients and healthy individuals.