Bias Corrections for Genetic Association Studies with Applications in Pediatric Nephrotic Syndrome
Abstract
Nephrotic syndrome (NS) is one of the most common glomerular diseases in children, but its molecular mechanisms are poorly understood, and there are currently no reliable predictors of response to steroids, the first-line therapy, which fails in 20% of patients. To improve our understanding of the genetic basis of NS and define predictors of therapy response, we carried out genome-wide association studies (GWAS) of previously unstudied multi-ancestry cohorts of children with steroid-sensitive nephrotic syndrome (SSNS) and steroid-resistant nephrotic syndrome (SRNS), and developed polygenic risk scores (PRS) for steroid response. Our GWAS findings of genome-wide significant HLA class II variants across multiple ancestries indicate that SSNS, but not SRNS, is a predominantly immune-mediated disorder. These findings add to our knowledge of NS mechanisms and may one day improve prediction of therapy response from each patient's genome.
As part of the quality control process, we developed two methods to address the common challenge of merging multiethnic case-control genotype data from multiple platforms, where platform-specific biases may severely confound downstream analyses. Merging is a critical step for increasing sample size and reducing cost by reusing data from large projects or biobanks. The first method, Allele Frequency Filter (AF-filter), assumes a Binomial distribution of genotypes per ancestry and evaluates significance through a likelihood ratio test that combines matched ancestry population pairs; it requires homogeneous ancestries that overlap across platforms. The second method, Logistic Mixed Model Filter (LMM-filter), handles admixture and family structure without ancestry labels but requires individual-level data instead of summary statistics. GWAS iterations in a pseudo-case-control setting identify biased SNPs, which are categorized as 'flip' (alleles misaligned because of reverse complements) or 'remove' (SNP excluded). We validated these methods, along with the standard Hardy-Weinberg Equilibrium (HWE) test, on simulations, and applied them successfully to harmonize external controls from the 1000 Genomes Project with three major ancestry groups (African, European, and South Asian), which were used to perform NS GWAS and replication analyses.
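The likelihood ratio idea behind AF-filter can be illustrated with a minimal sketch. This is not the dissertation's implementation: it assumes that, for one SNP and each matched ancestry, the alternate-allele count on each platform is Binomial(2N, p), that the null hypothesis is a shared frequency across the two platforms, and that the per-ancestry statistics are summed and compared to a chi-square distribution with one degree of freedom per ancestry pair. The function names (`binom_loglik`, `chi2_sf`, `af_lrt`) and the input layout are hypothetical.

```python
import math

def binom_loglik(alt_count, n_alleles, freq):
    """Binomial log-likelihood, omitting the constant binomial coefficient."""
    ll = 0.0
    if alt_count > 0:
        ll += alt_count * math.log(freq)
    if n_alleles - alt_count > 0:
        ll += (n_alleles - alt_count) * math.log(1.0 - freq)
    return ll

def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square with integer df >= 1."""
    if stat <= 0.0:
        return 1.0
    half = stat / 2.0
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))  # closed form for df = 1
    else:
        k, q = 2, math.exp(-half)             # closed form for df = 2
    while k < df:
        # Recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        q += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return min(q, 1.0)

def af_lrt(pairs):
    """Likelihood ratio test for one SNP across matched ancestry pairs.

    pairs: list of (alt_a, n_a, alt_b, n_b), the alternate-allele count and
    total allele count (2 x individuals) on platforms A and B, one tuple per
    ancestry (assumed layout).
    """
    stat = 0.0
    for alt_a, n_a, alt_b, n_b in pairs:
        p_a = alt_a / n_a                       # platform-specific MLEs
        p_b = alt_b / n_b
        p_0 = (alt_a + alt_b) / (n_a + n_b)     # shared MLE under the null
        ll_alt = binom_loglik(alt_a, n_a, p_a) + binom_loglik(alt_b, n_b, p_b)
        ll_null = binom_loglik(alt_a, n_a, p_0) + binom_loglik(alt_b, n_b, p_0)
        stat += 2.0 * (ll_alt - ll_null)
    stat = max(stat, 0.0)                       # guard against rounding below zero
    return stat, chi2_sf(stat, len(pairs))

# Example: a SNP whose alleles look swapped between platforms in two ancestries.
stat, p_value = af_lrt([(40, 200, 320, 400), (30, 200, 340, 400)])
```

A very small p-value flags the SNP as platform-biased under these assumptions. Deciding between 'flip' and 'remove' would need a further step, such as repeating the test after complementing one platform's alleles, which this sketch does not attempt.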
In addition to standard association testing, we explore confounding due to cryptic relatedness between GWAS studies that are meta-analyzed, a challenge often ignored by researchers. We consider both sex-stratified and subpopulation-stratified analyses by testing simulated replicates and real data under different between-study relatedness scenarios, particularly population structure and recent family relatedness. We confirmed that family structure shared between studies causes inflation upon meta-analysis. The inflation is severe for family studies at low sample sizes; in population studies with up to n ~ 10,000 individuals it is observable but negligible, although we predict that larger n increases confounding for the same levels of relatedness. We recommend avoiding meta-analysis of studies that share the same populations, especially in sex-stratified analyses, because male and female populations are highly likely to share substantial cryptic relatedness.
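Why relatedness between studies inflates a meta-analysis can be seen in a small sketch. It assumes a weighted-sum (Stouffer-style) combination of per-study z-scores and takes the null correlation between study z-scores as a given input; neither the combination rule nor the correlation values come from the dissertation, and the function name `meta_z_variance` is hypothetical. Under the null the combined z-score has mean zero, so its variance equals the expected value of the squared statistic, and any value above 1 indicates inflation.

```python
import math

def meta_z_variance(weights, corr):
    """Null variance of a weighted-sum meta-analysis z-score.

    weights: per-study weights, e.g. square roots of sample sizes (assumed).
    corr: square matrix of null correlations between study z-scores, with
          ones on the diagonal (assumed input, e.g. induced by relatives
          shared across studies).
    """
    norm = math.sqrt(sum(w * w for w in weights))
    w = [x / norm for x in weights]  # normalize so independent studies give 1
    k = len(w)
    return sum(w[i] * w[j] * corr[i][j] for i in range(k) for j in range(k))

# Two equally weighted studies: independent versus mildly correlated.
independent = meta_z_variance([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
related = meta_z_variance([1.0, 1.0], [[1.0, 0.1], [0.1, 1.0]])
```

For two equally weighted studies the variance is 1 plus the correlation, so even a modest positive correlation between the studies' statistics inflates the combined test.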
Citation
Tu, Tiffany (2025). Bias Corrections for Genetic Association Studies with Applications in Pediatric Nephrotic Syndrome. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/32720.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.