Functional annotation signatures of disease susceptibility loci improve SNP association analysis.
Abstract
BACKGROUND: Genetic association studies are conducted to discover genetic loci that
contribute to an inherited trait, identify the variants behind these associations
and ascertain their functional role in determining the phenotype. To date, functional
annotations of the genetic variants have rarely played more than an indirect role
in assessing evidence for association. Here, we demonstrate how these data can be
systematically integrated into an association study's analysis plan.
RESULTS: We developed
a Bayesian statistical model for the prior probability of phenotype-genotype association
that incorporates data from past association studies and publicly available functional
annotation data regarding the susceptibility variants under study. The model takes
the form of a binary regression of association status on a set of annotation variables
whose coefficients were estimated through an analysis of associated SNPs in the GWAS
Catalog (GC). The functional predictors examined included measures that have been
demonstrated to correlate with the association status of SNPs in the GC and some whose
utility in this regard is speculative: summaries of the UCSC Human Genome Browser
ENCODE super-track data, dbSNP function class, sequence conservation summaries, proximity
to genomic variants in the Database of Genomic Variants and known regulatory elements
in the Open Regulatory Annotation database, PolyPhen-2 probabilities and RegulomeDB
categories. Because we expected that only a fraction of the annotations would contribute
to predicting association, we employed a penalized likelihood method to reduce the
impact of non-informative predictors and evaluated the model's ability to predict
GC SNPs not used to construct the model. We show that the functional data alone are
predictive of a SNP's presence in the GC. Further, using data from a genome-wide study
of ovarian cancer, we demonstrate that their use as prior data when testing for association
is practical at the genome-wide scale and improves power to detect associations.
CONCLUSIONS: We show how diverse functional annotations can be efficiently combined to create 'functional
signatures' that predict the a priori odds of a variant's association to a trait and
how these signatures can be integrated into a standard genome-wide-scale association
analysis, resulting in improved power to detect truly associated variants.
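
The abstract describes two steps that a small worked example may make concrete: a binary regression turns a SNP's annotations into a prior probability of association, and that prior is then combined with the evidence from the association study. The sketch below is illustrative only and is not the authors' implementation. It assumes a logistic link for the binary regression, and it assumes the study evidence is summarized as a Bayes factor, so that posterior odds = Bayes factor × prior odds. The annotation names, coefficient values, intercept, and Bayes factor are all hypothetical and are not taken from the paper.

```python
import math


def prior_probability(annotations, coefficients, intercept):
    """Prior probability of association from a logistic regression on annotations.

    The logistic link is an assumption; the abstract says only "binary regression".
    """
    eta = intercept + sum(
        weight * annotations.get(name, 0.0) for name, weight in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-eta))


def posterior_probability(bayes_factor, prior_prob):
    """Combine study evidence (a Bayes factor) with the annotation-based prior."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical coefficients, standing in for values that would be estimated
# by penalized regression on GWAS Catalog SNPs.
COEFFICIENTS = {"conserved": 1.2, "in_regulatory_element": 0.8, "missense": 1.5}
INTERCEPT = -9.0  # hypothetical: most SNPs have very low prior odds

# Two hypothetical SNPs with identical study evidence but different annotations.
snp_functional = {"conserved": 1.0, "in_regulatory_element": 1.0, "missense": 0.0}
snp_plain = {"conserved": 0.0, "in_regulatory_element": 0.0, "missense": 0.0}
BAYES_FACTOR = 5000.0  # hypothetical evidence from the association study

prior_functional = prior_probability(snp_functional, COEFFICIENTS, INTERCEPT)
prior_plain = prior_probability(snp_plain, COEFFICIENTS, INTERCEPT)
posterior_functional = posterior_probability(BAYES_FACTOR, prior_functional)
posterior_plain = posterior_probability(BAYES_FACTOR, prior_plain)

print(f"functional SNP: prior={prior_functional:.2e}, posterior={posterior_functional:.3f}")
print(f"plain SNP:      prior={prior_plain:.2e}, posterior={posterior_plain:.3f}")
```

With equal study evidence, the SNP whose annotations raise its prior odds ends with the higher posterior probability. This is the sense in which a 'functional signature' can improve power to detect an association.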
Type
Journal article
Subject
Disease Susceptibility
Female
Genetic Association Studies
Genetic Loci
Genome-Wide Association Study
Humans
Molecular Sequence Annotation
Ovarian Neoplasms
Polymorphism, Single Nucleotide
Permalink
https://hdl.handle.net/10161/8882
Published Version (Please cite this version)
10.1186/1471-2164-15-398
Publication Info
Iversen, ES; Lipton, G; Clyde, MA; & Monteiro, ANA (2014). Functional annotation signatures of disease susceptibility loci improve SNP association analysis. BMC Genomics, 15. pp. 398. 10.1186/1471-2164-15-398. Retrieved from https://hdl.handle.net/10161/8882.
This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.
Scholars@Duke
Merlise Clyde
Professor of Statistical Science
Model uncertainty and choice in prediction and variable selection problems for linear,
generalized linear models and multivariate models. Bayesian Model Averaging. Prior
distributions for model selection and model averaging. Wavelets and adaptive kernel
non-parametric function estimation. Spatial statistics. Experimental design for
nonlinear models. Applications in proteomics, bioinformatics, astro-statistics,
air pollution and health effects, and environmental sciences.
Edwin Severin Iversen Jr.
Research Professor of Statistical Science
Bayesian statistical modeling with application to problems in genetic epidemiology
and cancer research; models for epidemiological risk assessment, including hierarchical
methods for combining related epidemiological studies; ascertainment corrections for
high risk family data; analysis of high-throughput genomic data sets.
Alphabetical list of authors with Scholars@Duke profiles.
