Gene-Environment Interactions in Cardiovascular Disease
In this manuscript I seek to demonstrate the importance of gene-environment interactions in cardiovascular disease. The manuscript contains five studies, each of which contributes to our understanding of the joint impact of genetic variation and environmental exposures on cardiovascular disease: a candidate gene study of gene-smoking interactions associated with early-onset coronary artery disease; an epidemiological study of the association between traffic-related air pollution and cardiovascular disease; a genome-wide interaction study of gene-by-traffic-related air pollution interactions associated with peripheral arterial disease; a genome-wide interaction study of gene-by-traffic-related air pollution interactions associated with coronary atherosclerosis burden; and a method for analyzing associations between high-dimensional genomics datasets.
Smoking is a strong risk factor for coronary artery disease and may play a causative role in its incidence. Smoking has also been implicated as a reason for the heterogeneity observed in associations between genetic variants on chromosome 3 and coronary artery disease. I used a family-based early-onset coronary artery disease cohort (GENECARD) to study gene-smoking interactions. I also used data from three independent cohorts to perform a meta-analysis of gene-smoking interactions, focusing on the KALRN gene and the Rho-GTPase pathway. I found significant evidence for gene-smoking interactions involving variants in KALRN and other Rho-GTPase pathway genes on chromosome 3.
Though the estimated increase in incident cardiovascular disease or cardiovascular events due to air pollution exposure is modest at 3-5%, the ubiquitous nature of air pollution exposure means that it has a substantial population-level impact on cardiovascular disease. Historically, genome-wide interaction studies with air pollution have not yielded genome-wide significant interactions. However, by implementing statistical tools novel to this field, I have discovered significant interactions between genetic variants and traffic-related air pollution that are associated with cardiovascular diseases.
I studied interactions associated with peripheral arterial disease and with the number of diseased coronary vessels (an indicator of coronary artery disease burden) using race-stratified cohort study designs. For peripheral arterial disease, I observed that variants in both BMP8A and BMP2 showed evidence for interactions in both European-American and African-American cohorts. In BMP8A I uncovered the first genome-wide significant interaction with air pollution associated with cardiovascular disease. BMP2 gene expression is upregulated after exposure to black carbon, a major component of diesel exhaust, and coding variants within this gene showed evidence for interaction. For the number of diseased coronary vessels, I observed that variants in PIGR showed significant evidence for involvement in gene-by-traffic-related air pollution interactions, and that coding variation within PIGR was associated with coronary artery disease burden in a gene-by-traffic-related air pollution interaction model. As PIGR is involved in the immune response, it represents a strong candidate gene discovered via an unbiased genome-wide scan.
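For orientation, a gene-by-environment interaction model of the kind referred to above is commonly written as a regression with a product term. The form below is a generic sketch, not the specification used in these studies: the link function g, the genotype coding G, the exposure measure E, the covariate vector C, and all coefficient names are illustrative assumptions.

```latex
% Generic gene-by-environment interaction model (illustrative assumption).
% Y : outcome (e.g., disease status or number of diseased vessels)
% G : genotype at one variant (e.g., coded 0/1/2 copies of an allele)
% E : environmental exposure (e.g., smoking or traffic-related air pollution)
% C : vector of adjustment covariates
\[
  g\bigl(\mathbb{E}[Y \mid G, E, C]\bigr)
    = \beta_0 + \beta_G\, G + \beta_E\, E + \beta_{GE}\,(G \times E)
      + \boldsymbol{\gamma}^{\top} C ,
  \qquad H_0 : \beta_{GE} = 0 .
\]
```

Under this sketch, evidence for interaction is evidence against the null hypothesis that the product-term coefficient is zero, that is, that the genetic effect is the same at every exposure level.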
The use of high-dimensional data to study chronic disease is becoming commonplace. To analyze high-dimensional data without incurring heavy false-discovery rate penalties, the data are often summarized in a way that takes advantage of their correlation structure. Two common approaches are principal components analysis and canonical correlation analysis. However, neither of these approaches is appropriate when one preferentially desires to preserve structure within the data. To address this shortcoming, I developed constrained canonical correlation analysis (cCCA). With cCCA one can evaluate the correlation between two high-dimensional datasets while preferentially preserving structure in one of them. This is useful when studying multivariate outcomes such as cardiovascular disease using multivariate predictors such as air pollution. Additionally, cCCA can be used to create endophenotype factors that specifically explain the variation within a high-dimensional set of predictors (such as gene expression or metabolomics data) with respect to potential endophenotypes for cardiovascular disease, such as cholesterol measures.
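As background for the method described above, the following is a minimal sketch of ordinary (unconstrained) canonical correlation analysis, the baseline that cCCA is said to extend. It does not implement cCCA, whose constraint is not specified here. The function name `cca`, the small ridge term `reg`, and the simulated data are all illustrative assumptions.

```python
import numpy as np

def cca(X, Y, reg=1e-8):
    """Ordinary CCA via whitening and SVD (hypothetical helper, not cCCA).

    Returns canonical correlations s and weight matrices A (for X), B (for Y).
    `reg` is a small ridge term added for numerical stability (an assumption).
    """
    X = X - X.mean(axis=0)          # center each column
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance M = Lx^{-1} Sxy Ly^{-T}
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    A = np.linalg.solve(Lx.T, U)     # weights mapping X to canonical variates
    B = np.linalg.solve(Ly.T, Vt.T)  # weights mapping Y to canonical variates
    return s, A, B

# Simulated example: two datasets sharing a single latent factor z.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = z @ rng.normal(size=(1, 4)) + rng.normal(size=(500, 4))
Y = z @ rng.normal(size=(1, 3)) + rng.normal(size=(500, 3))

s, A, B = cca(X, Y)
print("canonical correlations:", np.round(s, 3))
```

Ordinary CCA treats both datasets symmetrically: each pair of weight vectors is chosen only to maximize correlation, so neither dataset's internal structure is favored. That symmetry is the limitation the paragraph above describes as the motivation for cCCA.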
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.