Bayesian interaction estimation with high-dimensional dependent predictors
Date
2021
Authors
Advisors
Journal Title
Journal ISSN
Volume Title
Repository Usage Stats
views
downloads
Abstract
Humans are constantly exposed to mixtures of different chemicals arising from environmental contamination. While certain compounds, such as heavy metals and mercury, are well known to be toxic, there are many complex mixtures whose health effects are still unknown. It is of fundamental public health importance to understand how these exposures interact to impact risk of disease and the health effects of cumulative exposure to multiple agents. The goal of this thesis is to build data-driven models to tackle major challenges in modern health applications, with a special interest in estimating statistical interactions among correlated exposures. In Chapter 1, we develop a flexible Gaussian process regression model (MixSelect) that allows to simultaneously estimate a complex nonparametric model and provide interpretability. A key component of this approach is the incorporation of a heredity constraint to only include interactions in the presence of main effects, effectively reducing dimensionality of the model search. Next, we focus our modelling effort on characterizing the joint variability of chemical exposures using factor models. In fact, chemicals usually co-occur in the environment or in synthetic mixtures; as a result, their exposure levels can be highly correlated. In Chapter 3, we build a Factor analysis for INteractions (FIN) framework that jointly provides dimensionality reduction in the chemical measurements and allows to estimate main effects and interactions. Through appropriate modifications of the factor modeling structure, FIN can accommodate higher order interactions and multivariate outcomes. Further, we extend FIN to survival analysis and exponential families in Chapter 4, as medical studies often include collect high-dimensional data and time-to-event outcomes. We address these cases through a joint factor analysis modeling approach in which latent factors underlying the predictors are included in a quadratic proportional hazards regression model, and we provide expressions for the induced coefficients on the covariates. In Chapter 5, we combine factor models and nonparametric regression. We build a copula factor model for the chemical exposures and use Bayesian B-splines for flexible dose-response modeling. Finally, in Chapter 6 we we propose a post-processing algorithm that allows for identification and interpretation of the factor loadings matrix and can be easily applied to the models described in the previous chapters.
Type
Department
Description
Provenance
Citation
Permalink
Citation
Ferrari, Federico (2021). Bayesian interaction estimation with high-dimensional dependent predictors. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/23121.
Collections
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.