Bayesian interaction estimation with high-dimensional dependent predictors
dc.contributor.advisor | Dunson, David B | |
dc.contributor.author | Ferrari, Federico | |
dc.date.accessioned | 2021-05-19T18:09:06Z | |
dc.date.available | 2021-05-19T18:09:06Z | |
dc.date.issued | 2021 | |
dc.department | Statistical Science | |
dc.description.abstract | Humans are constantly exposed to mixtures of different chemicals arising from environmental contamination. While certain compounds, such as heavy metals and mercury, are well known to be toxic, there are many complex mixtures whose health effects are still unknown. It is of fundamental public health importance to understand how these exposures interact to impact risk of disease and the health effects of cumulative exposure to multiple agents. The goal of this thesis is to build data-driven models to tackle major challenges in modern health applications, with a special interest in estimating statistical interactions among correlated exposures. In Chapter 1, we develop a flexible Gaussian process regression model (MixSelect) that simultaneously estimates a complex nonparametric model and provides interpretability. A key component of this approach is the incorporation of a heredity constraint to only include interactions in the presence of main effects, effectively reducing the dimensionality of the model search. Next, we focus our modelling effort on characterizing the joint variability of chemical exposures using factor models. In fact, chemicals usually co-occur in the environment or in synthetic mixtures; as a result, their exposure levels can be highly correlated. In Chapter 3, we build a Factor analysis for INteractions (FIN) framework that jointly provides dimensionality reduction in the chemical measurements and allows estimation of main effects and interactions. Through appropriate modifications of the factor modeling structure, FIN can accommodate higher order interactions and multivariate outcomes. Further, we extend FIN to survival analysis and exponential families in Chapter 4, as medical studies often collect high-dimensional data and time-to-event outcomes.
We address these cases through a joint factor analysis modeling approach in which latent factors underlying the predictors are included in a quadratic proportional hazards regression model, and we provide expressions for the induced coefficients on the covariates. In Chapter 5, we combine factor models and nonparametric regression. We build a copula factor model for the chemical exposures and use Bayesian B-splines for flexible dose-response modeling. Finally, in Chapter 6 we propose a post-processing algorithm that allows for identification and interpretation of the factor loadings matrix and can be easily applied to the models described in the previous chapters. | |
dc.identifier.uri | ||
dc.subject | Statistics | |
dc.subject | Biostatistics | |
dc.subject | Factor models | |
dc.subject | Gaussian processes | |
dc.subject | Interaction Estimation | |
dc.title | Bayesian interaction estimation with high-dimensional dependent predictors | |
dc.type | Dissertation |
Files
Original bundle
- Name: Ferrari_duke_0066D_16271.pdf
- Size: 10.25 MB
- Format: Adobe Portable Document Format