Incorporating Scalability and Structural Constraints in Bayesian Modeling

Limited Access
This item is unavailable until:
2025-09-14

Date

2023

Abstract

Real-life modeling of probabilistic events often involves incorporating constraints on quantities of interest. Broadly, such constraints can be classified as either computational, arising from limits on computational feasibility or budget, or structural, arising from the inherent nature of the quantity being modeled. To that end, this work focuses on incorporating computational and structural constraints into modeling real-life data from a Bayesian perspective. In Chapter 2, we focus on the problem of Bayesian nonparametric density estimation. Bayesian nonparametric approaches are well studied and highly regarded in the existing literature for their flexibility, adaptability, accuracy, and ability to quantify uncertainty when estimating probability density functions. However, they often face major computational roadblocks because they rely on cumbersome Markov chain Monte Carlo (MCMC) algorithms. By leveraging aspects of nearest neighbor allocation and Bayesian mixture models, we develop a hybrid density estimation approach called Nearest Neighbor Dirichlet Mixtures (NN-DM). The NN-DM completely avoids MCMC and is embarrassingly parallel, yielding substantial computational gains over existing approaches while providing accurate point estimation and uncertainty quantification, both theoretically and empirically. In Chapter 3, we consider the problem of dose-response modeling in a public health scenario, where individuals are exposed to toxic chemicals. Most current approaches focus only on quantifying the marginal effects of these exposures on the response, ignoring possible interactions. As an alternative, our focus is on incorporating structural constraints in the form of modeling synergistic and antagonistic interactions between the chemicals.
We develop Synergistic Antagonistic Interaction Detection (SAID), a novel Bayesian approach that shrinks interactions toward being either synergistic or antagonistic. Instead of focusing only on linear effects, our model is flexible enough to allow non-linearity and scales well computationally with a moderate number of exposures. We apply our approach to an NHANES data set and uncover interactions between heavy metals affecting kidney function. Finally, in Chapter 4, we focus on the problem of Bayesian factor analysis. Bayesian factor models provide an elegant framework to model high-dimensional covariance matrices as the sum of two components, one low rank and another diagonal. Existing approaches that use MCMC to obtain posterior draws of the covariance matrix face significant challenges in terms of slow convergence and mode switching, due to non-identifiability of the factor model resulting from rotational invariance. We focus on a blessing of dimensionality phenomenon: as both the sample size and the number of dimensions increase, it becomes possible to obtain a plug-in estimate of the latent factors. Using this plug-in estimate, our proposed Factor Analysis with BLEssing of dimensionality (FABLE) approach provides a pseudo-posterior for the covariance matrix. FABLE is an embarrassingly parallel technique with immense computational benefits, completely bypassing MCMC and thus its pitfalls. We provide theoretical guarantees on the performance of FABLE and evaluate the approach in numerous simulation studies.
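As a small illustration of the covariance structure and the rotational invariance mentioned above, the following is a minimal sketch only and is not taken from the dissertation. It does not implement FABLE or any MCMC sampler. The sizes p and k, the random seed, and the names Lambda, psi, and Q are assumptions chosen for the example. It builds a covariance matrix as a low-rank part plus a diagonal part, then checks that rotating the loadings by an orthogonal matrix leaves the covariance unchanged, which is why the loadings are not identifiable from the covariance alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration: p observed dimensions, k latent factors.
p, k = 6, 2

# Loadings matrix (p x k) and positive idiosyncratic variances (length p).
Lambda = rng.normal(size=(p, k))
psi = rng.uniform(0.5, 1.5, size=p)

# Factor-model covariance: low-rank component plus diagonal component.
Sigma = Lambda @ Lambda.T + np.diag(psi)

# Draw an orthogonal k x k matrix Q from the QR decomposition of a random matrix.
Q, _ = np.linalg.qr(rng.normal(size=(k, k)))

# Rotated loadings give exactly the same covariance matrix,
# because (Lambda Q)(Lambda Q)^T = Lambda Q Q^T Lambda^T = Lambda Lambda^T.
Lambda_rot = Lambda @ Q
Sigma_rot = Lambda_rot @ Lambda_rot.T + np.diag(psi)

same_covariance = bool(np.allclose(Sigma, Sigma_rot))
low_rank = int(np.linalg.matrix_rank(Lambda @ Lambda.T))

print("Covariances match after rotation:", same_covariance)
print("Rank of the low-rank component:", low_rank)
```

Because Lambda and Lambda_rot imply the same covariance, a sampler that targets the loadings can move between such equivalent configurations, which is one way to picture the mode switching described in the abstract.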

Subjects

Statistics

Citation

Chattopadhyay, Shounak (2023). Incorporating Scalability and Structural Constraints in Bayesian Modeling. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/29156.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.