Advances in Bayesian Hierarchical Models Motivated by Environmental Applications
Date
2023
Abstract
This thesis presents Bayesian hierarchical models designed to tackle challenges and accommodate insights from environmental applications. Environmental applications often involve high-dimensional and/or large functional data with complex dependence structure. It is of fundamental interest to build an interpretable statistical model that appropriately characterizes this dependence and generates accurate predictions.
First, Bayesian matrix completion (BMC) is developed to fill in missing elements of a large but sparse binary matrix of bioactivity across thousands of chemicals and assay endpoints. Sparsity is a well-known problem in toxicology data because it is not feasible to test all possible combinations of chemicals and assay endpoints, even with highly advanced technology. BMC tackles this sparsity through a Bayesian hierarchical framework that simultaneously models heteroscedastic errors and a nonparametric mean function with common latent factors, suggesting a more interpretable and broader definition of activity. An application to real data identifies the chemicals most likely to be active for human disease outcomes.
Next, the Barrier Overlap-Removal Acyclic directed graph Gaussian Process (BORA-GP) is proposed, a class of scalable nonstationary Gaussian processes (GPs) that can handle domains with complex geometries. The spatial distribution of measurements that are observed only in constrained domains can be significantly affected by physical barriers in those domains. Typical spatial GP models are inappropriate in this case because they may smooth incorrectly across the barriers. BORA-GP constructs sparse directed acyclic graphs (DAGs) whose neighbors conform to the barriers, enabling characterization of physically sensible dependence in constrained domains. We apply BORA-GP to predict sea surface salinity (SSS) in the Arctic Ocean.
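The idea of choosing DAG neighbors that conform to barriers can be illustrated with a minimal sketch. This is not the thesis's construction: it assumes, for illustration only, that barriers are given as 2-D line segments, that locations have a fixed ordering, and that each location takes as parents its nearest earlier locations whose straight-line path crosses no barrier. The names `crosses` and `barrier_neighbors` are hypothetical.

```python
from math import dist


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): +1, 0, or -1."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def crosses(a, b, c, d):
    """True if segments ab and cd cross (assumes no collinear or touching cases)."""
    return (_orient(a, b, c) != _orient(a, b, d)
            and _orient(c, d, a) != _orient(c, d, b))


def barrier_neighbors(points, barriers, m):
    """For each point i, return up to m nearest earlier points j < i whose
    straight path to i does not cross any barrier segment.
    Edges only point from earlier to later indices, so the graph is acyclic."""
    parents = []
    for i, p in enumerate(points):
        visible = [j for j in range(i)
                   if not any(crosses(p, points[j], c, d) for c, d in barriers)]
        visible.sort(key=lambda j: dist(p, points[j]))
        parents.append(visible[:m])
    return parents


# Toy example (made-up coordinates): a vertical wall at x = 1 separates two pairs of sites.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (2.0, 1.0)]
barriers = [((1.0, -1.0), (1.0, 2.0))]
parents = barrier_neighbors(points, barriers, m=2)
print(parents)
```

In this toy layout each site ends up with parents only on its own side of the wall, which is the behavior a barrier-aware neighbor search is meant to produce. A plain nearest-neighbor search would instead connect sites across the wall.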
Finally, we propose another class of nonstationary processes that characterize varying directional associations in space and time for point-referenced data. Our construction places a prior over possible directional edges within sparse DAGs, accounting for uncertainty in directional correlation patterns across a domain. The resulting Bag of DAGs processes (BAGs) provide interpretable nonstationarity and, because the DAGs are sparse, scale to large data. We analyze the spatiotemporal movement of fine particulate matter in California using BAGs, in which a directed edge represents a prevailing wind direction that induces associated covariance in the particulate matter.
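How a directional choice changes the edges of a sparse spatiotemporal DAG can be shown with a small sketch. It is an illustration under stated assumptions, not the BAG construction from the thesis: sites lie on a hypothetical regular grid, each grid cell is assigned one of a few made-up candidate wind directions, and a site's parents are the same cell and the upwind cell at the previous time step. In a full model the direction assignment would be given a prior and inferred rather than fixed.

```python
from itertools import product

# Hypothetical candidate directions: name -> (dx, dy) the wind travels per time step.
# "W" means wind blowing from the west, so the upwind cell is at x - 1.
DIRECTIONS = {"W": (1, 0), "S": (0, 1), "SW": (1, 1)}


def parents(site, direction, nx, ny):
    """Parents of site (x, y, t): the same cell and the upwind cell at time t - 1."""
    x, y, t = site
    if t == 0:
        return []  # first time step has no parents
    dx, dy = DIRECTIONS[direction]
    candidates = [(x, y, t - 1), (x - dx, y - dy, t - 1)]
    return [(px, py, pt) for px, py, pt in candidates
            if 0 <= px < nx and 0 <= py < ny]


def build_dag(assign, nx, ny, nt):
    """Map every site to its parent list given a per-cell direction assignment.
    Every edge points forward in time, so any assignment yields an acyclic graph."""
    return {s: parents(s, assign[s[:2]], nx, ny)
            for s in product(range(nx), range(ny), range(nt))}


nx, ny, nt = 2, 2, 2
cells = list(product(range(nx), range(ny)))
dag_w = build_dag({c: "W" for c in cells}, nx, ny, nt)
dag_sw = build_dag({c: "SW" for c in cells}, nx, ny, nt)
print(dag_w[(1, 0, 1)])   # parents under a westerly wind
print(dag_sw[(1, 1, 1)])  # parents under a south-westerly wind
```

Each site has at most two parents regardless of grid size, which is the kind of sparsity that keeps such a graph cheap to work with, and swapping the direction assigned to a cell changes only that cell's incoming edges.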
Citation
Jin, Bora (2023). Advances in Bayesian Hierarchical Models Motivated by Environmental Applications. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/27623.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.