Browsing by Author "Gelfand, Alan E"
Item Open Access A Bayesian Forward Simulation Approach to Establishing a Realistic Prior Model for Complex Geometrical Objects (2018) Wang, Yizheng
Geology is a descriptive science, which makes quantification difficult. We develop a Bayesian forward simulation approach to formulate a realistic prior model for geological images using the approximate Bayesian computation (ABC) method. In other words, our approach aims to select a set of representative images from a larger list of complex geometrical objects and provide a probability distribution on it. This allows geologists to start contributing their perspectives to the specification of a realistic prior model. We examine the proposed ABC approach on an experimental Delta dataset and show that, on the basis of selected representative images, the nature of the variability of the Delta can be statistically reproduced by means of IQSIM, a state-of-the-art multiple-point geostatistical (MPS) simulation algorithm. The results demonstrate that the proposed approach may have a broader spectrum of application. In addition, two different choices for the size of the prior, i.e., the number of representative images, are compared and discussed.
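As a rough illustration of the ABC idea named in the abstract above, the following minimal sketch runs a rejection sampler on a toy problem. It is not the thesis's procedure: the thesis works with geological images, whereas here the unknown is a single mean, and the function name `abc_rejection`, the uniform prior, the summary statistic, and the tolerance are all hypothetical choices made for the example.

```python
import random
import statistics


def abc_rejection(observed, simulate, prior_draw, summary, tol, n_draws, rng):
    """Keep prior draws whose simulated summary lies within tol of the observed summary."""
    target = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)          # candidate parameter from the prior
        fake = simulate(theta, rng)      # forward-simulate data under that candidate
        if abs(summary(fake) - target) <= tol:
            accepted.append(theta)       # retained draws approximate the posterior
    return accepted


# Toy setup (assumed for illustration): infer the mean of a unit-variance normal sample.
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]
posterior = abc_rejection(
    observed,
    simulate=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(50)],
    prior_draw=lambda r: r.uniform(-5.0, 5.0),
    summary=statistics.fmean,
    tol=0.1,
    n_draws=5000,
    rng=rng,
)
print(len(posterior), statistics.fmean(posterior))
```

The accepted draws play the role of an approximate posterior sample; a smaller tolerance tightens the approximation at the cost of fewer accepted draws.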
Item Open Access Bayesian Analysis of Spatial Point Patterns (2014) Leininger, Thomas Jeffrey
We explore the posterior inference available for Bayesian spatial point process models. In the literature, discussion of such models is usually focused on model fitting and rejecting complete spatial randomness, with model diagnostics and posterior inference often left as an afterthought. Posterior predictive point patterns are shown to be useful in performing model diagnostics and model selection, as well as providing a wide array of posterior model summaries. We prescribe Bayesian residuals and methods for cross-validation and model selection for Poisson processes, log-Gaussian Cox processes, Gibbs processes, and cluster processes. These novel approaches are demonstrated using existing datasets and simulation studies.
Item Open Access Bayesian Modeling and Computation for Complex Spatial Point Patterns (2017) Shirota, Shinichiro
This thesis focuses on solving some problems associated with complex spatial point patterns from both modeling and computational perspectives. Chapter 1 reviews spatial point patterns and introduces repulsive point processes. We also discuss some potential problems for spatial point patterns, to which this thesis contributes.
In Chapter 2, we begin with modeling the space and time dependence of crime events in San Francisco. We imagine event times as wrapped around a circle of circumference 24 hours, then introduce the circular dependence of crime times within a log Gaussian Cox process (LGCP). To construct a valid space and circular time LGCP, we propose valid separable and nonseparable space and circular time covariance matrices. We also compare the proposed models with a nonhomogeneous Poisson process (NHPP) through the model validation strategy for Cox processes. Our proposed models fit better and capture the space and circular time dependence of the intensity surface for crime events.
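The wrapping of event times around a 24-hour circle can be sketched as follows. This only illustrates the geometry (mapping clock times to angles and measuring separation around the circle); it does not reproduce the covariance constructions proposed in the thesis, and the function names are hypothetical.

```python
import math


def hours_to_angle(hour):
    """Map a clock time in hours to an angle on a circle of circumference 24 hours."""
    return (hour % 24.0) / 24.0 * 2.0 * math.pi


def circular_distance_hours(h1, h2):
    """Shortest separation between two clock times, measured around the circle (0 to 12 hours)."""
    diff = abs(h1 - h2) % 24.0
    return min(diff, 24.0 - diff)


# 23:00 and 01:00 are 22 hours apart on a line but only 2 hours apart on the circle.
print(circular_distance_hours(23.0, 1.0))
```

A dependence structure built on this circular separation treats late-night and early-morning events as close in time, which a linear time axis would not.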
In Chapter 3, we propose a new Bayesian inference scheme for univariate and multivariate LGCPs. Although the LGCP is a flexible class which can incorporate space and spatio-temporal clustering dependence, Bayesian inference for LGCPs is notoriously computationally tough because sampling of high dimensional Gaussian processes (GP) and their hyperparameters involves high correlation and requires some matrix factorization of high dimensional covariance functions. We tackle this problem by considering a separable inference scheme for the GP and hyperparameters. Our approach utilizes pseudo-marginal Markov chain Monte Carlo (MCMC) and estimates the approximate marginal posterior distribution of parameters. Given approximate posterior samples of parameters, efficient sampling of the high dimensional GP is available. We demonstrate the performance of our algorithm by comparing it with preceding MCMC algorithms and show better performance with respect to computational time and accuracy of parameter recovery for simulated datasets.
In Chapter 4, we develop an approximate Bayesian computation (ABC) scheme for two types of repulsive point processes, Gibbs point processes and determinantal point processes, for which exact Bayesian inference is unavailable or involves poor mixing. Although these processes have been investigated in the mathematics and physics communities, exact inference for these processes is generally unavailable. Furthermore, straightforward model comparison between the two types of processes has not been investigated, though the two might show different repulsive patterns. We propose an ABC algorithm for these processes; this approach enables us to compare both processes under a unified approximation strategy. We demonstrate the recovery of parameters and true model classes through a simulation study. The results suggest that the true model can be recovered when the number of points is relatively large.
In Chapter 5, we propose some models for origin-destination point patterns motivated by two car theft datasets. Both datasets have theft (origin) and corresponding recovery (destination) location information. The first consists of theft and recovery locations in the Neza region of Mexico. Although this dataset has some covariate information, some of the recovery points are located outside the Neza region and some are missing. The other consists of theft and recovery locations in Belo Horizonte in Brazil. Although all events have origin and destination information within Belo Horizonte, covariate information is unavailable. We suggest four modeling directions for both datasets, which differ in how they incorporate the spatial dependence between origin and destination pairs. The proposed models are compared with independent models and show superior performance.
Item Open Access Bayesian Modeling for Annual Abundance in Ecological Communities Incorporating Zero-Inflation (2022) Tang, Becky
In this dissertation, we present models that are developed to accommodate challenges and advance insights from modeling ecological abundance data. All the models presented in this work are fit within a Bayesian hierarchical framework. Species distribution models (SDMs) relate observed species abundance or occurrence data to geographically referenced environmental variables. In this dissertation, we focus mainly on multi-species or joint SDMs to incorporate the dependence between multiple species. We provide a dynamic mechanistic modeling framework that combines several biological and physiological processes that are known to operate within a species community. More specifically, we include the processes of local species growth as well as species movement within a geographic region. The mechanisms are represented as parameters to be estimated in our model. As an illustrative example, we first apply our model to the citizen science dataset eBird. We then provide an application to fisheries data from the Northeast Fisheries Science Center that develops a richer model for species redistribution that incorporates time-varying environmental covariates.
As ecological data often exhibit a high incidence of zeros, we also develop models that address the issue of zero-inflation. While there is a wealth of literature on zero-inflated models for count data, we focus on data with continuous support. The familiar Tobit model accommodates positive continuous data with an excess of zeros, but does not allow for multiple interpretations of a zero as the also-familiar zero-inflated Poisson model does. We address this gap in the literature by first providing spatial and non-spatial zero-inflated Beta (ZIB) regression models for data that lie on the unit interval. We also provide a multivariate zero-inflated Tobit (MVT ZI-Tobit) regression model that can capture dependence between elements at a given observation index at multiple stages of the model. For both the ZIB and MVT ZI-Tobit, we present model comparison metrics for predictive performance that specifically target a model’s ability to capture zeros or dependence between observation elements. We apply our ZIB and MVT ZI-Tobit models to percent cover of plant species in the Cape Floristic Region and total basal area of trees using Forest Inventory Analysis data, respectively.
Item Open Access Bayesian Spatial Quantile Regression (2010) Lum, Kristian
Spatial quantile regression is the combination of two separate and individually well-developed ideas that, to date, has barely been explored. Quantile regression seeks to model each quantile of an outcome distribution, whether separately or jointly, conditional upon covariates. Spatial methods have been developed for instances when spatial dependence ought to be incorporated into the model, whether to adjust for the decreased effective sample size that comes with highly correlated data or to allow the ability to create a model-based spatial surface that interpolates between the data collected. Combining the spatial methods with quantile regression, this dissertation proposes and studies the properties of several process models for quantile regression that incorporate spatial dependence. In each chapter, we present an application for the model presented therein. In all cases, we are able to achieve improved check loss by incorporating a spatial component into the model.
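The check loss used above as the comparison criterion is the standard quantile-regression loss, which weights positive residuals by tau and negative residuals by 1 - tau. The sketch below evaluates it on a small made-up data set; the data values and function names are assumptions for illustration and are not taken from the dissertation.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * u when u >= 0, (tau - 1) * u when u < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def mean_check_loss(y, predictions, tau):
    """Average check loss of predicted tau-quantiles against observations."""
    return sum(check_loss(yi - qi, tau) for yi, qi in zip(y, predictions)) / len(y)


# Made-up observations; a constant prediction is scored at the median (tau = 0.5).
y = [1.0, 2.0, 3.0, 4.0, 10.0]
for q in (2.0, 3.0, 4.0):
    print(q, mean_check_loss(y, [q] * len(y), tau=0.5))
```

Among the three constant predictions, the sample median 3.0 gives the smallest average loss, which is the property that makes the check loss a natural score for quantile models.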
In Chapter 1, the introduction, we motivate this work by exploring several examples that demonstrate the utility of both quantile regression and spatial models separately.
In Chapter 2, we present the asymmetric Laplace process (ALP), a process model suitable for quantile regression. We derive several covariance properties of various specifications of this model and discuss the advantages and disadvantages of each option. As an example, we apply this model to real estate data.
In Chapter 3, we extend the ALP to accommodate large data sets by incorporating a predictive process covariance structure and sampling scheme into the ALP. By doing so, we create the asymmetric Laplace predictive process (ALPP), which we apply to a data set of approximately 3,000 births in the state of North Carolina in the year 2000. Here, interest lies primarily in the relationship between various maternal covariates and the lower tails of the distribution of birth weights.
In Chapter 4, we again extend the ALP, this time to incorporate a temporal component. We discuss several ways in which both continuous and discrete time can be included in the model. We further develop and outline the details of a discrete time spatial dynamic model. We apply this model to a data set of spatially and temporally indexed temperatures, given elevation.
In Chapter 5, we propose an alternative to the ALP, which re-scales a Gaussian process using two separate scale parameters. We investigate the properties of this double normal process (DNP), and present a simulation example to illustrate the utility (and disutility) of this model.
Item Open Access Improving the Modeling of Government Surveillance Data: a Case Study on Malaria in the Brazilian Amazon (2013) Valle, Denis
The study of the effect of the environment (e.g., climate and land use) on disease typically relies on aggregate disease data collected by the government surveillance network. The usual approach to analyzing these data, however, often ignores a) changes in sampling effort (i.e., total number of individuals examined), b) the fact that these data are biased towards symptomatic individuals, and c) the fact that the observations (e.g., individuals diagnosed and treated for the disease) often directly influence disease dynamics by decreasing infection prevalence. Here we highlight the consequences of ignoring the problems listed above and develop a novel modeling framework to circumvent them. We illustrate this modeling framework using simulated and real malaria data from the Western Brazilian Amazon.
Our simulations reveal that trends in the number of disease cases do not necessarily imply similar trends in infection prevalence or incidence, due to the strong influence of concurrent changes in sampling effort. Furthermore, we show that ignoring decreases in the pool of infected individuals due to the treatment of part of these individuals can significantly hinder inference on underlying patterns of infection incidence. We propose an innovative model that avoids the problems listed above. This model can be seen as a compromise between more phenomenological statistical models and more mechanistic disease dynamics models; in particular, a validation exercise reveals that the proposed model has higher out-of-sample predictive performance than either one of these alternative models. Our case study on malaria in the Brazilian Amazon reveals surprising patterns in infection prevalence and incidence, which might be partially attributed to seasonal rainfall variation.
We have proposed and applied a novel modeling approach that avoids problems that have plagued several earlier analyses of government surveillance disease data. We illustrate how ignoring these problems can significantly hinder inference on the effect of environmental factors on disease dynamics. This modeling approach is likely to be useful for the modeling of various diseases using government surveillance data.
Item Open Access Integral Projection Models: Simulation Studies and Sensitivity Analyses (2014) Zhu, Kai
The integral projection model (IPM) is an important tool for studying population dynamics and demography in ecology. Traditional IPMs are handled first with a fitting stage at individual-level transitions, then with a projection stage at population-level distributions. Here we adopt a new IPM framework that coherently focuses on population-level size distributions using point pattern theory.
We conduct simulation studies and sensitivity analyses to explore the properties of this new IPM framework. Under certain settings of demographic functions and parameters, we conduct two simulation studies by deterministically projecting population dynamics and stochastically generating point patterns. Assuming stationarity at equilibrium state, we then derive analytical solutions for the sensitivity of the stable stage size distribution to kernel demographic parameters. We apply the sensitivity analyses to the two simulation studies. Demography, population dynamics, prior vs. posterior parameters, and sensitivities are compared among parameter settings and simulations.
For the two simulation studies, we find that parameter recovery is challenging except under tight priors, suggesting possible parameter identification problems. These issues could be partly resolved by sensitivity analyses, which identify parameters that are most sensitive to the stable stage size distributions. In summary, we find that population-level data alone may be insufficient for inferring demography, and we will integrate both individual- and population-level data in the future.
Item Open Access Kernel Averaged Predictors for Space and Space-Time Processes (2011) Heaton, Matthew
In many spatio-temporal applications a vector of covariates is measured alongside a spatio-temporal response. In such cases, the purpose of the statistical model is to quantify the change, in expectation or otherwise, in the response due to a change in the predictors while adequately accounting for the spatio-temporal structure of the response, the predictors, or both. The most common approach for building such a model is to confine the relationship between the response and the predictors to a single spatio-temporal coordinate. For spatio-temporal problems, however, the relationship between the response and predictors may not be so confined. For example, spatial models are often used to quantify the effect of pollution exposure on mortality. Yet, an unknown lag exists between time of exposure to pollutants and mortality. Furthermore, due to mobility and atmospheric movement, a spatial lag between pollution concentration and mortality may also exist (e.g. subjects may live in the suburbs where pollution levels are low but work in the city where pollution levels are high).
The contribution of this thesis is to propose a hierarchical modeling framework which captures complex spatio-temporal relationships between responses and covariates. Specifically, the models proposed here use kernels to capture spatial and/or temporal lagged effects. Several forms of kernels are proposed with varying degrees of complexity. In each case, however, the kernels are assumed to be parametric with parameters that are easily interpretable and estimable from the data. Full distributional results are given for the Gaussian setting along with consequences of model misspecification. The methods are shown to be effective in understanding the complex relationship between responses and covariates through various simulated examples and analyses of physical data sets.
Item Open Access Latent Stick-Breaking Processes. (J Am Stat Assoc, 2010-04-01) Rodríguez, Abel; Dunson, David B; Gelfand, Alan E
We develop a model for stochastic processes with random marginal distributions. Our model relies on a stick-breaking construction for the marginal distribution of the process, and introduces dependence across locations by using a latent Gaussian copula model as the mechanism for selecting the atoms. The resulting latent stick-breaking process (LaSBP) induces a random partition of the index space, with points closer in space having a higher probability of being in the same cluster. We develop an efficient and straightforward Markov chain Monte Carlo (MCMC) algorithm for computation and discuss applications in financial econometrics and ecology. This article has supplementary material online.
Item Open Access Local Real-Time Forecasting of Ozone Exposure using Temperature Data (2017-05-08) Lu, Lucy
Rigorous and prompt assessment of ambient ozone exposure is important for informing the public about ozone levels that may lead to adverse health effects. In this paper, we make use of hierarchical modeling to forecast 8-hour average ozone exposure. Our contribution is to show how incorporating temperature data in addition to observed ozone can significantly improve forecast accuracy, as measured by predictive mean squared error and empirical coverage. Furthermore, our model meets the objective of forecasting in real-time. These advantages are illustrated through modeling data collected at the Village Green monitoring station in Durham, North Carolina.
Item Open Access Modeling Point Patterns, Measurement Error and Abundance for Exploring Species Distributions (2010) Chakraborty, Avishek
This dissertation focuses on solving some common problems associated with ecological field studies. At the core of the statistical methodology lies spatial modeling that provides greater flexibility and improved predictive performance over existing algorithms.
The applications involve prevalence datasets for hundreds of plants over a large area in the Cape Floristic Region (CFR) of South Africa.
In Chapter 2, we begin with modeling the categorical abundance data with a multi-level spatial model using background information such as environmental and soil-type factors. The empirical pattern is formulated as a degraded version of the potential pattern, with the degradation effect accomplished in two stages. First, we adjust for land use transformation and then we adjust for measurement error, hence misclassification error, to yield the observed abundance classifications. With data on a regular grid over the CFR, the analysis is done with a conditionally autoregressive prior on spatial random effects. With around 37,000 cells to work with, a novel parallelization algorithm is developed for updating the spatial parameters to efficiently estimate potential and transformed abundance surfaces over the entire region.
In Chapter 3, we focus on a different but increasingly common type of prevalence data in the so-called presence-only setting. We detail the limitations associated with a usual presence-absence analysis for this data and advocate modeling the data as a point pattern realization. The underlying intensity surface is modeled with a point-level spatial Gaussian process prior, after taking into account sampling bias and change in land-use pattern. The large size of the region necessitates a computational approximation with a bias-corrected predictive process. We compare our methodology against the most commonly used maximum entropy method, to highlight the improvement in predictive performance.
In Chapter 4, we develop a novel hierarchical model for analyzing noisy point pattern datasets that arise commonly in ecological surveys due to multiple sources of bias, as discussed in previous chapters. The effect of the noise leads to displacements of locations as well as potential loss of points inside a bounded domain. Depending on the assumption about the existence of locations outside the boundary, two different models, island and subregion, are specified. The methodology assumes informative knowledge of the scale of measurement error, either pre-specified or learned from a training sample. Its performance is tested against different scales of measurement error related to the data collection techniques in the CFR.
In Chapter 5, we suggest an alternative model for prevalence data, different from the one in Chapter 3, to avoid numerical approximation and subsequent computational complexities for a large region. A mixture model, similar to the one in Chapter 4, is used, with potential dependence among the weights and locations of components. The covariates as well as a spatial process are used to model the dependence. A novel birth-death algorithm for the number of components in the mixture is under construction.
Lastly, in Chapter 6, we proceed to joint modeling of multiple-species datasets. The challenge is to make inferences about inter-species competition with a large number of populations, possibly running into several hundreds. Our contribution involves applying a hierarchical Dirichlet process to cluster the presence localities and subsequently developing measures of range overlap from posterior draws. This kind of simultaneous inference can potentially have implications for questions related to biodiversity and conservation studies.
Item Open Access Multivariate Spatial Process Gradients with Environmental Applications (2014) Terres, Maria Antonia
Previous papers have elaborated formal gradient analysis for spatial processes, focusing on the distribution theory for directional derivatives associated with a response variable assumed to follow a Gaussian process model. In the current work, these ideas are extended to additionally accommodate one or more continuous covariate(s) whose directional derivatives are of interest and to relate the behavior of the directional derivatives of the response surface to those of the covariate surface(s). It is of interest to assess whether, in some sense, the gradients of the response follow those of the explanatory variable(s), thereby gaining insight into the local relationships between the variables. The joint Gaussian structure of the spatial random effects and associated directional derivatives allows for explicit distribution theory and, hence, kriging across the spatial region using multivariate normal theory. The gradient analysis is illustrated for bivariate and multivariate spatial models, non-Gaussian responses such as presence-absence and point patterns, and outlined for several additional spatial modeling frameworks that commonly arise in the literature. Working within a hierarchical modeling framework, posterior samples enable all gradient analyses to occur as post model fitting procedures.
Item Open Access Non-Parametric Priors for Functional Data and Partition Labelling Models (2017) Hellmayr, Christoph Stefan
Previous papers introduced a variety of extensions of the Dirichlet process to the functional domain, focusing on the challenges presented by extending the stick-breaking process. In this thesis some of these are examined in more detail for similarities and differences in their stick-breaking extensions. Two broad classes of extensions can be defined, differentiating by how the construction of functional mixture weights is handled: one type of process views it as the product of a sequence of marginal mixture weights, whereas the other specifies a joint mixture weight for an entire observation. These are termed “marginal” and “joint” labelling processes respectively, and we show that there are significant differences in their posterior predictive performance. Further investigation of the generalized functional Dirichlet process reveals that a more fundamental difference exists. Whereas marginal labelling models necessarily assign labels only at specific arguments, joint labelling models can allow for the assignment of labels to random subsets of the domain of the function. This leads naturally to the idea of a stochastic process based around a random partitioning of a bounded domain, which we call the partitioned functional Dirichlet process. Here we explicitly model the partitioning of the domain in a constrained manner, rather than implicitly as happens in the generalized functional Dirichlet process. Comparisons are made in terms of posterior predictive behaviour between this model, the generalized functional Dirichlet process and the functional Dirichlet process. We find that the explicit modelling of the partitioning leads to more tractable computation and more structured posterior predictive behaviour than in the generalized functional Dirichlet process, while still offering increased flexibility over the functional Dirichlet process. Finally, we extend the partitioned functional Dirichlet process to the bivariate case.
Item Open Access Space and Space-Time Modeling of Directional Data (2013) Wang, Fangpo
Directional data, i.e., data collected in the form of angles or natural directions, arise in many scientific fields, such as oceanography, climatology, geology, meteorology and biology, to name a few. The non-Euclidean nature of such data poses difficulties in applying ordinary statistical methods developed for inline data, motivating the need for a specialized modeling framework for directional data. Motivated in particular by a marine application of modeling spatial association of wave directions and additionally association between spatial wave directions and spatial wave heights, this dissertation focuses on providing general frameworks for modeling spatial and spatio-temporal directional data, while also studying the theoretical properties of the proposed methods. In particular, the projected normal family of circular distributions is proposed as a default parametric family of distributions for directional data. Operating in a Bayesian framework and exploiting standard data augmentation techniques, the projected normal family is shown to have straightforward extensions to the regression and process setting.
A fully model-based approach is developed to capture structured spatial dependence for modeling directional data at different spatial locations. A stochastic process taking values on the circle, a projected Gaussian spatial process, is introduced. This spatial angular process is induced from an inline bivariate Gaussian process. The properties of the projected Gaussian process are discussed with special emphasis on the "covariance" structure. We show how to fit this process as a model for data, using suitable latent variables with Markov chain Monte Carlo methods. We also show how to implement spatial interpolation and conduct model comparison in this setting. Simulated examples are provided as proof of concept. A real data application arises for modeling the aforementioned wave direction data in the Adriatic Sea, off the coast of Italy. Because these directional data are available dynamically, an extension to a space-time setting is naturally motivated.
As the basis of the projected Gaussian process, the properties of the general projected normal distribution are first clarified. The general projected normal distribution on a circle is defined to be the distribution of a bivariate normal random variable with arbitrary mean and covariance, projected onto the unit circle. The projected normal distribution is an under-utilized model for explaining directional data. In particular, the general version with non-identity covariance provides flexibility, e.g., bimodality, asymmetry, and convenient regression specification.
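The definition of the general projected normal distribution given above can be sketched by simulation: draw a bivariate normal vector and keep only its direction. This is a minimal illustration, not the dissertation's fitting procedure; the function name and the example mean and covariance are assumptions.

```python
import math
import random


def sample_projected_normal(mean, cov, rng):
    """Draw one angle by projecting a bivariate normal draw onto the unit circle."""
    m1, m2 = mean
    (s11, s12), (_, s22) = cov
    # Cholesky factor of the 2x2 covariance, written out by hand.
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 * l21)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x = m1 + l11 * z1
    y = m2 + l21 * z1 + l22 * z2
    # Only the direction of (x, y) is kept; the length is discarded.
    return math.atan2(y, x) % (2.0 * math.pi)


# Assumed example: mean pointing along the positive x-axis, identity covariance.
rng = random.Random(1)
angles = [
    sample_projected_normal((2.0, 0.0), ((1.0, 0.0), (0.0, 1.0)), rng)
    for _ in range(2000)
]
mean_direction = math.atan2(
    sum(math.sin(a) for a in angles), sum(math.cos(a) for a in angles)
)
print(mean_direction)
```

With the mean vector on the positive x-axis, the sampled angles concentrate around zero; a non-identity covariance passed through the same function is what produces the asymmetric or bimodal shapes the abstract mentions.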
For analyzing non-spatial circular data, fully Bayesian hierarchical models using the general projected normal distribution are developed, and fitting them using Markov chain Monte Carlo methods with suitable latent variables is illustrated. Posterior inference for distributional features such as the angular mean direction and concentration can be implemented, and prediction within the regression setting can be handled as well. For analyzing spatial directional data, latent variables are also introduced to facilitate model fitting with MCMC methods. The implementation of spatial interpolation and of model comparison is demonstrated. With regard to model comparison, an out-of-sample approach using both a predictive likelihood scoring loss criterion and a cumulative rank probability score criterion is utilized.
This dissertation later focuses on building model extensions based on the framework of the projected Gaussian process. The wave direction data studied in the previous chapters also include wave height information at the same space and time resolution. Motivated by joint modeling of these important attributes of waves (wave directions and wave heights), a hierarchical framework is developed for jointly modeling spatial directional and ordinary linear observations. We show that Bayesian model fitting under our model specification is straightforward using suitable latent variable augmentation via Markov chain Monte Carlo (MCMC). This joint model framework can easily incorporate space-time covariate information, enabling both spatial interpolation and temporal forecasting.
The spatial projected Gaussian process also provides a natural application in geosciences as aspect processes for elevation maps. Compared to conventional calculations, a full process model for aspects is provided, allowing full inference and arbitrary interpolation. The aspect processes can be inferred directly from a sample from the surface of elevations, providing an estimate of the aspect, with its uncertainty, at any new location over the region.
Item Open Access Spatial Modeling of Measurement Error in Exposure to Air Pollution (2010) Gray, Simone Colette
In environmental health studies, air pollution measurements from the closest monitor are commonly used as a proxy for personal exposure. This technique assumes that air pollution concentrations are spatially homogeneous in the neighborhoods associated with the monitors and consequently introduces measurement error into a model. To model the relationship between maternal exposure to air pollution and birth weight, we build a hierarchical model that accounts for the associated measurement error. We allow four possible scenarios, with increasing flexibility, for capturing this uncertainty. In the two simplest cases, we specify one model with a constant variance term and another with a variance component that allows the uncertainty in the exposure measurements to increase as the distance between maternal residence and the location of the closest monitor increases. In the remaining two models we introduce spatial dependence in these errors using spatial processes in the form of random effects models. We detail the specification for the exposure measure to reflect the sparsity of monitoring sites and discuss the issue of quantifying exposure over the course of a pregnancy. The model is illustrated using Bayesian hierarchical modeling techniques and data from the USEPA and the North Carolina Detailed Birth Records.
Item Open Access Topics in Bayesian Spatiotemporal Prediction of Environmental Exposure (2019) White, Philip Andrew
We address predictive modeling for spatial and spatiotemporal data in a variety of settings. First, we discuss spatial and spatiotemporal data and the corresponding model types used in later chapters. Specifically, we discuss Markov random fields, Gaussian processes, and Bayesian inference. Then, we outline the dissertation.
In Chapter 2, we consider the setting where areal unit data are only partially observed. First, we consider the setting where a portion of the areal units have been observed, and we seek prediction of the remainder. Second, we leverage these ideas for model comparison, where we fit models of interest to a portion of the data and hold out the rest for model comparison.
In Chapters 3 and 4, we consider pollution data from Mexico City in 2017. In Chapter 3 we forecast pollution emergencies. Mexico City defines pollution emergencies using thresholds that rely on regional maxima for ozone and for particulate matter with diameter less than 10 micrometers (PM10). To predict local pollution emergencies and to assess compliance with Mexican ambient air quality standards, we analyze hourly ozone and PM10 measurements from 24 stations across Mexico City from 2017 using a bivariate spatiotemporal model. With this model, we predict future pollutant levels using current weather conditions and recent pollutant concentrations. Employing hourly pollutant projections, we predict regional maxima needed to estimate the probability of future pollution emergencies. We discuss how predicted compliance with legislated pollution limits varies across regions within Mexico City in 2017.
In Chapter 4, we propose a continuous spatiotemporal model for Mexico City ozone levels that accounts for distinct daily seasonality, as well as variation across the city and over the peak ozone season (April and May) of 2017. To account for these patterns, we use covariance models over space, circles, and time. We review relevant existing covariance models and develop new classes of nonseparable covariance models appropriate for seasonal data collected at many locations. We compare the predictive performance of a variety of models that utilize various nonseparable covariance functions. We use the best model to predict hourly ozone levels at unmonitored locations in April and May to infer compliance with Mexican air quality standards and to estimate respiratory health risk associated with ozone exposure.
Item Open Access Using Data Augmentation and Stochastic Differential Equations in Spatio Temporal Modeling (2008-12-12) Puggioni, Gavino
One of the biggest challenges in spatiotemporal modeling is how to manage the large amount of missing information. Data augmentation techniques are frequently used for inference about missing values and unobserved or latent processes, and for approximating continuous-time processes that are discretely observed.
The literature on inference for partially observed stochastic differential equations (SDEs) has been growing in recent years. Many attempts have been made to tackle this problem, from very different perspectives. The goal of this thesis is not a comparison of the different methods. The focus is, instead, on Bayesian inference for SDEs in a spatial context, using a data augmentation approach. While other methods can be less computationally intensive or more accurate in some cases, the main advantage of the Bayesian approach based on model augmentation is its general scope of applicability. In Chapter 2 we propose some methods to model space-time data as noisy realizations of an underlying system of nonlinear SDEs. The parameters of this system are realizations of spatially correlated Gaussian processes. Models that are formulated in this fashion are complex and present several challenges in their estimation. Standard methods degenerate as the level of refinement in the discretization increases. The innovation scheme overcomes such problems. We present an extension of the innovation scheme for the case of high-dimensional parameter spaces. Our algorithm, although presented in spatial SDE examples, can in fact be applied in any general multivariate SDE setting.
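To make the notion of discretization refinement concrete, the sketch below applies an Euler-Maruyama approximation on a grid that is finer than the observation spacing, so that the intermediate points play the role of augmented (latent) values. It is a minimal sketch under stated assumptions: the scalar Ornstein-Uhlenbeck drift, the parameter names `theta`, `mu`, `sigma`, and the refinement level `m` are chosen for illustration, and the code performs forward simulation only. It is not the innovation scheme or the algorithm developed in the thesis.

```python
import math
import random

def euler_maruyama_path(x0, theta, mu, sigma, dt, n_steps, rng):
    """Simulate dX = theta * (mu - X) dt + sigma dW on a grid with step dt."""
    path = [x0]
    x = x0
    for _ in range(n_steps):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)  # Brownian increment over dt
        x = x + theta * (mu - x) * dt + sigma * dw
        path.append(x)
    return path

# Observations are assumed to arrive every `obs_gap` time units; each gap is
# split into m sub-steps, giving m - 1 latent points between observations.
obs_gap, m, n_obs = 1.0, 10, 5
rng = random.Random(42)
fine_path = euler_maruyama_path(0.0, theta=1.0, mu=1.0, sigma=0.3,
                                dt=obs_gap / m, n_steps=m * n_obs, rng=rng)
observed = fine_path[::m]  # values at the observation times only
```

Increasing `m` improves the Euler approximation but also enlarges the set of latent points to be imputed, which is the regime in which the abstract notes that standard methods degenerate.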
In Chapter 3 we discuss additional insights regarding SDE with a spatial interpretation: spatial dependence is enforced through the driving Brownian motion.
In Chapter 4 we discuss some possible refinements of the SDE parameter estimation. Such refinements, which involve second-order SDE approximations, have a more general scope than spatiotemporal modeling and can be applied in a variety of settings.
In the last chapter we propose some methodological ideas for fitting space-time models to data collected in a wireless sensor network when suppression and transmission failure are considered. In this case we also make use of data augmentation techniques, but in conjunction with linear constraints on the missing values.