Browsing by Subject "Nonparametrics"
Item Open Access: Non-Parametric Priors for Functional Data and Partition Labelling Models (2017). Hellmayr, Christoph Stefan.
Previous papers introduced a variety of extensions of the Dirichlet process to the functional domain, focusing on the challenges presented by extending the stick-breaking process. In this thesis some of these are examined in more detail for similarities and differences in their stick-breaking extensions. Two broad classes of extensions can be defined, differentiated by how the construction of functional mixture weights is handled: one type of process views it as the product of a sequence of marginal mixture weights, whereas the other specifies a joint mixture weight for an entire observation. These are termed "marginal" and "joint" labelling processes respectively, and we show that there are significant differences in their posterior predictive performance. Further investigation of the generalized functional Dirichlet process reveals a more fundamental difference: whereas marginal labelling models necessarily assign labels only at specific arguments, joint labelling models can allow for the assignment of labels to random subsets of the domain of the function. This leads naturally to the idea of a stochastic process based on a random partitioning of a bounded domain, which we call the partitioned functional Dirichlet process. Here we explicitly model the partitioning of the domain in a constrained manner, rather than implicitly as happens in the generalized functional Dirichlet process. Comparisons are made in terms of posterior predictive behaviour between this model, the generalized functional Dirichlet process, and the functional Dirichlet process. We find that the explicit modelling of the partitioning leads to more tractable computation and more structured posterior predictive behaviour than in the generalized functional Dirichlet process, while still offering increased flexibility over the functional Dirichlet process. Finally, we extend the partitioned functional Dirichlet process to the bivariate case.
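The "marginal" versus "joint" labelling distinction can be made concrete with a small sketch. The Python fragment below is illustrative only and not taken from the thesis; the truncation level K, the concentration parameter alpha, and the observation grid are assumptions chosen for the example. It draws Dirichlet-process mixture weights by the standard truncated stick-breaking recursion and then assigns cluster labels either independently at each function argument (a marginal-style labelling) or once for the entire observation (a joint-style labelling).

    import numpy as np

    rng = np.random.default_rng(0)

    def stick_breaking_weights(alpha, K):
        """Truncated stick-breaking: w_k = V_k * prod_{j<k} (1 - V_j), V_k ~ Beta(1, alpha)."""
        V = rng.beta(1.0, alpha, size=K)
        V[-1] = 1.0                              # close the truncation so the weights sum to 1
        return V * np.concatenate(([1.0], np.cumprod(1.0 - V[:-1])))

    alpha, K = 2.0, 25
    grid = np.linspace(0.0, 1.0, 50)             # arguments t at which a function is observed
    w = stick_breaking_weights(alpha, K)

    # Marginal labelling: an independent cluster label at every argument t.
    marginal_labels = rng.choice(K, size=grid.size, p=w)

    # Joint labelling: a single cluster label shared by the whole observed function.
    joint_label = rng.choice(K, p=w)

    print(marginal_labels[:10], joint_label)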
Item Open Access: Topics in Bayesian Computer Model Emulation and Calibration, with Applications to High-Energy Particle Collisions (2019). Coleman, Jacob Ryan.
Problems involving computer model emulation arise when scientists simulate costly experiments with computationally expensive computer models. To more quickly probe the experimental design space, statisticians build emulators that act as fast surrogates to the computationally expensive computer models. The emulators are typically Gaussian processes, in order to induce spatial correlation in the input space. Often the main scientific interest lies in inference on one or more input parameters of the computer model which do not vary in nature. Inference on these input parameters is referred to as "calibration," and these inputs are referred to as "calibration parameters." We first detail our emulation and calibration model for an application in high-energy particle physics; this model brings together some existing ideas in the literature on handling multivariate output, and lays out a foundation for the remainder of the thesis.
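As a rough illustration of the emulation-and-calibration workflow described above (not the thesis's implementation), the sketch below fits a Gaussian process emulator to runs of a toy simulator and then computes a grid posterior for a single calibration parameter. The toy simulator, the RBF kernel settings, the observation noise sigma, and the flat prior on the calibration parameter are all assumptions made for the example.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(1)

    def simulator(x, theta):
        # stand-in for an expensive computer model
        return np.sin(3 * x) + theta * x

    # Design: simulator runs at scattered (x, theta) inputs
    X_design = rng.uniform(0, 1, size=(40, 2))          # columns: x, theta
    y_design = simulator(X_design[:, 0], X_design[:, 1])

    emulator = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-6)
    emulator.fit(X_design, y_design)

    # One "field" observation at x_obs, generated with true theta = 0.7 plus noise
    x_obs, sigma = 0.5, 0.05
    y_obs = simulator(x_obs, 0.7) + rng.normal(0, sigma)

    # Grid posterior over the calibration parameter under a flat prior,
    # accounting for both emulator uncertainty (sd) and observation error (sigma)
    theta_grid = np.linspace(0, 1, 200)
    X_pred = np.column_stack([np.full_like(theta_grid, x_obs), theta_grid])
    mu, sd = emulator.predict(X_pred, return_std=True)
    var = sd ** 2 + sigma ** 2
    log_lik = -0.5 * np.log(var) - 0.5 * (y_obs - mu) ** 2 / var
    post = np.exp(log_lik - log_lik.max())
    post /= post.sum()
    print("posterior mean of theta:", float(theta_grid @ post))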
In the next two chapters, we introduce novel ideas in the field of computer model emulation and calibration. The first addresses the problem of model comparison in this context, and how to simultaneously compare competing computer models while performing calibration. Using a mixture model to facilitate the comparison, we demonstrate that by conditioning on the mixture parameter we can recover the calibration parameter posterior from an independent calibration model. This mixture is then extended to the case of correlated data, a crucial innovation for this comparison framework to be useful in the particle collision setting. Lastly, we explore two possible non-exchangeable mixture models, where model preference changes over the input space.
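A minimal sketch of the mixture-based comparison idea, under assumed toy models model_A and model_B and flat priors on both the calibration parameter and the mixture weight; it is not the correlated-data machinery developed in the thesis, but it shows how conditioning on the mixture weight (here, w = 1) recovers the calibration posterior of a single model.

    import numpy as np
    from scipy.stats import norm

    # Two competing "computer models" of the same quantity (stand-ins for the example)
    def model_A(theta): return 2.0 * theta
    def model_B(theta): return theta ** 2 + 0.5

    y_obs, sigma = 1.4, 0.1                      # field datum with known error
    theta_grid = np.linspace(0, 2, 200)
    w_grid = np.linspace(0, 1, 101)              # mixture weight on model A

    # Joint grid posterior over (theta, w) under flat priors:
    # p(y | theta, w) = w * N(y; A(theta), sigma) + (1 - w) * N(y; B(theta), sigma)
    lik_A = norm.pdf(y_obs, model_A(theta_grid), sigma)
    lik_B = norm.pdf(y_obs, model_B(theta_grid), sigma)
    joint = w_grid[None, :] * lik_A[:, None] + (1 - w_grid[None, :]) * lik_B[:, None]
    joint /= joint.sum()

    # Conditioning on w = 1 recovers the calibration posterior under model A alone
    post_theta_given_A = lik_A / lik_A.sum()
    print("MAP theta under model A:", theta_grid[np.argmax(post_theta_given_A)])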
The second novel idea addresses density estimation when only coarse bin counts are available. We develop an estimation method which avoids costly numerical integration and maintains plausible correlation for nearby bins. Additionally, we extend the method to density regression so that a full density can be predicted from an input parameter, having only been trained on coarse histograms. This enables inference on the input parameter, and we develop an importance sampling method that compares favorably to the foundational calibration method detailed earlier.
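One standard way to avoid numerical integration with binned data is to write each bin probability as a difference of cumulative distribution function values and use a multinomial likelihood on the counts. The sketch below fits a normal density to coarse histogram counts this way; the parametric normal family, the bin edges, and the optimizer are assumptions for illustration and not the estimator developed in the thesis.

    import numpy as np
    from scipy.stats import norm
    from scipy.optimize import minimize

    rng = np.random.default_rng(3)

    # Coarse histogram of data we never observe individually
    data = rng.normal(1.0, 2.0, size=5000)
    edges = np.linspace(-6, 8, 15)
    counts, _ = np.histogram(data, bins=edges)

    def neg_log_lik(params):
        mu, log_sd = params
        # Bin probabilities as CDF differences: no numerical integration needed
        p = np.diff(norm.cdf(edges, loc=mu, scale=np.exp(log_sd)))
        p = np.clip(p, 1e-12, None)
        return -np.sum(counts * np.log(p))

    fit = minimize(neg_log_lik, x0=np.array([0.0, 0.0]))
    print("estimated mu, sd:", fit.x[0], np.exp(fit.x[1]))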
Item Open Access: Tree-based Methods for Learning Probability Distributions (2022). Awaya, Naoki.
Learning probability distributions is a fundamental inferential task in statistics, but it is challenging when the data distribution of interest is complicated and high-dimensional. Addressing this problem is the main topic of this thesis, which discusses two types of new tree-based methods: a single-tree method and an ensemble method. The new single-tree method, the main topic of Chapter 2, is introduced by constructing a generalized Polya tree process, that is, a new Bayesian nonparametric model equipped with a new flexible tree prior. With this new prior we can find trees that represent the distributional structure well, and the tree space is efficiently explored with a new sequential Monte Carlo algorithm. The new ensemble method discussed in Chapter 3 is proposed under a new addition rule defined for probability distributions. The new rule, based on cumulative distribution functions and their generalizations, enables us to smoothly introduce a new efficient boosting algorithm, inheriting important notions such as "residuals" and "zeros". The thesis closes with Chapter 4, which provides concluding remarks.
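For readers unfamiliar with Polya trees, the sketch below draws a random probability measure from a standard finite Polya tree on [0, 1) using dyadic splits and Beta-distributed splitting probabilities. The depth, the Beta parameters c*j^2, and the fixed dyadic partition are textbook defaults and are assumptions here, not the generalized tree prior or the boosting rule proposed in the thesis.

    import numpy as np

    rng = np.random.default_rng(4)

    def finite_polya_tree(levels=6, c=1.0):
        """Draw bin probabilities from a finite Polya tree on [0, 1) with dyadic splits.

        At depth j each interval splits in half, and the mass routed to the left
        child follows a Beta(c*j^2, c*j^2) draw (a common canonical choice).
        """
        probs = np.array([1.0])
        for j in range(1, levels + 1):
            left = rng.beta(c * j ** 2, c * j ** 2, size=probs.size)
            probs = np.column_stack([probs * left, probs * (1 - left)]).ravel()
        return probs                      # one probability per dyadic bin at the finest level

    p = finite_polya_tree()
    edges = np.linspace(0, 1, p.size + 1)
    print("number of bins:", p.size, "total mass:", p.sum())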