# Browsing by Subject "Bayesian inference"


Item Open Access A Bayesian Model for Nucleosome Positioning Using DNase-seq Data (2015) Zhong, Jianling

As fundamental structural units of the chromatin, nucleosomes are involved in virtually all aspects of genome function. Different methods have been developed to map genome-wide nucleosome positions, including MNase-seq and a recent chemical method requiring genetically engineered cells. However, these methods are either low resolution and prone to enzymatic sequence bias or require genetically modified cells. The DNase I enzyme has been used to probe nucleosome structure since the 1960s, but in the current high throughput sequencing era, DNase-seq has mainly been used to study regulatory sequences known as DNase hypersensitive sites. This thesis shows that DNase-seq data is also very informative about nucleosome positioning. The distinctive oscillatory DNase I cutting patterns on nucleosomal DNA are shown and discussed. Based on these patterns, a Bayes factor is proposed to be used for distinguishing nucleosomal and non-nucleosomal genome positions. The results show that this approach is highly sensitive and specific. A Bayesian method that simulates the data generation process and can provide more interpretable results is further developed based on the Bayes factor investigations. Preliminary results on a test genomic region show that the Bayesian model works well in identifying nucleosome positioning. Estimated posterior distributions also agree with some known biological observations from external data. Taken together, methods developed in this thesis show that DNase-seq can be used to identify nucleosome positioning, adding great value to this widely utilized protocol.

Item Open Access A Tapered Pareto-Poisson Model for Extreme Pyroclastic Flows: Application to the Quantification of Volcano Hazards (2015) Dai, Fan

This paper discusses parameter estimation in a proposed tapered Pareto-Poisson model for the assessment of large pyroclastic flows, which are essential in quantifying the size and risk of volcanic hazards. In dealing with the tapered Pareto distribution, the paper applies both maximum likelihood estimation and a Bayesian framework with objective priors and the Metropolis algorithm. The techniques are further illustrated by an example of modeling extreme flow volumes at Soufriere Hills Volcano, and the simulation results are discussed.
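The Bayesian side of the tapered Pareto entry above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: it assumes the common tapered Pareto form with survival function $(a/x)^{\beta} e^{(a-x)/\theta}$ for $x \ge a$, and it swaps the paper's objective priors (not specified in the abstract) for illustrative normal priors on the log-parameters. All function names and tuning values are hypothetical.

```python
import numpy as np

def tapered_pareto_logpdf(x, beta, theta, a):
    """Log density f(x) = (beta/x + 1/theta) * (a/x)**beta * exp((a - x)/theta), x >= a."""
    x = np.asarray(x, dtype=float)
    return np.log(beta / x + 1.0 / theta) + beta * np.log(a / x) + (a - x) / theta

def sample_tapered_pareto(n, beta, theta, a, rng):
    # The minimum of a Pareto(a, beta) draw and a shifted exponential draw has the
    # tapered Pareto law, because the two survival functions multiply.
    pareto = a * (1.0 - rng.random(n)) ** (-1.0 / beta)
    expo = a + rng.exponential(theta, size=n)
    return np.minimum(pareto, expo)

def log_post(params, x, a):
    log_beta, log_theta = params
    # Assumption: illustrative N(0, 3^2) priors on log(beta) and log(theta),
    # used here only to keep the posterior proper.
    log_prior = -0.5 * (log_beta**2 + log_theta**2) / 9.0
    log_lik = tapered_pareto_logpdf(x, np.exp(log_beta), np.exp(log_theta), a).sum()
    return log_prior + log_lik

def metropolis(x, a, n_iter=2000, step=0.15, seed=0):
    """Random-walk Metropolis on (log beta, log theta); returns draws of (beta, theta)."""
    rng = np.random.default_rng(seed)
    cur = np.zeros(2)
    cur_lp = log_post(cur, x, a)
    draws = np.empty((n_iter, 2))
    for i in range(n_iter):
        prop = cur + step * rng.standard_normal(2)
        prop_lp = log_post(prop, x, a)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        if np.log(rng.random()) < prop_lp - cur_lp:
            cur, cur_lp = prop, prop_lp
        draws[i] = np.exp(cur)
    return draws

# Synthetic data with hypothetical parameter values, for illustration only.
data = sample_tapered_pareto(300, beta=1.0, theta=50.0, a=1.0, rng=np.random.default_rng(1))
posterior_draws = metropolis(data, a=1.0)
print(posterior_draws[1000:].mean(axis=0))
```

In practice the step size and burn-in would need tuning, and the threshold `a` is treated as known here.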

Item Open Access Bayesian Methods for Two-Sample Comparison (2015) Soriano, Jacopo

Two-sample comparison is a fundamental problem in statistics. Given two samples of data, the interest lies in understanding whether the two samples were generated by the same distribution or not. Traditional two-sample comparison methods are not suitable for modern data where the underlying distributions are multivariate and highly multi-modal, and the differences across the distributions are often locally concentrated. The focus of this thesis is to develop novel statistical methodology for two-sample comparison which is effective in such scenarios. Tools from the nonparametric Bayesian literature are used to flexibly describe the distributions. Additionally, the two-sample comparison problem is decomposed into a collection of local tests on individual parameters describing the distributions. This strategy not only yields high statistical power, but also allows one to identify the nature of the distributional difference. In many real-world applications, detecting the nature of the difference is as important as the existence of the difference itself. Generalizations to multi-sample comparison and more complex statistical problems, such as multi-way analysis of variance, are also discussed.

Item Open Access Bayesian Modeling for Identifying Selection in B cell Maturation (2023) Tang, Tengjie

This thesis focuses on modeling the selection effects on B cell antibody mutations to identify amino acids under strong selection. Site-wise selection coefficients are parameterized by the fitnesses of amino acids. First, we conduct simulation studies to evaluate the accuracy of the Monte Carlo p-value approach for identifying selection for specific amino acid/location combinations. Then, we adopt Bayesian methods to infer location-specific fitness parameters for each amino acid. In particular, we propose the use of a spike-and-slab prior and implement Markov chain Monte Carlo (MCMC) algorithms for posterior sampling. Further simulation studies are conducted to evaluate the performance of the proposed Bayesian methods in inferring fitness parameters and identifying strong selection. The results demonstrate the reliable inference and detection performance of the proposed Bayesian methods. Finally, an example using real antibody sequences is provided. This work can help identify important early mutations in B cell antibodies, which is crucial for developing an effective HIV vaccine.

Item Open Access Bayesian quantile regression joint models: Inference and dynamic predictions. (Statistical methods in medical research, 2018-01) Yang, Ming; Luo, Sheng; DeSantis, Stacia

In the traditional joint models of a longitudinal and time-to-event outcome, a linear mixed model assuming normal random errors is used to model the longitudinal process. However, in many circumstances, the normality assumption is violated and the linear mixed model is not an appropriate sub-model in the joint models. In addition, as the linear mixed model models the conditional mean of the longitudinal outcome, it is not appropriate if clinical interest lies in making inference or prediction on median, lower, or upper ends of the longitudinal process. To this end, quantile regression provides a flexible, distribution-free way to study covariate effects at different quantiles of the longitudinal outcome and it is robust not only to deviation from normality, but also to outlying observations. In this article, we present and advocate the linear quantile mixed model for the longitudinal process in the joint models framework. Our development is motivated by a large prospective study of Huntington's disease where primary clinical interest is in utilizing longitudinal motor scores and other early covariates to predict the risk of developing Huntington's disease. We develop a Bayesian method based on the location-scale representation of the asymmetric Laplace distribution, assess its performance through an extensive simulation study, and demonstrate how this linear quantile mixed model-based joint models approach can be used for making subject-specific dynamic predictions of survival probability.

Item Open Access Comparison of Bayesian Inference Methods for Probit Network Models (2021) Shen, YueMing

This thesis explores and compares Bayesian inference procedures for probit network models. Network data typically exhibit high dyadic correlation due to reciprocity. For binary network data, the presence of dyadic correlation often makes a basic implementation of Markov chain Monte Carlo (MCMC) inefficient. We first explore variational inference as a fast approximation to the posterior distribution. Aware of its insufficiency in quantifying posterior uncertainties, we propose an alternative MCMC algorithm which is more efficient and accurate. In particular, we propose to update the dyadic correlation parameter $\rho$ using the marginal likelihood, unconditional on the latent relations $Z$. This reduces autocorrelations in the posterior samples of $\rho$ and hence improves mixing. A simulation study and real data examples are provided to compare the performance of these Bayesian inference methods.

Item Open Access Constrained Bayesian Inference through Posterior Projection with Applications (2019) Patra, Sayan

In a broad variety of settings, prior information takes the form of parameter restrictions. Bayesian approaches are appealing in parameter constrained problems in allowing a probabilistic characterization of uncertainty in finite samples, while providing a computational machinery for the incorporation of complex constraints in hierarchical models. However, the usual Bayesian strategy of directly placing a prior measure on the constrained space, and then conducting posterior computation with Markov chain Monte Carlo algorithms is often intractable. An alternative is to initially conduct computation for an unconstrained or less constrained posterior, and then project draws from this initial posterior to the constrained space through a minimal distance mapping. This approach has been successful in monotone function estimation but has not been considered in broader settings.

In this dissertation, we develop a unified theory to justify posterior projection in general Banach spaces including for infinite-dimensional functional parameter space. For tractability, in chapter 2 we focus on the case in which the constrained parameter space corresponds to a closed, convex subset of the original space. A special class of non-convex sets called Stiefel manifolds is explored later in chapter 3. Specifically, we provide a general formulation of the projected posterior and show that it corresponds to a valid posterior distribution on the constrained space for particular classes of priors and likelihood functions. We also show how the asymptotic properties of the unconstrained posterior are transferred to the projected posterior. We then illustrate our proposed methodology via multiple examples, both in simulation studies and real data applications.
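As a rough illustration of projecting unconstrained posterior draws onto a closed, convex set, the sketch below uses the monotone cone, the setting in which the abstract says the idea has already been successful. The unconstrained posterior here is an assumed independent normal with made-up means and standard error, and the projection is the standard pool-adjacent-violators algorithm; none of this is taken from the dissertation itself.

```python
import numpy as np

def project_monotone(y):
    """Euclidean projection of y onto {v : v[0] <= v[1] <= ...} (pool adjacent violators)."""
    blocks = []  # each block stores [sum of values, number of values]
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block mean exceeds the last block mean.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    return np.concatenate([np.full(c, s / c) for s, c in blocks])

# Assumption: unconstrained posterior is N(ybar_j, se^2) independently for each
# component (flat prior, known variance); the numbers are illustrative.
rng = np.random.default_rng(1)
ybar = np.array([0.1, 0.5, 0.4, 0.9])
se = 0.2
unconstrained = ybar + se * rng.standard_normal((2000, ybar.size))

# Project every draw; the projected draws form the constrained posterior sample.
projected = np.apply_along_axis(project_monotone, 1, unconstrained)
print(projected.mean(axis=0))
```

Each projected draw is nondecreasing by construction, so posterior summaries computed from `projected` respect the constraint.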

In chapter 4, we extend our proposed methodology of posterior projection to that of small area estimation (SAE), which focuses on estimating population parameters when there is little to no area-specific information. "Areas" are often spatial regions, but they might also be different demographic groups or experimental conditions. To improve the precision of estimates, a common strategy in SAE methods is to borrow information across several areas. This is generally achieved by using a hierarchical or empirical Bayesian model. However, parameter constraints arising naturally from surveys pose a challenge to the estimation procedure. Examples of such constraints include the variance of the estimate of an area being proportional to the geographic size of the area or the sum of the county level estimates being equal to the state level estimates. We utilize and extend the posterior projection approach to facilitate such computing and reduce parameter uncertainty.

This dissertation develops fundamental new approaches for constrained Bayesian inference, and there are many possible directions for future endeavors. One such important generalization is considered in chapter 5 to allow for conditional posterior projections; for example, applying projection to a subset of parameters immediately after each update step within a Markov chain Monte Carlo algorithm. We identify several scenarios where such a modified algorithm converges to the underlying true distribution and develop a general theory to ensure consistency. We conclude the dissertation in chapter 6 by outlining directions for continued research on these topics.

Item Open Access Data augmentation for models based on rejection sampling. (Biometrika, 2016-06) Rao, Vinayak; Lin, Lizhen; Dunson, David B

We present a data augmentation scheme to perform Markov chain Monte Carlo inference for models where data generation involves a rejection sampling algorithm. Our idea is a simple scheme to instantiate the rejected proposals preceding each data point. The resulting joint probability over observed and rejected variables can be much simpler than the marginal distribution over the observed variables, which often involves intractable integrals. We consider three problems: modelling flow-cytometry measurements subject to truncation; the Bayesian analysis of the matrix Langevin distribution on the Stiefel manifold; and Bayesian inference for a nonparametric Gaussian process density model. The latter two are instances of doubly-intractable Markov chain Monte Carlo problems, where evaluating the likelihood is intractable. Our experiments demonstrate superior performance over state-of-the-art sampling algorithms for such problems.

Item Open Access Developing a Predictive and Quantitative Understanding of RNA Ligand Recognition (2021) Orlovsky, Nicole

RNA recognition frequently results in conformational changes that optimize intermolecular binding. As a consequence, the overall binding affinity of RNA to its binding partners depends not only on the intermolecular interactions formed in the bound state, but also on the energy cost associated with changing the RNA conformational distribution. Measuring these conformational penalties is however challenging because bound RNA conformations tend to have equilibrium populations in the absence of the binding partner that fall outside detection by conventional biophysical methods.

In this work we employ as a model system HIV-1 TAR RNA and its interaction with the ligand argininamide (ARG), a mimic of TAR’s cognate protein binding partner, the transactivator Tat. We use NMR chemical shift perturbations (CSP) and NMR relaxation dispersion (RD) in combination with Bayesian inference to develop a detailed thermodynamic model of coupled conformational change and ligand binding. Starting from a comprehensive 12-state model of the equilibrium, we estimate the energies of six distinct detectable thermodynamic states that are not accessible by currently available methods.

Our approach identifies a minimum of four RNA intermediates that differ in terms of the TAR conformation and ARG-occupancy. The dominant bound TAR conformation features two bound ARG ligands and has an equilibrium population in the absence of ARG that is below detection limit. Consequently, even though ARG binds to TAR with an apparent overall weak affinity ($K_{\mathrm{d}}^{\mathrm{app}} \approx 0.2\,\mathrm{mM}$), it binds the prefolded conformation with a $K_{\mathrm{d}}$ in the nM range. Our results show that conformational penalties can be major determinants of RNA-ligand binding affinity as well as a source of binding cooperativity, with important implications for a predictive understanding of how RNA is recognized and for RNA-targeted drug discovery.

Additionally, we describe in detail the development of our approach for fitting complex ligand binding data to mathematical models using Bayesian inference. We provide crucial benchmarks and demonstrate the robustness of our fitting approach with the goal of application to other systems. This thesis aims to provide new insight into the dynamics of RNA-ligand recognition as well as provide new methods that can be applied to achieve this goal.

Item Open Access Information-Based Sensor Management for Static Target Detection Using Real and Simulated Data (2009) Kolba, Mark Philip

In the modern sensing environment, large numbers of sensor tasking decisions must be made using an increasingly diverse and powerful suite of sensors in order to best fulfill mission objectives in the presence of situationally-varying resource constraints. Sensor management algorithms allow the automation of some or all of the sensor tasking process, meaning that sensor management approaches can either assist or replace a human operator as well as ensure the safety of the operator by removing that operator from a dangerous operational environment. Sensor managers also provide improved system performance over unmanaged sensing approaches through the intelligent control of the available sensors. In particular, information-theoretic sensor management approaches have shown promise for providing robust and effective sensor manager performance.

This work develops information-theoretic sensor managers for a general static target detection problem. Two types of sensor managers are developed. The first considers a set of discrete objects, such as anomalies identified by an anomaly detector or grid cells in a gridded region of interest. The second considers a continuous spatial region in which targets may be located at any point in continuous space. In both types of sensor managers, the sensor manager uses a Bayesian, probabilistic framework to model the environment and tasks the sensor suite to make new observations that maximize the expected information gain for the system. The sensor managers are compared to unmanaged sensing approaches using simulated data and using real data from landmine detection and unexploded ordnance (UXO) discrimination applications, and it is demonstrated that the sensor managers consistently outperform the unmanaged approaches, enabling targets to be detected more quickly using the sensor managers. The performance improvement represented by the rapid detection of targets is of crucial importance in many static target detection applications, resulting in higher rates of advance and reduced costs and resource consumption in both military and civilian applications.
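The discrete-cell case described above, in which the manager picks the observation that maximizes expected information gain, might look roughly like the sketch below. It assumes a single binary sensor with a detection probability `pd` and a false-alarm probability `pfa`, and it measures information as the expected reduction in the entropy of each cell's target probability. The sensor model, the entropy-based gain, and all names are illustrative assumptions rather than the dissertation's formulation.

```python
import numpy as np

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable, clipped for numerical safety."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def posterior(p, z, pd, pfa):
    """Bayes update of the target probability p after a binary observation z."""
    like_target = pd if z == 1 else 1 - pd
    like_clutter = pfa if z == 1 else 1 - pfa
    return like_target * p / (like_target * p + like_clutter * (1 - p))

def expected_information_gain(p, pd, pfa):
    """Prior entropy minus expected posterior entropy (mutual information)."""
    p_alarm = pd * p + pfa * (1 - p)
    h_after = (p_alarm * binary_entropy(posterior(p, 1, pd, pfa))
               + (1 - p_alarm) * binary_entropy(posterior(p, 0, pd, pfa)))
    return binary_entropy(p) - h_after

# Hypothetical current beliefs for five grid cells.
beliefs = np.array([0.05, 0.30, 0.50, 0.80, 0.95])
gains = expected_information_gain(beliefs, pd=0.9, pfa=0.1)
next_cell = int(np.argmax(gains))  # cell the manager would observe next
print(gains.round(3), next_cell)
```

With a symmetric sensor such as this one, the most uncertain cell is the most informative to observe; an unmanaged scheme would instead sweep cells in a fixed order.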

Item Open Access Model Reduction and Domain Decomposition Methods for Uncertainty Quantification (2017) Contreras, Andres Anibal

This dissertation focuses on acceleration techniques for Uncertainty Quantification (UQ). The manuscript is divided into five chapters. Chapter 1 provides an introduction and a brief summary of Chapters 2, 3, and 4. Chapter 2 introduces a model reduction strategy that is used in the context of elasticity imaging to infer the presence of an inclusion embedded in a soft matrix, mimicking tumors in soft tissues. The method relies on Polynomial Chaos (PC) expansions to build a dictionary of surrogate models, where each surrogate is constructed using a different geometrical configuration of the potential inclusion. A model selection approach is used to discriminate among the different models and eventually select the most appropriate one to estimate the likelihood that an inclusion is present in the domain. In Chapter 3, we use a Domain Decomposition (DD) approach to compute the Karhunen-Loeve (KL) modes of a random process through the use of local KL expansions at the subdomain level. Furthermore, we analyze the relationship between the local random variables associated with the local KL expansions and the global random variables associated with the global KL expansions. In Chapter 4, we take advantage of these local random variables and use DD techniques to reduce the computational cost of solving a Stochastic Elliptic Equation (SEE) via a Monte Carlo sampling method. The approach takes advantage of a lower stochastic dimension at the subdomain level to construct a PC expansion of a reduced linear system that is later used to compute samples of the solution. Thus, the approach consists of two main stages: 1) a preprocessing stage in which PC expansions of a condensed problem are computed and 2) a Monte Carlo sampling stage where samples of the solution are computed in order to solve the SEE. Finally, in Chapter 5 some brief concluding remarks are provided.

Item Open Access Nonparametric Bayesian Context Learning for Buried Threat Detection (2012) Ratto, Christopher Ralph

This dissertation addresses the problem of detecting buried explosive threats (i.e., landmines and improvised explosive devices) with ground-penetrating radar (GPR) and hyperspectral imaging (HSI) across widely-varying environmental conditions. Automated detection of buried objects with GPR and HSI is particularly difficult due to the sensitivity of sensor phenomenology to variations in local environmental conditions. Past approaches have attempted to mitigate the effects of ambient factors by designing statistical detection and classification algorithms to be invariant to such conditions. These methods have generally taken the approach of extracting features that exploit the physics of a particular sensor to provide a low-dimensional representation of the raw data for characterizing targets from non-targets. A statistical classification rule is then usually applied to the features. However, it may be difficult for feature extraction techniques to adapt to the highly nonlinear effects of near-surface environmental conditions on sensor phenomenology, as well as to re-train the classifier for use under new conditions. Furthermore, the search for an invariant set of features ignores the possibility that one approach may yield best performance under one set of terrain conditions (e.g., dry), and another might be better for another set of conditions (e.g., wet).

An alternative approach to improving detection performance is to consider exploiting differences in sensor behavior across environments rather than mitigating them, and treat changes in the background data as a possible source of supplemental information for the task of classifying targets and non-targets. This approach is referred to as context-dependent learning.

Although past researchers have proposed context-based approaches to detection and decision fusion, the definition of context used in this work differs from those used in the past. In this work, context is motivated by the physical state of the world from which an observation is made, and not from properties of the observation itself. The proposed context-dependent learning technique therefore utilized additional features that characterize soil properties from the sensor background, and a variety of nonparametric models were proposed for clustering these features into individual contexts. The number of contexts was assumed to be unknown a priori, and was learned via Bayesian inference using Dirichlet process priors.
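To give a feel for how a Dirichlet process prior leaves the number of contexts open, the sketch below draws cluster assignments from the Chinese restaurant process, the partition distribution induced by a Dirichlet process. It covers the prior only; the dissertation's actual models and inference over soil features are not reproduced here, and the concentration value `alpha` is an arbitrary assumption.

```python
import numpy as np

def sample_crp(n, alpha, rng):
    """Draw n cluster labels from a Chinese restaurant process with concentration alpha."""
    assignments = [0]   # the first observation opens the first cluster
    counts = [1]
    for _ in range(1, n):
        # Join an existing cluster with probability proportional to its size,
        # or open a new one with probability proportional to alpha.
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = int(rng.choice(len(probs), p=probs))
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        assignments.append(k)
    return np.array(assignments), counts

rng = np.random.default_rng(0)
labels, sizes = sample_crp(200, alpha=1.0, rng=rng)
print(len(sizes), sizes)
```

Larger values of `alpha` favor more clusters a priori; in a full model the data likelihood would then reweight these assignments during posterior inference.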

The learned contextual information was then exploited by an ensemble of classifiers trained for classifying targets in each of the learned contexts. For GPR applications, the classifiers were trained for performing algorithm fusion. For HSI applications, the classifiers were trained for performing band selection. The detection performance of all proposed methods was evaluated on data from U.S. government test sites. Performance was compared to several algorithms from the recent literature, several of which have been deployed in fielded systems. Experimental results illustrate the potential for context-dependent learning to improve detection performance of GPR and HSI across varying environments.

Item Open Access Novel Sensing and Inference Techniques in Air and Water Environments (2015) Zhou, Xiaochi

Environmental sensing is experiencing tremendous development, due largely to advances in sensor technology and in the wireless and internet technologies that connect sensors and enable data exchange. Environmental monitoring sensor systems range from satellites that continuously monitor the Earth's surface to miniature wearable devices that track the local environment and people's activities. However, transforming these data into knowledge of the underlying physical and/or chemical processes remains a big challenge given the spatial and temporal scales and the heterogeneity of the relevant natural phenomena. This research focuses on the development and application of novel sensing and inference techniques in air and water environments. The overall goal is to infer the state and dynamics of some key environmental variables by building various models: either a sensor system or numerical simulations that capture the physical processes.

This dissertation is divided into five chapters. Chapter 1 introduces the background and motivation of this research. Chapter 2 focuses on the evaluation of different models (physically-based versus empirical) and remote sensing data (multispectral versus hyperspectral) for suspended sediment concentration (SSC) retrieval in shallow water environments. The study site is the Venice lagoon (Italy), where we compare the estimated SSC from various models and datasets against in situ probe measurements. The results showed that the physically-based model provides more robust estimates of SSC than the empirical models when evaluated using leave-one-out cross-validation. Despite the finer spectral resolution and the choice of optimal combinations of bands, the hyperspectral data is less reliable for SSC retrieval than the multispectral data due to its limited historical record, information redundancy, and cross-band correlation.

Chapter 3 introduces a multipollutant sensor/sampler system that was developed for mobile applications including aerostats and unmanned aerial vehicles (UAVs). The system is particularly applicable to open area sources such as forest fires, due to its light weight (3.5 kg), compact size (6.75 L), and internal power supply. The sensor system, termed “Kolibri”, consists of low-cost sensors measuring CO2 and CO, and samplers for particulate matter and volatile organic compounds (VOCs). The Kolibri is controlled by a microcontroller, which can record and transfer data in real time using a radio module. Selection of the sensors was based on laboratory testing for accuracy, response delay and recovery, cross-sensitivity, and precision. The Kolibri was compared against rack-mounted continuous emission monitors (CEMs) and another mobile sampling instrument (the “Flyer”) that had been used in over ten open area pollutant sampling events. Our results showed that the time series of CO, CO2, and PM2.5 concentrations measured by the Kolibri agreed well with those from the CEMs and the Flyer. The VOC emission factors obtained using the Kolibri are comparable to existing literature values. The Kolibri system can be applied to various challenging open area sampling situations such as fires, lagoons, flares, and landfills.

Chapter 4 evaluates the trade-off between sensor quality and quantity for fenceline monitoring of fugitive emissions. This research is motivated by the new air quality standard that requires continuous monitoring of hazardous air pollutants (HAPs) along the fenceline of oil and gas refineries. Recently, the emergence of low-cost sensors has enabled the implementation of spatially-dense sensor networks that can potentially compensate for the low quality of individual sensors. To quantify sensor inaccuracy and the uncertainty in describing a gas concentration that is governed by turbulent air flow, a Bayesian approach is applied to probabilistically infer the leak source and strength. Our results show that a dense sensor network can partly compensate for the low sensitivity or high noise of individual sensors. However, the fenceline monitoring approach fails to detect leaks accurately when sensor or wind bias exists, even with a dense sensor network.

Chapter 5 explores the feasibility of applying a mobile sensing approach to estimate fugitive methane emissions in suburban and rural environments. We first compare the mobile approach against a stationary method (OTM33A) proposed by the US EPA using a series of controlled release tests. Analysis shows that the mobile sensing approach can reduce estimation bias and uncertainty compared with the OTM33A method. Then, we apply this mobile sensing approach to quantify fugitive emissions from several ammonia fertilizer plants in rural areas. Significant methane emission was identified from one plant while the other two show relatively low emissions. Sensitivity analysis of several model parameters shows that the error term in the Bayesian inference is vital for the determination of model uncertainty while others are less influential. Overall, this mobile sensing approach shows promising results for future applications of quantifying fugitive methane emission in suburban and rural environments.

Item Open Access Phylogeny and evolution of ferns (monilophytes) with a focus on the early leptosporangiate divergences. (American journal of botany, 2004-10) Pryer, Kathleen M; Schuettpelz, Eric; Wolf, Paul G; Schneider, Harald; Smith, Alan R; Cranfill, Raymond

The phylogenetic structure of ferns (= monilophytes) is explored here, with a special focus on the early divergences among leptosporangiate lineages. Despite considerable progress in our understanding of fern relationships, a rigorous and comprehensive analysis of the early leptosporangiate divergences was lacking. Therefore, a data set was designed here to include critical taxa that were not included in earlier studies. More than 5000 bp from the plastid (rbcL, atpB, rps4) and the nuclear (18S rDNA) genomes were sequenced for 62 taxa. Phylogenetic analyses of these data (1) confirm that Osmundaceae are sister to the rest of the leptosporangiates, (2) resolve a diverse set of ferns formerly thought to be a subsequent grade as possibly monophyletic (((Dipteridaceae, Matoniaceae), Gleicheniaceae), Hymenophyllaceae), and (3) place schizaeoid ferns as sister to a large clade of "core leptosporangiates" that includes heterosporous ferns, tree ferns, and polypods. Divergence time estimates for ferns are reported from penalized likelihood analyses of our molecular data, with constraints from a reassessment of the fossil record.

Item Open Access Random Orthogonal Matrices with Applications in Statistics (2019) Jauch, Michael

This dissertation focuses on random orthogonal matrices with applications in statistics. While Bayesian inference for statistical models with orthogonal matrix parameters is a recurring theme, several of the results on random orthogonal matrices may be of interest to those in the broader probability and random matrix theory communities. In Chapter 2, we parametrize the Stiefel and Grassmann manifolds, represented as subsets of orthogonal matrices, in terms of Euclidean parameters using the Cayley transform and then derive Jacobian terms for change of variables formulas. This allows for Markov chain Monte Carlo simulation from probability distributions defined on the Stiefel and Grassmann manifolds. We also establish an asymptotic independent normal approximation for the distribution of the Euclidean parameters corresponding to the uniform distribution on the Stiefel manifold. In Chapter 3, we present polar expansion, a general approach to Monte Carlo simulation from probability distributions on the Stiefel manifold. When combined with modern Markov chain Monte Carlo software, polar expansion allows for routine and flexible posterior inference in models with orthogonal matrix parameters. Chapter 4 addresses prior distributions for structured orthogonal matrices. We introduce an approach to constructing prior distributions for structured orthogonal matrices which leads to tractable posterior simulation via polar expansion. We state two main results which support our approach and offer a new perspective on approximating the entries of random orthogonal matrices.
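The two maps mentioned in this abstract can be illustrated in a few lines of NumPy. The sketch below is only a numerical illustration under stated assumptions: the Cayley transform is written in one common convention, $(I + A)^{-1}(I - A)$ for skew-symmetric $A$, which may differ from the dissertation's, and the polar factor is computed from a thin SVD as the orthogonal part of an unconstrained matrix. Function names are hypothetical.

```python
import numpy as np

def cayley(a):
    """Map a skew-symmetric matrix A to the orthogonal matrix (I + A)^{-1} (I - A)."""
    eye = np.eye(a.shape[0])
    return np.linalg.solve(eye + a, eye - a)

def polar_factor(x):
    """Orthogonal factor Q of the polar decomposition X = Q S, via the thin SVD."""
    u, _, vt = np.linalg.svd(x, full_matrices=False)
    return u @ vt  # equals X (X^T X)^{-1/2} when X has full column rank

rng = np.random.default_rng(3)

# Cayley transform: Euclidean (skew-symmetric) parameters to an orthogonal matrix.
b = rng.standard_normal((4, 4))
rotation = cayley(b - b.T)
print(np.allclose(rotation.T @ rotation, np.eye(4)))

# Polar factor: an unconstrained 6 x 3 matrix mapped to a point on the Stiefel manifold.
x = rng.standard_normal((6, 3))
q = polar_factor(x)
print(np.allclose(q.T @ q, np.eye(3)))
```

The appeal of both maps is that sampling can run on unconstrained Euclidean quantities (`a` or `x`), with the orthogonality constraint enforced by the transformation.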

Item Open Access Uncertainty Quantification in Earth System Models Using Polynomial Chaos Expansions (2017) Li, Guotu

This work explores stochastic responses of various earth system models to different random sources, using polynomial chaos (PC) approaches. Three earth system models are considered: the HYbrid Coordinate Ocean Model (HYCOM, an ocean general circulation model (OGCM)) for the study of ocean circulation in the Gulf of Mexico (GoM); the Unified Wave INterface - Coupled Model (UWIN-CM, a dynamically coupled atmosphere-wave-ocean system) for Hurricane Earl (2010) modeling; and the earthquake seismology model for Bayesian inference of fault plane configurations.

In the OGCM study, we aim at analyzing the combined impact of uncertainties in initial conditions and wind forcing fields on ocean circulation using PC expansions. Empirical Orthogonal Functions (EOF) are used to represent both spatial perturbations of initial condition and space-time wind forcing fields, namely in the form of a superposition of modal components with uniformly distributed random amplitudes. The forward deterministic HYCOM simulations are used to propagate input uncertainties in ocean circulation in the GoM during the 2010 Deepwater Horizon (DWH) oil spill, and to generate a realization ensemble based on which PC surrogate models are constructed for both localized and field quantities of interest (QoIs), focusing specifically on Sea Surface Height (SSH) and Mixed Layer Depth (MLD). These PC surrogate models are constructed using Basis Pursuit DeNoising (BPDN) methodology, and their performance is assessed through various statistical measures. A global sensitivity analysis is then performed to quantify the impact of individual random sources as well as their interactions on ocean circulation. At the basin scale, SSH in the deep GoM is mostly sensitive to initial condition perturbations, while over the shelf it is sensitive to wind forcing perturbations. On the other hand, the basin MLD is almost exclusively sensitive to wind perturbations. For both quantities, the two random sources (initial condition and wind forcing) of uncertainties have limited interactions. Finally, computations indicate that whereas local quantities can exhibit complex behavior that necessitates a large number of realizations to build PC surrogate models, the modal analysis of field sensitivities can be suitably achieved with a moderate size ensemble.
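The workflow in this paragraph (build a PC surrogate from an ensemble, then read sensitivities off its coefficients) can be sketched on a toy problem. The example below assumes two independent inputs uniform on $[-1, 1]$, a Legendre basis, and an ordinary least-squares fit in place of the Basis Pursuit DeNoising used in the thesis; the toy model stands in for an expensive simulator and is entirely made up.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(xi, multi_indices):
    """Evaluate tensor-product Legendre polynomials at the sample points xi."""
    cols = []
    for alpha in multi_indices:
        col = np.ones(xi.shape[0])
        for dim, deg in enumerate(alpha):
            col *= legendre.legval(xi[:, dim], [0.0] * deg + [1.0])
        cols.append(col)
    return np.column_stack(cols)

def fit_pc(xi, y, degree):
    """Least-squares PC coefficients for all multi-indices of total degree <= degree."""
    dim = xi.shape[1]
    multi_indices = [a for a in itertools.product(range(degree + 1), repeat=dim)
                     if sum(a) <= degree]
    coeffs, *_ = np.linalg.lstsq(legendre_basis(xi, multi_indices), y, rcond=None)
    return multi_indices, coeffs

def first_order_sobol(multi_indices, coeffs):
    """First-order Sobol indices from PC coefficients (uniform inputs on [-1, 1])."""
    dim = len(multi_indices[0])
    # E[P_n(xi)^2] = 1 / (2n + 1) under the uniform density on [-1, 1].
    norms = np.array([np.prod([1.0 / (2 * d + 1) for d in a]) for a in multi_indices])
    contrib = coeffs**2 * norms
    total = sum(c for a, c in zip(multi_indices, contrib) if any(a))
    return np.array([
        sum(c for a, c in zip(multi_indices, contrib) if a[i] > 0 and sum(a) == a[i]) / total
        for i in range(dim)
    ])

def toy_model(xi):
    # Hypothetical stand-in for a forward simulation.
    return 1.0 + 2.0 * xi[:, 0] + 0.5 * xi[:, 1] ** 2 + 0.3 * xi[:, 0] * xi[:, 1]

rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(200, 2))   # realization ensemble
indices, pc_coeffs = fit_pc(samples, toy_model(samples), degree=2)
sobol = first_order_sobol(indices, pc_coeffs)
print(sobol.round(4))
```

Because the toy model is itself a degree-2 polynomial, the surrogate is exact here; the small remainder after summing the first-order indices is the interaction between the two inputs.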

The HYCOM simulations in the OGCM study focus only on the ocean circulation and ignore the oceanic feedback (e.g., momentum, energy, and humidity) to the atmosphere. A more elaborate analysis is consequently performed to understand the atmospheric dynamics in a fully coupled atmosphere-wave-ocean system. In particular, we explore the stochastic evolution of Hurricane Earl (2010) in response to uncertainties stemming from random perturbations in the storm's initial size, strength and rotational stretch. To this end, the UWIN-CM framework is employed as the forecasting system, which is used to propagate input uncertainties and generate an ensemble of realizations. PC surrogate models are constructed for the time evolution of both maximum wind speed and minimum sea level pressure (SLP). These PC surrogates provide statistical insight into the probability distributions of model responses throughout the simulation time span. Statistical analysis of the rapid intensification (RI) process suggests that storms with enhanced initial intensity and counter-clockwise rotation perturbations are more likely to undergo an RI process. In addition, the RI process seems mostly sensitive to the mean wind strength and rotational stretch, rather than to storm size and asymmetric wind amplitude perturbations. This is consistent with the global sensitivity analysis of the PC surrogate models. Finally, we combine parametric storm perturbations with global stochastic kinetic energy backscatter (SKEBS) forcing in UWIN-CM simulations, and conclude that whereas the storm track is substantially influenced by global perturbations, it is only weakly influenced by the properties of the initial storm.

The PC framework not only provides easy access to traditional statistical insights and global sensitivity indices, but also reduces the computational burden of sampling the system response, as is required for instance in Bayesian inference. These two advantages of PC approaches are well demonstrated in the study of the earthquake seismology model's response to random fault plane configurations. The PC statistical analysis suggests that the hypocenter location plays a dominant role in earthquake ground motion responses (in terms of peak ground velocities, PGVs), while the elliptical patch properties have only a secondary influence. In addition, our PC-based Bayesian analysis successfully identified the most likely fault plane configuration with respect to the chosen ground motion prediction equation (GMPE) curve, i.e., the hypocenter is more likely to be in the bottom right quadrant of the fault plane and the elliptical patch is more likely to be centered in the bottom left quadrant. To incorporate additional physical restrictions on fault plane configurations, a novel restricted sampling methodology is introduced. The results indicate that the restricted inference is more physically sensible, while remaining plausibly consistent with findings from the unrestricted inference.
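To make the computational argument concrete, the sketch below runs a random-walk Metropolis sampler in which each likelihood evaluation calls a cheap polynomial surrogate rather than a full forward simulation. It is a minimal sketch under stated assumptions: the one-parameter `surrogate`, the observed value, the noise level and the uniform prior bounds are all hypothetical, and the hard prior bound is only a generic way of excluding inadmissible parameter values, not the restricted sampling methodology developed in the thesis.

```python
import numpy as np


def surrogate(theta):
    """Hypothetical PC surrogate (assumption): a cheap polynomial in one parameter."""
    return 1.0 + 2.0 * theta


def log_posterior(theta, y_obs=2.0, sigma=0.2, lower=-1.0, upper=1.0):
    """Gaussian log-likelihood on the surrogate output plus a uniform prior."""
    if theta < lower or theta > upper:
        return -np.inf  # outside the admissible range: zero prior probability
    residual = y_obs - surrogate(theta)
    return -0.5 * (residual / sigma) ** 2


def metropolis(n_steps=20000, step=0.2, seed=0):
    """Random-walk Metropolis sampler; every step costs one surrogate call."""
    rng = np.random.default_rng(seed)
    theta = 0.0
    logp = log_posterior(theta)
    chain = np.empty(n_steps)
    for i in range(n_steps):
        proposal = theta + step * rng.normal()
        logp_new = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < logp_new - logp:
            theta, logp = proposal, logp_new
        chain[i] = theta
    return chain


chain = metropolis()
posterior_mean = chain[2000:].mean()  # discard the first 2000 steps as burn-in
print("posterior mean:", posterior_mean)
```

Because the sampler needs tens of thousands of model evaluations, replacing the forward model with a surrogate is what keeps this kind of inference affordable when a single simulation is expensive.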