Browsing by Subject "extremes"
Item Open Access: A Data-Retaining Framework for Tail Estimation (2020) Cunningham, Erika
Modeling of extreme data often involves thresholding, or retaining only the most extreme observations, so that the tail may "speak" and not be overwhelmed by the bulk of the data. We describe a transformation-based framework that allows univariate density estimation to transition smoothly from a flexible, semi-parametric estimation of the bulk into a parametric estimation of the tail without thresholding. In the limit, this framework has desirable theoretical properties: its tail matches that of the selected parametric distribution. We develop three Bayesian models under the framework: one using a logistic Gaussian process (LGP) approach; one using a Dirichlet process mixture model (DPMM); and one using a predictive recursion approximation of the DPMM. Under an assumption of heavy tails, the models produce estimates and intervals for the density, distribution, and quantile functions across the full data range and for the tail index (inverse-power-decay parameter). For each approach, we carry out a simulation study to explore the model's practical usage in non-asymptotic settings, comparing its performance to methods that involve thresholding.
Among the three models proposed, the LGP has the lowest bias through the bulk and generally the highest quantile interval coverage. Compared to thresholding methods, its tail predictions have lower root mean squared error (RMSE) in all scenarios but the most complicated, e.g. a sharp bulk-to-tail transition. The LGP's consistent underestimation of the tail index does not hinder tail estimation in pre-extrapolation to moderate-extrapolation regions but does affect extreme extrapolations.
An interplay between the parametric transform and the natural sparsity of the DPMM sometimes causes the DPMM to favor estimation of the bulk over estimation of the tail. This can be overcome by increasing prior precision on less sparse (flatter) base-measure density shapes. A finite mixture model (FMM), substituted for the DPMM in simulation, proves effective at reducing tail RMSE over thresholding methods in some, but not all, scenarios and quantile levels.
The predictive recursion marginal posterior (PRMP) model is fast and does the best job among proposed models of estimating the tail-index parameter. This allows it to reduce RMSE in extrapolation over thresholding methods in most scenarios considered. However, bias from the predictive recursion contaminates the tail, casting doubt on the PRMP's predictions in tail regions where data should still inform estimation. We recommend the PRMP model as a quick tool for visualizing the marginal posterior over transformation parameters, which can aid in diagnosing multimodality and informing the precision needed to overcome sparsity in the mixture model approach.
In summary, there is not enough information in the likelihood alone to prevent the bulk from overwhelming the tail. However, a model that harnesses the likelihood with a carefully specified prior can allow both the bulk and tail to speak without an explicit separation of the two. Moreover, retaining all of the data under this framework reduces quantile variability, improving prediction in the tails compared to methods that threshold.
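For readers unfamiliar with the thresholding methods used as the comparison baseline above, the following is a minimal sketch of one classical example, the Hill estimator of the tail index. It is not one of the three models proposed in the dissertation, and the abstract does not say which thresholding methods were used; the function name `hill_tail_index`, the choice of `k`, and the simulated Pareto sample are all assumptions made for illustration.

```python
import math
import random


def hill_tail_index(sample, k):
    """Hill estimate of the tail index (power-decay exponent).

    Only the k largest observations are retained; the (k+1)-th largest
    acts as the threshold, so the bulk of the data is discarded.
    """
    if not 1 <= k < len(sample):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample, reverse=True)
    threshold = ordered[k]  # (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    # Mean log-excess over the threshold estimates 1 / tail index.
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess


# Hypothetical check on simulated heavy-tailed data with true tail index 2.
rng = random.Random(0)
sample = [rng.paretovariate(2.0) for _ in range(20000)]
estimate = hill_tail_index(sample, k=2000)
print(f"Hill estimate of the tail index: {estimate:.3f}")
```

The estimate depends on `k`, i.e. on where the threshold is placed, and 90% of the sample here plays no role in it. That sensitivity and loss of data are what a data-retaining framework aims to avoid.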
Item Open Access: Intermittency and Irreversibility in the Soil-Plant-Atmosphere System (2009) Rigby, James
The hydrologic cycle may be described in essence as the process of water rising and falling in its various phases between land and atmosphere. In this minimal description of the hydrologic cycle two features come into focus: intermittency and irreversibility. In this dissertation intermittency and irreversibility are investigated broadly in the soil-plant-atmosphere system. The theory of intermittency and irreversibility is addressed here in three ways: (1) through its effect on components of the soil-plant-atmosphere system, (2) through development of a measure of the degree of irreversibility in time-series, and (3) by the investigation of the dynamical sources of this intermittency. First, soil infiltration and spring frost risk are treated as two examples of hydrologic intermittency with very different characters and implications for the soil-plant system. An investigation of the water budget in simplified soil moisture models reveals that simple bucket models of infiltration perform well against more accurate representation of intra-storm infiltration dynamics in determining the surface water partitioning. Damaging spring frost is presented as a "biologically-defined extreme event" and thus as a more subtle form of hydrologic intermittency. This work represents the first theoretical development of a biologically-defined extreme and highlights the importance of the interplay between daily temperature mean and variance in determining the changes in damaging frost risk in a warming climate. Second, a statistical measure of directionality/asymmetry is developed for stationary time-series based on analogies with the theory of nonequilibrium thermodynamics. This measure is then applied to a set of DNA sequences as an example of a discrete sequence with limited state-space.
The DNA sequences are found to be statistically asymmetric, and the local degree of asymmetry is found to be a reliable indicator of the coding/noncoding status of the DNA segment. Third, the phenomenology of rainfall occurrence is compared with canonical examples of dynamical intermittency to determine whether these simple dynamical features may display a dominant signature in rainfall processes. Summer convective rainfall is found to be broadly consistent with Type-III intermittency. Following on this result, we studied daytime atmospheric boundary layer dynamics with a view toward developing simplified models that may further elucidate the interaction between land surface conditions and convective rainfall triggering.
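As a rough illustration of what a directionality measure on a discrete sequence can look like, the sketch below compares how often each adjacent symbol pair occurs when a sequence is read forward versus backward, using a Kullback-Leibler divergence. This is an assumed, simplified stand-in and not the measure developed in the dissertation, whose definition the abstract does not give; the function name `pair_asymmetry`, the pseudocount, and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def pair_asymmetry(sequence, alphabet="ACGT", pseudocount=0.5):
    """KL divergence (in nats) between forward and time-reversed
    adjacent-pair frequencies of a discrete sequence.

    A value of 0 means the pair statistics look the same in both reading
    directions; larger values indicate stronger directionality.
    Pairs containing symbols outside `alphabet` are ignored.
    """
    counts = Counter(zip(sequence, sequence[1:]))
    pairs = [(a, b) for a in alphabet for b in alphabet]
    # The pseudocount keeps every probability strictly positive.
    total = sum(counts[p] + pseudocount for p in pairs)
    divergence = 0.0
    for a, b in pairs:
        forward = (counts[(a, b)] + pseudocount) / total
        backward = (counts[(b, a)] + pseudocount) / total
        divergence += forward * math.log(forward / backward)
    return divergence


# Toy inputs (not real DNA): a one-directional cycle and a sequence
# constructed so that its pair counts are the same in both directions.
cyclic = "ACG" * 100
symmetric = "ACGT" * 50 + "TGCA" * 50
print(pair_asymmetry(cyclic), pair_asymmetry(symmetric))
```

Evaluating such a quantity in a sliding window would give a local degree of asymmetry along a sequence, which is the kind of signal the abstract reports as indicative of coding versus noncoding segments.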