Browsing Masters Theses by Department "Statistical Science"
Now showing items 1-20 of 46

A Bayesian Dirichlet-Multinomial Test for Cross-Group Differences
(2016) Testing for differences within data sets is an important issue across various applications. Our work is primarily motivated by the analysis of microbiomial composition, which has been increasingly relevant and important ...
A Bayesian Forward Simulation Approach to Establishing a Realistic Prior Model for Complex Geometrical Objects
(2018) Geology is a descriptive science, which makes quantification difficult. We develop a Bayesian forward simulation approach to formulate a realistic prior model for geological images using the Approximate Bayesian ...
A Bayesian Model for Nucleosome Positioning Using DNase-seq Data
(2015) As fundamental structural units of the chromatin, nucleosomes are involved in virtually all aspects of genome function. Different methods have been developed to map genome-wide nucleosome positions, including MNase-seq and ...
A Bayesian Strategy to the 20 Question Game with Applications to Recommender Systems
(2017) In this paper, we develop an algorithm that utilizes a Bayesian strategy to determine a sequence of questions to play the 20 Question game. The algorithm is motivated with an application to active recommender systems. We ...
A Privacy Preserving Algorithm to Release Sparse High-dimensional Histograms
(2017) Differential privacy (DP) aims to design methods and algorithms that satisfy rigorous notions of privacy while simultaneously providing utility with valid statistical inference. More recently, an emphasis has been placed ...
A Tapered Pareto-Poisson Model for Extreme Pyroclastic Flows: Application to the Quantification of Volcano Hazards
(2015) This paper intends to discuss the problems of parameter estimation in a proposed tapered Pareto-Poisson model for the assessment of large pyroclastic flows, which are essential in quantifying the size and risk of volcanic ...
A Theory of Statistical Inference for Ensuring the Robustness of Scientific Results
(2018) Inference is the process of using facts we know to learn about facts we do not know. A theory of inference gives assumptions necessary to get from the former to the latter, along with a definition for and summary of the ...
An Analysis of NBA Spatio-Temporal Data
(2017) This project examines the utility of spatiotemporal tracking data from professional basketball games by fitting models predicting whether a player will make a shot. The first part of the project involved the exploration ...
Applied Dynamic Factor Analysis for Macroeconomic Forecasting
(2018) The use of dynamic factor analysis in statistical modeling has broad utility across an array of applications. This paper presents a novel hierarchical structure suited to a particular class of predictive problems - those ...
Bayesian Density Regression With a Jump Discontinuity at a Given Threshold
(2019) Standard regression discontinuity design usually concentrates on the causal effects by assigning a threshold above or below which an intervention is assigned. By comparing the real values of observations near the threshold, ...
Bayesian Inference Via Partitioning Under Differential Privacy
(2018) In this thesis, I develop differentially private methods to report posterior probabilities and posterior quantiles of linear regression coefficients. I accomplish this by randomly partitioning the data, taking an intermediate ...
Bayesian Models for Relating Gene Expression and Morphological Shape Variation in Sea Urchin Larvae
(2012) A general goal of biology is to understand how two or more sets of traits in an organism are related - for example, disease state and genetics, physiology and behavior, or phenotypic variation and gene function. Many of ...
Bayesian Statistical Models of Cell-Cycle Progression at Single-Cell and Population Levels
(2014) Cell division is a biological process fundamental to all life. One aspect of the process that is still under investigation is whether or not cells in a lineage are correlated in their cell-cycle progression. Data on cell-cycle ...
Classical Music Composition Using Hidden Markov Models
(2017) Hidden Markov Models are a widely used class of probabilistic models for sequential data that have found particular success in areas such as speech recognition. Algorithmic composition of music has a long history and with ...
Clustering Multiple Related Datasets with a Hierarchical Dirichlet Process
(2011) I consider the problem of clustering multiple related groups of data. My approach entails mixture models in the context of hierarchical Dirichlet processes, focusing on their ability to perform inference on the unknown ...
Clustering-Enhanced Stochastic Gradient MCMC for Hidden Markov Models
(2019) MCMC algorithms for hidden Markov models, which often rely on the forward-backward sampler, suffer with large sample size due to the temporal dependence inherent in the data. Recently, a number of approaches have been developed ...
Exploiting Big Data in Logistics Risk Assessment via Bayesian Nonparametrics
(2014) In cargo logistics, a key performance measure is transport risk, defined as the deviation of the actual arrival time from the planned arrival time. Neither earliness nor tardiness is desirable for the customer and freight ...
Forecasting the Term Structure of Interest Rates: A Bayesian Dynamic Graphical Modeling Approach
(2019) This thesis addresses the financial econometric problem of forecasting the term structure of interest rates by using classes of Dynamic Dependence Network Models (DDNMs). This Bayesian econometric framework defines structured ...
Gaussian beta process
(2014) This thesis presents a new framework for constituting a group of dependent completely random measures, unifying and extending methods in the literature. The dependent completely random measures are constructed based on a ...
Improving the Modeling of Government Surveillance Data: a Case Study on Malaria in the Brazilian Amazon
(2013) The study of the effect of the environment (e.g., climate and land use) on disease typically relies on aggregate disease data collected by the government surveillance network. The usual approach to analyze these data, however, ...