dc.contributor.advisor |
West, Mike |
|
dc.contributor.author |
de Oliveira Sales, Ana Paula |
|
dc.date.accessioned |
2012-05-29T16:36:52Z |
|
dc.date.available |
2012-11-25T05:30:16Z |
|
dc.date.issued |
2011 |
|
dc.identifier.uri |
https://hdl.handle.net/10161/5617 |
|
dc.description.abstract |
<p>I consider the problem of clustering multiple related groups of data. My approach
entails mixture models in the context of hierarchical Dirichlet processes, focusing
on their ability to perform inference on the unknown number of components in the mixture,
as well as to facilitate the sharing of information and borrowing of strength across
the various data groups. Here, I build upon the hierarchical Dirichlet process model
proposed by Müller <italics>et al.</italics> (2004), revising some relevant aspects
of the model, as well as improving the MCMC sampler's convergence by combining local
Gibbs sampler moves with global Metropolis-Hastings split-merge moves. I demonstrate
the strengths of my model by employing it to cluster both synthetic and real datasets.</p>
|
|
dc.subject |
Statistics |
|
dc.subject |
Bayesian statistics |
|
dc.subject |
Clustering |
|
dc.subject |
Dirichlet process |
|
dc.subject |
Hierarchical Dirichlet process |
|
dc.subject |
Nonparametric Bayesian models |
|
dc.title |
Clustering Multiple Related Datasets with a Hierarchical Dirichlet Process |
|
dc.type |
Master's thesis |
|
dc.department |
Statistical Science |
|
duke.embargo.months |
6 |
|