Logistic-tree Normal Mixture for Clustering Microbiome Compositions
Date
2023
Abstract
The human microbiome has become an active research topic in recent years, and a common task in the analysis of microbiome data is to cluster microbiome compositions into subtypes. This task serves as an intermediate step toward personalized diagnosis and treatment. However, this seemingly standard task is very challenging in the microbiome composition context because of several key features of such data. Common distance-based algorithms cannot produce reliable results because they do not take into account the heterogeneity of cross-sample variability among the bacterial taxa. In addition, existing model-based approaches are not flexible enough to distinguish the complex within-cluster variation from cross-cluster variation. A useful Bayesian generative model, Dirichlet-tree multinomial mixtures (DTMM), has been proposed to overcome these challenges. DTMM does achieve reliable results, but it is still not flexible enough in characterizing the covariance structure among taxa, and it does not scale well to higher dimensions. Hence we propose another generative model, called the "Logistic-tree normal mixture" (LTNM), that addresses these limitations. Like the Dirichlet-tree, the LTN kernel incorporates a tree-based decomposition, but it models the branching probabilities using a multivariate logistic-normal distribution. Hence it has a rich covariance structure along with computational efficiency through the Pólya-Gamma data augmentation technique. This thesis is organized as follows: first, we briefly review some popular existing algorithms; then we introduce LTNM in detail; next, we conduct an extensive simulation study to compare LTNM with other existing methods; finally, we apply LTNM to a real microbiome study, the American Gut Project (AGP), and analyze the inference results of LTNM.
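To make the tree-based decomposition with logistic-normal branching probabilities concrete, the following is a minimal sketch for a single sample, not the thesis's implementation. It assumes a hypothetical balanced binary tree over four taxa, with an illustrative mean `mu` and covariance `sigma` for the log-odds at the three internal nodes; the function name `ltn_composition` is likewise invented for illustration. The mixture over clusters and the Pólya-Gamma sampler are not shown.

```python
import numpy as np

def ltn_composition(psi):
    """Map log-odds at the 3 internal nodes of a balanced 4-leaf tree
    (root, left child, right child) to the 4 leaf (taxon) probabilities."""
    # Logistic transform: probability of taking the left branch at each node.
    theta = 1.0 / (1.0 + np.exp(-np.asarray(psi, dtype=float)))
    root, left, right = theta
    # A leaf's probability is the product of branching probabilities on its path.
    return np.array([
        root * left,
        root * (1.0 - left),
        (1.0 - root) * right,
        (1.0 - root) * (1.0 - right),
    ])

rng = np.random.default_rng(0)

# Hypothetical parameters: the covariance couples the log-odds across nodes,
# which is the flexibility the logistic-normal adds over independent branches.
mu = np.zeros(3)
sigma = np.array([
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.3],
    [0.0, 0.3, 1.0],
])

psi = rng.multivariate_normal(mu, sigma)  # node log-odds for one sample
p = ltn_composition(psi)                  # composition over the 4 taxa
counts = rng.multinomial(1000, p)         # simulated read counts for the sample
```

Under these assumptions, a nonzero off-diagonal entry in `sigma` induces dependence between branching decisions at different nodes, whereas a Dirichlet-tree treats the branching probabilities at different nodes as independent.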
Citation
Wang, Jiongran (2023). Logistic-tree Normal Mixture for Clustering Microbiome Compositions. Master's thesis, Duke University. Retrieved from https://hdl.handle.net/10161/27843.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.