Logistic-tree Normal Mixture for Clustering Microbiome Compositions

Date

2023

Abstract

The human microbiome has become an active research topic in recent years, and a common task in analyzing these data is to cluster microbiome compositions into subtypes. This task serves as an intermediate step toward personalized diagnosis and treatment. However, this seemingly standard task is very challenging in the microbiome composition context due to several key features of such data. Common distance-based algorithms cannot produce reliable results because they do not account for the heterogeneity of cross-sample variability among the bacterial taxa. In addition, existing model-based approaches are not flexible enough to separate the complex within-cluster variation from cross-cluster variation. A useful Bayesian generative model, Dirichlet-tree multinomial mixtures (DTMM), has been proposed to overcome these challenges. DTMM does achieve reliable results, but it is still not flexible enough in characterizing the covariance structure among taxa and does not scale well to higher dimensions. Hence we propose another generative model, called the "Logistic-tree normal mixture" (LTNM), that addresses these limitations. The LTN kernel incorporates the same tree-based decomposition as the Dirichlet-tree, but it models the branching probabilities using a multivariate logistic-normal distribution. It therefore has a rich covariance structure while remaining computationally efficient through the Pólya-Gamma data augmentation technique. This thesis is organized as follows: first, we briefly review some popular existing algorithms; then we introduce LTNM in detail; next, we conduct an extensive simulation study to compare LTNM with other existing methods; finally, we apply LTNM to a real microbiome study, the American Gut Project (AGP), and analyze the resulting inference.
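
As a rough illustration of the tree-based decomposition with logistic-normal branching probabilities described in the abstract, the following Python sketch draws one composition and one count vector. It is a toy example under stated assumptions, not the thesis implementation: the balanced four-taxon tree, the function name `sample_ltn_composition`, and the values of `mu` and `sigma` are all hypothetical, and the mixture layer and Pólya-Gamma sampler are not shown.

```python
import numpy as np


def sample_ltn_composition(mu, sigma, rng):
    """Draw leaf probabilities for a balanced 4-leaf binary tree (toy, hypothetical tree)."""
    # Log-odds of taking the left branch at the 3 internal nodes
    # (root, left child, right child), drawn jointly so nodes can be correlated.
    psi = rng.multivariate_normal(mu, sigma)
    theta = 1.0 / (1.0 + np.exp(-psi))  # inverse logit gives branching probabilities
    root, left, right = theta
    # Each leaf probability is the product of branch probabilities on its root-to-leaf path.
    return np.array([
        root * left,
        root * (1.0 - left),
        (1.0 - root) * right,
        (1.0 - root) * (1.0 - right),
    ])


rng = np.random.default_rng(0)
mu = np.zeros(3)                      # assumed mean log-odds
sigma = np.array([[1.0, 0.5, 0.0],    # assumed covariance across internal nodes
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.0]])
probs = sample_ltn_composition(mu, sigma, rng)
counts = rng.multinomial(1000, probs)  # taxon counts for one sample of depth 1000
```

Because the log-odds at all internal nodes share one multivariate normal, the off-diagonal entries of `sigma` induce dependence among taxa that independent node-wise priors would not capture.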

Subjects

Statistics

Citation

Wang, Jiongran (2023). Logistic-tree Normal Mixture for Clustering Microbiome Compositions. Master's thesis, Duke University. Retrieved from https://hdl.handle.net/10161/27843.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.