Tree-based Generative Models with Applications to Microbiome Compositional Data
Abstract
In this work, we study conditional generative models of microbiome compositional data. Specifically, we propose three approaches for the conditional generation of such data: (1) a parametric model called the "logistic-tree normal" (LTN) model, (2) a conditional normalizing flow based on piecewise linear transforms defined with dyadic partition trees on the outcome space, and (3) learning the marginal distribution of the compositions and the conditional distribution of covariates given the compositions separately, followed by conditional sampling of the compositions using Langevin dynamics. We demonstrate through extensive experiments that the LTN model can incorporate complex design features, such as longitudinal sampling, and can be integrated into hierarchical models as a building block for inferential tasks, such as cross-group comparison. The two non-parametric approaches better model the complex joint distributions of the compositions, and in practice, one can choose between them depending on the covariates.
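As a rough illustration of the tree-based idea behind the LTN model, the following minimal sketch draws one composition from a toy logistic-tree normal distribution. It assumes the usual tree-factorized formulation, in which each internal node of a binary tree carries a Gaussian log-odds value that is mapped through the logistic function to a branching probability, and each leaf's proportion is the product of the branching probabilities along its root-to-leaf path. The four-leaf tree, the function name `ltn_sample`, and the parameter values are hypothetical and are not taken from the dissertation.

```python
import numpy as np

def ltn_sample(mu, cov, rng):
    """Draw one composition over 4 leaves from a toy logistic-tree normal model.

    Hypothetical tree: the root splits into internal nodes A and B;
    A splits into leaves 0 and 1; B splits into leaves 2 and 3.
    Each of the three internal nodes (root, A, B) has one log-odds value.
    """
    # Jointly Gaussian log-odds at the three internal nodes.
    psi = rng.multivariate_normal(mu, cov)
    # Logistic transform: probability of taking the left branch at each node.
    theta = 1.0 / (1.0 + np.exp(-psi))
    root, a, b = theta
    # Leaf proportions are products of branching probabilities along each path.
    return np.array([
        root * a,
        root * (1.0 - a),
        (1.0 - root) * b,
        (1.0 - root) * (1.0 - b),
    ])

rng = np.random.default_rng(0)
x = ltn_sample(np.zeros(3), np.eye(3), rng)
print(x, x.sum())
```

By construction the four proportions are positive and sum to one, so the sample lies on the simplex; correlations among taxa would enter through the off-diagonal entries of `cov`.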
Citation
Wang, Zhuoqun (2024). Tree-based Generative Models with Applications to Microbiome Compositional Data. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/32570.
Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.