Nonparametric Bayes Modeling of Populations of Networks
Date
2017-06-27
Authors
Journal Title
Journal ISSN
Volume Title
Repository Usage Stats
views
downloads
Citation Stats
Abstract
© 2017 American Statistical Association Replicated network data are increasingly available in many research fields. For example, in connectomic applications, interconnections among brain regions are collected for each patient under study, motivating statistical models which can flexibly characterize the probabilistic generative mechanism underlying these network-valued data. Available models for a single network are not designed specifically for inference on the entire probability mass function of a network-valued random variable and therefore lack flexibility in characterizing the distribution of relevant topological structures. We propose a flexible Bayesian nonparametric approach for modeling the population distribution of network-valued data. The joint distribution of the edges is defined via a mixture model that reduces dimensionality and efficiently incorporates network information within each mixture component by leveraging latent space representations. The formulation leads to an efficient Gibbs sampler and provides simple and coherent strategies for inference and goodness-of-fit assessments. We provide theoretical results on the flexibility of our model and illustrate improved performance—compared to state-of-the-art models—in simulations and application to human brain networks. Supplementary materials for this article are available online.
Type
Department
Description
Provenance
Subjects
Citation
Permalink
Published Version (Please cite this version)
Publication Info
Durante, D, DB Dunson and JT Vogelstein (2017). Nonparametric Bayes Modeling of Populations of Networks. Journal of the American Statistical Association. pp. 1–15. 10.1080/01621459.2016.1219260 Retrieved from https://hdl.handle.net/10161/15590.
This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.
Collections
Scholars@Duke
David B. Dunson
My research focuses on developing new tools for probabilistic learning from complex data - methods development is directly motivated by challenging applications in ecology/biodiversity, neuroscience, environmental health, criminal justice/fairness, and more. We seek to develop new modeling frameworks, algorithms and corresponding code that can be used routinely by scientists and decision makers. We are also interested in new inference framework and in studying theoretical properties of methods we develop.
Some highlight application areas:
(1) Modeling of biological communities and biodiversity - we are considering global data on fungi, insects, birds and animals including DNA sequences, images, audio, etc. Data contain large numbers of species unknown to science and we would like to learn about these new species, community network structure, and the impact of environmental change and climate.
(2) Brain connectomics - based on high resolution imaging data of the human brain, we are seeking to developing new statistical and machine learning models for relating brain networks to human traits and diseases.
(3) Environmental health & mixtures - we are building tools for relating chemical and other exposures (air pollution etc) to human health outcomes, accounting for spatial dependence in both exposures and disease. This includes an emphasis on infectious disease modeling, such as COVID-19.
Some statistical areas that play a prominent role in our methods development include models for low-dimensional structure in data (latent factors, clustering, geometric and manifold learning), flexible/nonparametric models (neural networks, Gaussian/spatial processes, other stochastic processes), Bayesian inference frameworks, efficient sampling and analytic approximation algorithms, and models for "object data" (trees, networks, images, spatial processes, etc).
Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.