Latent protein trees

Unbiased, label-free proteomics is becoming a powerful technique for measuring protein expression in almost any biological sample. The output of these measurements after preprocessing is a collection of features and their associated intensities for each sample. Subsets of features within the data are from the same peptide, subsets of peptides are from the same protein, and subsets of proteins are in the same biological pathways; these data therefore have the potential for very complex and informative correlational structure. Recent attempts to use these data often focus on identifying single features that are associated with a particular phenotype relevant to the experiment. However, to date, there have been no published approaches that directly model what we know to be multiple different levels of correlation structure. Here we present a hierarchical Bayesian model that is specifically designed to model such correlation structure in unbiased, label-free proteomics. This model uses partial identification information from peptide sequencing and database lookup, as well as the observed correlation in the data, to appropriately compress features into latent proteins and to estimate their correlation structure. We demonstrate the effectiveness of the model using artificial/benchmark data and in the context of a series of proteomics measurements of blood plasma from a collection of volunteers who were infected with two different strains of viral influenza. © Institute of Mathematical Statistics, 2013.
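
The nested structure described in the abstract (features within peptides, peptides within proteins) can be illustrated with a small simulation. The sketch below is not the authors' hierarchical Bayesian model; it is a toy generative example under stated assumptions. The layout (2 proteins, 2 peptides each, 2 features each), the noise levels, and the names `simulate` and `pearson` are all hypothetical and chosen only for illustration.

```python
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_samples=2000, seed=0):
    """Toy data: each feature = protein signal + peptide noise + feature noise.

    Returns a dict keyed by (protein, peptide, feature) indices, each value
    being a list of n_samples simulated intensities. All sizes and noise
    scales are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    features = {}
    for p in range(2):
        protein = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        for q in range(2):
            peptide = [z + rng.gauss(0.0, 0.5) for z in protein]
            for f in range(2):
                features[(p, q, f)] = [z + rng.gauss(0.0, 0.5) for z in peptide]
    return features


if __name__ == "__main__":
    data = simulate()
    # Same peptide, same protein, and different protein, respectively.
    print("same peptide:     ", round(pearson(data[(0, 0, 0)], data[(0, 0, 1)]), 2))
    print("same protein:     ", round(pearson(data[(0, 0, 0)], data[(0, 1, 0)]), 2))
    print("different protein:", round(pearson(data[(0, 0, 0)], data[(1, 0, 0)]), 2))
```

In this toy setting, features sharing a peptide correlate most strongly, features sharing only a protein correlate less, and features from different proteins are roughly uncorrelated. This layering is the kind of multi-level correlation structure the abstract refers to.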

Publication Info

Henao, R, JW Thompson, MA Moseley, GS Ginsburg, L Carin and JE Lucas (2013). Latent protein trees. Annals of Applied Statistics, 7(2). pp. 691–713. 10.1214/13-AOAS639 Retrieved from

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.



Ricardo Henao

Associate Professor in Biostatistics & Bioinformatics

J. Will Thompson

Adjunct Assistant Professor in the Department of Pharmacology & Cancer Biology

Dr. Thompson's research focuses on the development and deployment of proteomics and metabolomics mass spectrometry techniques for the analysis of biological systems. He served as the Assistant Director of the Proteomics and Metabolomics Shared Resource in the Duke School of Medicine from 2007-2021. He currently maintains collaborations in metabolomics and proteomics research at Duke, and develops new tools for chemical analysis as a Principal Scientist at 908 Devices in Carrboro, NC.


Martin Arthur Moseley

Adjunct Professor in the Department of Cell Biology

Geoffrey Steven Ginsburg

Adjunct Professor in the Department of Medicine

Dr. Geoffrey S. Ginsburg's research interests are in the development of novel paradigms for developing and translating genomic information into medical practice and the integration of personalized medicine into health care.


Lawrence Carin

Professor of Electrical and Computer Engineering

Lawrence Carin earned the BS, MS, and PhD degrees in electrical engineering at the University of Maryland, College Park, in 1985, 1986, and 1989, respectively. In 1989 he joined the Electrical Engineering Department at Brooklyn Polytechnic Institute (now part of NYU) as an Assistant Professor, and became an Associate Professor there in 1994. In September 1995 he joined the Electrical and Computer Engineering (ECE) Department at Duke University, where he is now a Professor. He was ECE Department Chair from 2011-2014, and Vice Provost and Vice President for Research from 2014-2020. He was the Provost at King Abdullah University of Science & Technology (KAUST) from 2020-2023, returning to Duke in 2023. From 2003-2014 he held the William H. Younger Distinguished Professorship, and since 2018 he has held the James L. Meriam Distinguished Professorship. Dr. Carin's research focuses on machine learning (ML) and artificial intelligence (AI). He publishes widely in the main ML/AI forums, and has addressed many applications of AI, including in medicine and security. He was co-founder of the small business Signal Innovations Group, which was acquired by BAE Systems in 2014, and in 2017 he co-founded the company Infinia ML, which was acquired by Aspirion in 2023. He is an IEEE Fellow.

Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.