Machine learning for early detection of sepsis: an internal and temporal validation study.



Determine if deep learning detects sepsis earlier and more accurately than other models. To evaluate model performance using implementation-oriented metrics that simulate clinical practice.

Materials and methods

We trained internally and temporally validated a deep learning model (multi-output Gaussian process and recurrent neural network [MGP-RNN]) to detect sepsis using encounters from adult hospitalized patients at a large tertiary academic center. Sepsis was defined as the presence of 2 or more systemic inflammatory response syndrome (SIRS) criteria, a blood culture order, and at least one element of end-organ failure. The training dataset included demographics, comorbidities, vital signs, medication administrations, and labs from October 1, 2014 to December 1, 2015, while the temporal validation dataset was from March 1, 2018 to August 31, 2018. Comparisons were made to 3 machine learning methods, random forest (RF), Cox regression (CR), and penalized logistic regression (PLR), and 3 clinical scores used to detect sepsis, SIRS, quick Sequential Organ Failure Assessment (qSOFA), and National Early Warning Score (NEWS). Traditional discrimination statistics such as the C-statistic as well as metrics aligned with operational implementation were assessed.


The training set and internal validation included 42 979 encounters, while the temporal validation set included 39 786 encounters. The C-statistic for predicting sepsis within 4 h of onset was 0.88 for the MGP-RNN compared to 0.836 for RF, 0.849 for CR, 0.822 for PLR, 0.756 for SIRS, 0.619 for NEWS, and 0.481 for qSOFA. MGP-RNN detected sepsis a median of 5 h in advance. Temporal validation assessment continued to show the MGP-RNN outperform all 7 clinical risk score and machine learning comparisons.


We developed and validated a novel deep learning model to detect sepsis. Using our data elements and feature set, our modeling approach outperformed other machine learning methods and clinical scores.





Published Version (Please cite this version)


Publication Info

Bedoya, Armando D, Joseph Futoma, Meredith E Clement, Kristin Corey, Nathan Brajer, Anthony Lin, Morgan G Simons, Michael Gao, et al. (2020). Machine learning for early detection of sepsis: an internal and temporal validation study. JAMIA open, 3(2). pp. 252–260. 10.1093/jamiaopen/ooaa006 Retrieved from

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.



Armando Diego Bedoya

Assistant Professor of Medicine

Kristin Corey

House Staff

Cara Louise O'Brien

Assistant Professor of Medicine

Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.