Probabilistic Time-to-Event Modeling Approaches for Risk Profiling
Modern health data science draws on abundant molecular and electronic health data, giving machine learning the opportunity to build statistical models that support clinical practice. Time-to-event analysis, also called survival analysis, is one of the most representative examples of such models. Models that predict the time of a future event are crucial for risk assessment across a diverse range of applications, e.g., drug development, risk profiling, and clinical trials, and are also relevant in fields such as manufacturing (e.g., equipment monitoring). Existing time-to-event (survival) models have focused primarily on preserving the pairwise ordering of estimated event times (i.e., relative risk).
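The pairwise-ordering criterion is commonly measured with a concordance index. The following is a minimal sketch, not the dissertation's own evaluation code: the function name `concordance_index` and the toy data are assumptions made for illustration. It shows that the score depends only on how predicted risks are ranked, so a model can score perfectly without predicting accurate absolute event times.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs whose predicted risks are correctly ordered.

    A pair (i, j) is comparable when subject i has an observed event
    (events[i] is truthy) and times[i] < times[j]. Higher risk should
    correspond to an earlier event. Tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third subject is censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
well_ordered = [0.9, 0.7, 0.5, 0.1]
print(concordance_index(times, events, well_ordered))
```

Any strictly increasing transformation of the risk scores leaves this index unchanged, which is why ordering-based metrics alone say nothing about calibration or absolute timing.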
In this dissertation, we propose neural time-to-event models that account for calibration and uncertainty, while predicting accurate absolute event times. Specifically, we introduce an adversarial nonparametric model for estimating matched time-to-event distributions for probabilistically concentrated and accurate predictions. We consider replacing the discriminator of the adversarial nonparametric model with a survival-function matching estimator that accounts for model calibration. The proposed estimator can be used as a means of estimating and comparing conditional survival distributions while accounting for the predictive uncertainty of probabilistic models.
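As a rough illustration of what calibration means for a probabilistic time-to-event model, the sketch below checks whether the predicted cumulative probabilities evaluated at the observed event times are approximately uniform. This is a generic diagnostic under a simplifying assumption (all events observed, no censoring), not the survival-function matching estimator proposed here; the name `calibration_curve` and the input values are hypothetical.

```python
def calibration_curve(predicted_cdf_at_event, quantiles):
    """Empirical coverage of predicted quantiles.

    predicted_cdf_at_event[k] is the model's predicted probability that
    subject k's event occurs at or before its observed event time.
    For a calibrated model these values are roughly uniform on [0, 1],
    so the fraction that is <= q should be close to q.
    Assumption: every event is observed (censoring is ignored here).
    """
    n = len(predicted_cdf_at_event)
    if n == 0:
        raise ValueError("need at least one prediction")
    return [sum(1 for p in predicted_cdf_at_event if p <= q) / n
            for q in quantiles]


# Hypothetical predicted CDF values for five subjects.
cdf_values = [0.1, 0.3, 0.5, 0.7, 0.9]
quantiles = [0.2, 0.6, 1.0]
coverage = calibration_curve(cdf_values, quantiles)
# Gaps between coverage and the nominal quantiles indicate miscalibration.
gaps = [abs(c - q) for c, q in zip(coverage, quantiles)]
print(coverage, gaps)
```

Handling censored observations requires additional care, which is part of what motivates calibration-aware estimators for survival models.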
Moreover, we introduce a theoretically grounded, unified counterfactual inference framework for survival analysis, which adjusts for bias from two sources: confounding (from covariates influencing both the treatment assignment and the outcome) and censoring (informative or non-informative). To account for censoring bias, we leverage the proposed flexible, nonparametric probabilistic model for event times. We then formulate a model-free nonparametric hazard ratio metric for comparing treatment effects or leveraging prior randomized real-world experiments in longitudinal studies. Further, the proposed model-free hazard-ratio estimator can be used to identify or stratify heterogeneous treatment effects. For stratifying risk profiles, we formulate an interpretable time-to-event driven clustering method for observations (patients) via a Bayesian nonparametric stick-breaking representation of the Dirichlet Process.
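The stick-breaking representation mentioned above can be sketched in a few lines. This is a minimal illustration of the standard truncated construction, not the clustering method itself: the function name, the truncation level, and the concentration value are assumptions chosen for the example. Each weight is a Beta(1, alpha) fraction of whatever length of the unit stick remains.

```python
import random


def stick_breaking_weights(alpha, num_components, seed=0):
    """Truncated stick-breaking weights for a Dirichlet Process.

    Draw v_k ~ Beta(1, alpha) and set
        w_k = v_k * prod_{l < k} (1 - v_l).
    Returns the list of weights and the stick length left over
    after truncating at num_components.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    weights = []
    remaining = 1.0
    for _ in range(num_components):
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining


# Hypothetical settings: concentration 2.0, truncated at 20 components.
weights, leftover = stick_breaking_weights(alpha=2.0, num_components=20)
print(sum(weights) + leftover)
```

Smaller values of alpha concentrate mass on the first few components, which yields fewer effective clusters; larger values spread mass over more components.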
Finally, through experiments on real-world datasets, consistent improvements in predictive performance and interpretability are demonstrated relative to existing state-of-the-art survival analysis models.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Duke Dissertations