Domain-Guided Machine Learning: Trustworthy Methods for Causal Inference and Rare Event Prediction

Date

2025

Abstract

As machine learning becomes increasingly capable, it is being applied across a growing number of fields to tackle complex, real-world problems. This widespread adoption has brought machine learning into high-stakes domains, such as healthcare, public policy, and criminal justice. In well-scoped, data-rich environments, machine learning systems can often operate effectively and autonomously. However, in more nuanced or resource-constrained settings, machine learning struggles to account for contextual subtleties and domain-specific considerations while optimizing for performance.

My research focuses on developing machine learning-aided approaches for high-stakes settings that center domain expertise throughout the analysis pipeline. This overarching goal is addressed across three distinct problem areas: (i) observational causal inference, (ii) causal inference via data fusion, and (iii) rare event prediction. By actively incorporating the insights and experience of domain experts, this work aims to harness the strengths of modern machine learning while promoting interpretability, robustness, and trust in the final results.

In the observational causal inference setting, we present Variable Importance Matching (VIM) as a flexible, accurate, and auditable approach for estimating treatment effects (Chapter 2). We extend this framework to estimate dynamic treatment regimes for patients experiencing severe seizures in the intensive care unit (Chapter 3), showcasing the real-world benefits of incorporating domain expertise into the analysis pipeline.

Moving beyond purely observational data, we explore causal inference in data fusion settings that combine observational and experimental data. We propose a method for partial identification under violations of key assumptions, enabling domain experts to conduct structured sensitivity analyses and assess the robustness of causal conclusions (Chapter 4).

Finally, we turn to the challenge of rare event prediction. We propose a multi-label learning approach that leverages experts' knowledge of related events to improve predictive performance on extremely rare outcomes (Chapter 5).

Subjects

Biostatistics, Statistics, Computer science, causal inference, interpretability, matching methods, partial identification, rare event prediction, trustworthiness

Citation

Lanners, Quinn Michael (2025). Domain-Guided Machine Learning: Trustworthy Methods for Causal Inference and Rare Event Prediction. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/33308.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.