Machine Learning for Uncertainty with Application to Causal Inference
Effective decision making requires understanding the uncertainty inherent in a problem. This need spans a wide range of statistical tasks, from deriving an estimator to training a predictive model. In this thesis, I devote three chapters to new methods for quantifying uncertainty in individual- and population-level inference problems, together with their theory and their applications in causal inference. I also detail the limitations of existing approaches and explain why the proposed methods lead to better performance.
In the first chapter, I introduce a novel approach, Collaborating Networks (CN), to capture predictive distributions in regression. CN defines two neural networks with two distinct loss functions: one network approximates the cumulative distribution function, the other approximates its inverse, and the two are learned together. This gives CN extra flexibility, because it does not need to assume an explicit distribution family such as the Gaussian. Empirically, CN generates sharp intervals with reliable coverage.
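The relationship between the two networks can be written out as a short sketch. The symbols g and f for the two networks, and the level α, are notation assumed here for illustration and are not taken from the thesis:

```latex
% Conditional CDF of the outcome Y given covariates X = x
F(y \mid x) = \Pr(Y \le y \mid X = x)

% One network approximates the CDF, the other its inverse (the quantile function)
g(y, x) \approx F(y \mid x), \qquad f(q, x) \approx F^{-1}(q \mid x)

% If both approximations are good, composing them recovers the quantile level
g\bigl(f(q, x), x\bigr) \approx q, \qquad q \in (0, 1)

% A central (1 - \alpha) predictive interval read off the inverse network
\bigl[\, f(\alpha/2,\, x),\; f(1 - \alpha/2,\, x) \,\bigr]
```

Under this reading, a "sharp" interval is a narrow one, and "reliable coverage" means that the interval contains the observed outcome roughly a fraction 1 - α of the time.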
In the second chapter, I extend CN to estimate individual treatment effects in observational studies. The extension is augmented by a new adjustment scheme, developed through representation learning, which is shown to effectively alleviate the imbalance between treatment groups. Moreover, I suggest a new evaluation criterion that combines the estimated uncertainty with variation in utility functions (e.g., variability in risk tolerance) to support more comprehensive decision making, whereas traditional approaches study only the change in an individual's outcome due to a potential treatment.
In the last chapter, I present an analysis pipeline, implemented as a software package, for causal inference with propensity score weighting. Compared with other pipelines built for similar purposes, this package comprises a wider range of functionalities and provides a comprehensive design and analysis platform that enables users to construct different estimators and assess their uncertainties. It offers six major advantages: it incorporates (i) visualization and diagnostic tools for checking covariate overlap and balance, (ii) a general class of balancing weights, (iii) comparisons for multiple treatments, (iv) simple and augmented (doubly-robust) weighting estimators, (v) nuisance-adjusted sandwich variances, and (vi) ratio estimands for binary and count outcomes.
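To make the idea of a simple weighting estimator under a general class of balancing weights concrete, the following is a minimal Python sketch and not the package described above. It assumes a binary treatment, a logistic propensity model fitted by Newton's method, and weights of the form h(x)/e(x) for treated units and h(x)/(1 - e(x)) for controls. The function names, the simulated data, and the two choices of h are illustrative assumptions. Augmented estimators and sandwich variances are not covered.

```python
import numpy as np

def fit_propensity(X, t, n_iter=25, ridge=1e-8):
    """Fit a logistic propensity model by Newton's method (illustrative helper)."""
    Xb = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        grad = Xb.T @ (t - p)
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb + ridge * np.eye(Xb.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-Xb @ beta))  # estimated propensity scores

def weighted_effect(y, t, e, scheme="ipw"):
    """Normalized (Hajek-type) weighted difference in means.

    scheme="ipw":     h(x) = 1,               targets the whole population
    scheme="overlap": h(x) = e(x)(1 - e(x)),  targets the overlap population
    """
    if scheme == "ipw":
        h = np.ones_like(e)
    elif scheme == "overlap":
        h = e * (1.0 - e)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    w1 = t * h / e                   # weights for treated units
    w0 = (1 - t) * h / (1.0 - e)     # weights for control units
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

# Simulated observational data with a constant true effect of 2 (assumed setup).
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 2))
true_e = 1.0 / (1.0 + np.exp(-(0.8 * X[:, 0] + 0.4 * X[:, 1])))
t = rng.binomial(1, true_e).astype(float)
y = 2.0 * t + X[:, 0] + X[:, 1] + rng.normal(size=n)

e_hat = fit_propensity(X, t)
naive = y[t == 1].mean() - y[t == 0].mean()   # confounded comparison
ipw = weighted_effect(y, t, e_hat, "ipw")
overlap = weighted_effect(y, t, e_hat, "overlap")
print(f"naive={naive:.2f}  ipw={ipw:.2f}  overlap={overlap:.2f}")
```

In this simulation the treatment effect is constant, so both weighting schemes target the same value and differ mainly in variance. With heterogeneous effects they would estimate effects for different target populations.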
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.