Modeling and Methodological Advances in Causal Inference

This thesis presents several novel modeling and methodological advances in causal inference. First, we investigate the use of propensity score weighting for covariate adjustment in randomized trials. We introduce the class of balancing weights and study its theoretical properties. We demonstrate that the resulting estimator is asymptotically equivalent to the analysis of covariance (ANCOVA) estimator and derive a closed-form variance estimator. We further recommend the overlap weighting estimator based on its semiparametric efficiency and good finite-sample performance. Next, we focus on comparative effectiveness studies with survival outcomes. Rather than coupling propensity score weighting with a Cox proportional hazards model, we follow a "once for all" approach and construct pseudo-observations of the censored outcomes. We study the theoretical properties of the propensity score weighting estimator based on pseudo-observations and provide closed-form variance estimators. The third contribution lies in the domain of causal mediation analysis, which studies how much of the treatment effect is mediated or explained through a given intermediate variable. Existing approaches are not directly applicable to scenarios where both the mediator and the outcome are measured on sparse and irregular time grids. We propose a causal mediation framework that treats the sparse and irregular data as realizations of smooth processes, and we provide the assumptions for nonparametric identification. We also provide a functional principal component analysis (FPCA) approach for estimation and carry out inference under a Bayesian paradigm. Furthermore, we study how to achieve double robustness with machine learning approaches. We develop a new algorithm that learns doubly robust representations in observational studies. The proposed method learns the low-dimensional representations and the balancing weights simultaneously.
Lastly, we study how to build a robust prediction model by exploiting causal relationships. From a causal perspective, we argue that robust models should capture stable causal relationships as opposed to spurious correlations. We propose a causal transfer random forest method that learns the stable causal relationships efficiently from large-scale observational data and a small amount of randomized data. We provide theoretical justifications and validate the algorithm empirically with synthetic experiments and real-world prediction tasks.

In summary, this thesis makes contributions to the following three major areas in causal inference: (i) propensity score weighting methods for randomized experiments and observational studies, covering (a) randomized controlled trials (Chapter 2) and (b) survival outcomes (Chapter 3); (ii) causal mediation analysis with sparse and irregular longitudinal data (Chapter 4); (iii) machine learning methods for causal inference, covering (a) double robustness (Chapter 5) and (b) the causal transfer random forest (Chapter 6).
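
As a rough illustration of the overlap weighting idea recommended for covariate adjustment, the sketch below computes a weighted difference in means in which treated units receive weight 1 - e(x) and control units receive weight e(x), where e(x) is the propensity score. This is a minimal sketch and not the thesis's implementation: the function name and the toy inputs are hypothetical, and the propensity scores are assumed to be supplied already (in practice they would be estimated, for example with a logistic regression on baseline covariates).

```python
def overlap_weighted_effect(y, z, e):
    """Overlap-weighted difference in means (normalized weights).

    y: outcomes; z: treatment indicators (1 = treated, 0 = control);
    e: propensity scores P(Z = 1 | X), assumed to be given (hypothetical input).
    """
    num1 = den1 = num0 = den0 = 0.0
    for yi, zi, ei in zip(y, z, e):
        if zi == 1:
            w = 1.0 - ei          # treated units are weighted by 1 - e(x)
            num1 += w * yi
            den1 += w
        else:
            w = ei                # control units are weighted by e(x)
            num0 += w * yi
            den0 += w
    # Normalizing by the sum of weights within each arm gives weighted means.
    return num1 / den1 - num0 / den0


# Toy data (made up for illustration): with a constant propensity score of 0.5,
# the estimate reduces to the unadjusted difference in means.
estimate = overlap_weighted_effect(
    y=[3.0, 5.0, 1.0, 2.0],
    z=[1, 1, 0, 0],
    e=[0.5, 0.5, 0.5, 0.5],
)
```

With a constant propensity score, as in the toy call, every unit gets the same weight, so the result is the simple difference in arm means; covariate adjustment enters only once the scores vary with the covariates.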





Zeng, Shuxi (2021). Modeling and Methodological Advances in Causal Inference. Dissertation, Duke University. Retrieved from


Duke's student scholarship is made available to the public using a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.