# Browsing by Subject "Causal Inference"

Item Open Access An Investigation into the Bias and Variance of Almost Matching Exactly Methods (2021) Morucci, Marco

The development of interpretable causal estimation methods is a fundamental problem for high-stakes decision settings in which results must be explainable. Matching methods are highly explainable, but often lack the accuracy of black-box nonparametric models for causal effects. In this work, we propose to investigate theoretically the statistical bias and variance of Almost Matching Exactly (AME) methods for causal effect estimation. These methods aim to overcome the inaccuracy of matching by learning, on a separate training dataset, an optimal metric on which to match units. While these methods are both powerful and interpretable, we currently lack an understanding of their statistical properties. In this work we present a theoretical characterization of the finite-sample and asymptotic properties of AME. We show that AME with discrete data has bounded bias in finite samples, and is asymptotically normal and consistent at a root-n rate. Additionally, we show that AME methods for matching on networked data also have bounded bias and variance in finite samples, and achieve asymptotic consistency in sparse enough graphs. Our results can be used to motivate the construction of approximate confidence intervals around AME causal estimates, providing a way to quantify their uncertainty.

Item Open Access Bayesian Estimation and Sensitivity Analysis for Causal Inference (2019) Zaidi, Abbas M

This dissertation aims to explore Bayesian methods for causal inference. In chapter 1, we present an overview of fundamental ideas from causal inference along with an outline of the methodological developments that we hope to tackle. In chapter 2, we develop a Gaussian-process mixture model for heterogeneous treatment effect estimation that leverages the use of transformed outcomes. The approach we will present attempts to improve point estimation and uncertainty quantification relative to past work that has used transformed-variable methods as well as traditional outcome modeling. Earlier work on modeling treatment effect heterogeneity using transformed outcomes has relied on tree-based methods such as single regression trees and random forests. Under the umbrella of non-parametric models, outcome modeling has been performed using Bayesian additive regression trees and various flavors of weighted single trees. These approaches work well when large samples are available, but suffer in smaller samples where results are more sensitive to model misspecification; our method attempts to garner improvements in inference quality via a correctly specified model rooted in Bayesian non-parametrics. Furthermore, while we begin with a model that assumes that the treatment assignment mechanism is known, an extension where it is learnt from the data is presented for applications to observational studies. Our approach is applied to simulated and real data to demonstrate our theorized improvements in inference with respect to two causal estimands: the conditional average treatment effect and the average treatment effect. By leveraging our correctly specified model, we are able to more accurately estimate the treatment effects while reducing their variance.
In chapter 3, we parametrically and hierarchically estimate the average causal effects of different lengths of stay in the Udayan Ghar Program under the assumption that selection into different lengths is based on a set of observed covariates. This program was piloted in New Delhi, India as a means of providing a residential surrogate to vulnerable and at-risk children with the hope of improving their psychological development. We find that the estimated effects on the psychological ideas of self-concept and ego resilience (measured by the standardized Piers-Harris score) increase with the length of the time spent in the program. We are also able to conclude that there are measurable differences between male and female children who spend time in the program. In chapter 4, we supplement the hierarchical dose-response function estimation by introducing a novel sensitivity-analysis and summarization strategy for assessing the robustness of our results to violations of the assumption of unconfoundedness. Finally, in chapter 5, we summarize what this dissertation has achieved, and briefly outline important areas where our work warrants further development.
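
The abstract above refers to transformed outcomes without giving the transformation. A minimal sketch of one commonly used form follows, assuming a binary treatment `w` and a known propensity score `e`: the transformed outcome is `Y* = Y (W - e) / (e (1 - e))`, whose conditional mean equals the conditional average treatment effect under unconfoundedness. The function names and toy data are hypothetical, and this is not the dissertation's Gaussian-process mixture model.

```python
def transformed_outcome(y, w, e):
    """Return Y* = Y * (W - e) / (e * (1 - e)) for a single unit.

    y: observed outcome, w: treatment indicator (0 or 1),
    e: propensity score, assumed known and strictly between 0 and 1.
    """
    if not 0.0 < e < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    return y * (w - e) / (e * (1.0 - e))


def ate_from_transformed(ys, ws, es):
    """Average the transformed outcomes to estimate the average treatment effect."""
    values = [transformed_outcome(y, w, e) for y, w, e in zip(ys, ws, es)]
    return sum(values) / len(values)


# Toy data (hypothetical): treated units have outcome 3, controls have outcome 1,
# and every unit is treated with probability 0.5.
ys = [3.0, 1.0, 3.0, 1.0]
ws = [1, 0, 1, 0]
es = [0.5, 0.5, 0.5, 0.5]
ate_estimate = ate_from_transformed(ys, ws, es)
```

On this toy sample the estimate coincides with the difference in group means; in general the transformed outcome is unbiased but noisy, which is the kind of weakness a model-based approach aims to reduce.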

Item Open Access CAUSAL INFERENCE FOR HIGH-STAKES DECISIONS (2023) Parikh, Harsh J

Causal inference methods are commonly used across domains to aid high-stakes decision-making. The validity of causal studies often relies on strong assumptions that might not be realistic in high-stakes scenarios. Inferences based on incorrect assumptions frequently result in sub-optimal decisions with high penalties and long-term consequences. Unlike prediction or machine learning methods, it is particularly challenging to evaluate the performance of causal methods using just the observed data because the ground truth causal effects are missing for all units. My research presents frameworks to enable validation of causal inference methods in one of the following three ways: (i) auditing the estimation procedure by a domain expert, (ii) studying the performance using synthetic data, and (iii) using placebo tests to identify biases. This work enables decision-makers to reason about the validity of the estimation procedure by thinking carefully about the underlying assumptions. Our Learning-to-Match framework is an auditable-and-accurate approach that learns an optimal distance metric for estimating heterogeneous treatment effects. We augment the Learning-to-Match framework with pharmacological mechanistic knowledge to study the long-term effects of untreated seizure-like brain activities in critically ill patients. Here, the auditability of the estimator allowed neurologists to qualitatively validate the analysis via a chart review. We also propose Credence, a synthetic-data-based framework to validate causal inference methods. Credence simulates data that is stochastically indistinguishable from the observed data while allowing for user-designed treatment effects and selection biases. We demonstrate Credence's ability to accurately assess the relative performance of causal estimation techniques in an extensive simulation study and two real-world data applications.
We also discuss an approach that combines experimental and observational studies. It provides a principled way to test for violations of the no-unobserved-confounders assumption and to estimate treatment effects under such violations.

Item Open Access Communities in Social Networks: Detection, Heterogeneity and Experimentation (2022) Mathews, Heather

The study of network data in the social and health sciences frequently concentrates on understanding how and why connections form. In particular, the task of determining latent mechanisms driving connection has received a lot of attention across statistics, machine learning, and information theory. In social networks, this mechanism often manifests as community structure. As a result, this work provides methods for discovering and leveraging these communities to better understand networks and the data they generate.

We provide three main contributions. First, we present methodology for performing community detection in challenging regimes. Existing literature has focused on modeling the spectral embedding of a network using Gaussian mixture models (GMMs) in scaling regimes where the ability to detect community memberships improves with the size of the network. However, these regimes are not very realistic. As such, we provide tractable methodology motivated by new theoretical results for networks with non-vanishing noise by using GMMs that incorporate truncation and shrinkage effects.

Further, when covariate information is available, often we want to understand how covariates impact connections. It is likely that the effects of covariates on edge formation differ between communities (e.g. age might play a different role in friendship formation in communities across a city). To address this issue, we introduce a latent space network model where coefficients associated with certain covariates can depend on latent community membership of the nodes. We show that ignoring such structure can lead to either over- or under-estimation of covariate importance to edge formation and propose a Markov Chain Monte Carlo approach for simultaneously learning the latent community structure and the community specific coefficients.

Finally, we consider how community structure can impact experimentation. It is evident that communities can act in different ways, and it is natural that this propagates into experimental design. This observation motivates our development of community-informed experimental design. This design recognizes that information between individuals likely flows along within-community edges rather than across-community edges. We demonstrate that this design improves estimation of the global average treatment effect, even when the community structure of the graph needs to be estimated.

Item Open Access Interpretable Almost-Matching Exactly with Instrumental Variables (2019) Liu, Yameng

We aim to create the highest possible quality of treatment-control matches for categorical data in the potential outcomes framework.

The method proposed in this work aims to match units on a weighted Hamming distance, taking into account the relative importance of the covariates. To match units on as many relevant variables as possible, the algorithm creates a hierarchy of covariate combinations on which to match (similar to downward closure), in the process solving an optimization problem for each unit in order to construct the optimal matches. The algorithm uses a single dynamic program to solve all of the units' optimization problems simultaneously. Notable advantages of our method over existing matching procedures are its high-quality interpretable matches, versatility in handling different data distributions that may have irrelevant variables, and ability to handle missing data by matching on as many available covariates as possible. We also adapt the matching framework, using instrumental variables (IV), to the presence of observed categorical confounding that breaks the randomness assumptions, and propose an approximate algorithm that quickly generates high-quality interpretable solutions. We show that our algorithms construct better matches than other existing methods on simulated datasets, and produce interesting results in applications to crime intervention and political canvassing.
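
To make the distance in the abstract above concrete, here is a minimal sketch of a weighted Hamming distance on categorical covariates: the sum of the weights of the covariates on which two units disagree. The brute-force nearest-control search, the function names, and the weights are illustrative assumptions; the abstract describes a dynamic program over covariate subsets, which this sketch does not implement.

```python
def weighted_hamming(a, b, weights):
    """Sum the weights of the covariates on which units a and b differ."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def closest_controls(treated_unit, controls, weights):
    """Return the indices of the control units at minimum weighted Hamming distance."""
    distances = [weighted_hamming(treated_unit, c, weights) for c in controls]
    best = min(distances)
    return [i for i, d in enumerate(distances) if d == best]


# Hypothetical covariate weights: the first covariate matters most.
weights = [3.0, 2.0, 1.0]
treated_unit = ("a", "x", 0)
controls = [
    ("a", "y", 0),  # differs on covariate 2 only, distance 2.0
    ("a", "x", 1),  # differs on covariate 3 only, distance 1.0
    ("b", "x", 0),  # differs on covariate 1 only, distance 3.0
]
matches = closest_controls(treated_unit, controls, weights)
```

Here the treated unit is matched to the control that disagrees only on the least important covariate.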

Item Open Access “Let the Sunshine In”: The Impact of Industry Payment Disclosure on Physician Prescription Behavior (Marketing Science, 2020-01-01) Guo, T; Sriram, S; Manchanda, P

U.S. pharmaceutical companies frequently pay doctors to promote their drugs. This has raised concerns about conflict of interest, which policy makers have attempted to address by introducing payment disclosure laws. However, it is unclear if such disclosure has an effect on physician prescription behavior. We use individual-level claims data from a major provider of health insurance in the United States and employ a difference-in-differences research design to study the effect of the payment disclosure law introduced in Massachusetts in June 2009. The research design exploits the fact that, although physicians operating in Massachusetts were impacted by the legislation, their counterparts in the neighboring states of Connecticut, New York, New Hampshire, and Rhode Island were not. In order to keep the groups of physicians comparable, we restrict our analysis to physicians in the counties that are on the border of these states. We find that the Massachusetts disclosure law resulted in a decline in prescriptions in all three drug classes studied: statins, antidepressants, and antipsychotics. Our findings are robust to alternative control groups, time periods and estimation methods. We also show that the effect is highly heterogeneous across physician groups. Finally, we explore potential mechanisms driving these results.

Item Open Access Machine Learning for Uncertainty with Application to Causal Inference (2022) Zhou, Tianhui

Effective decision making requires understanding the uncertainty inherent in a problem. This covers a wide scope in statistics, from deriving an estimator to training a predictive model.
In this thesis, I will spend three chapters discussing new uncertainty methods developed for solving individual and population level inference problems with their theory and applications in causal inference. I will also detail the limitations of existing approaches and why my proposed methods lead to better performance.

In the first chapter, I will introduce a novel approach, Collaborating Networks (CN), to capture predictive distributions in regression. It defines two neural networks with two distinct loss functions to approximate the cumulative distribution function and its inverse respectively and collectively. This gives CN extra flexibility through bypassing the necessity of assuming an explicit distribution family like Gaussian. Empirically, CN generates sharp intervals with reliable coverage.

In the second chapter, I extend CN to estimate the individual treatment effect in observational studies. It is augmented by a new adjustment scheme developed through representation learning, which is shown to effectively alleviate the imbalance between treatment groups. Moreover, a new evaluation criterion is suggested by combining the estimated uncertainty and variation in utility functions (e.g., variability in risk tolerance) for more comprehensive decision making, while traditional approaches only study an individual’s outcome change due to a potential treatment.

In the last chapter, I will present an analysis pipeline for causal inference with propensity score weighting. Compared to other pipelines for similar purposes, this package comprises a wider range of functionalities to provide an exhaustive design and analysis platform that enables users to construct different estimators and assess their uncertainties. It offers six major advantages: it incorporates (i) visualization and diagnostic tools for checking covariate overlap and balance, (ii) a general class of balancing weights, (iii) comparison for multiple treatments, (iv) simple and augmented (doubly-robust) weighting estimators, (v) nuisance-adjusted sandwich variances, and (vi) ratio estimands for binary and count outcomes.

Item Open Access Multisensory Integration, Segregation, and Causal Inference in the Superior Colliculus (2020) Mohl, Jeffrey Thomas

The environment is sampled by multiple senses, which are woven together to produce a unified perceptual state. However, unifying these senses requires assigning particular signals to the same or different underlying objects or events. Sensory signals originating from the same source should be integrated together, while signals originating from separate sources should be segregated from one another. Each of these computations is associated with different neural encoding strategies, and it is unknown how these strategies interact. Here, we begin to characterize how this problem is solved in the primate brain. First, we developed a behavioral paradigm and applied a computational modeling approach to demonstrate that monkeys, like humans, implement a form of Bayesian causal inference to decide whether two stimuli (one auditory and one visual) originated from the same source. We then recorded single unit neural activity from a representative multisensory brain region, the superior colliculus (SC), while monkeys performed this task. We found that SC neurons encoded either segregated unisensory or integrated multisensory target representations in separate sub-populations of neurons. These responses were well described by a weighted linear combination of unisensory responses which did not account for spatial separation between targets, suggesting that SC sensory responses did not immediately discriminate between common cause and separate cause conditions as predicted by Bayesian causal inference. These responses became less linear as the trial progressed, hinting that such a causal inference may evolve over time. Finally, we implemented a single trial analysis method to determine whether the observed linearity was indicative of true weighted combinations on each trial, or whether this observation was an artifact of pooling data across trials.
We found that initial sensory responses (0-150 ms) were well described by linear models even at the single trial level, but that later sustained (150-600 ms) and saccade period responses were instead better described as fluctuating between encoding either the auditory or visual stimulus alone. We also found that these fluctuations were correlated with behavior, suggesting that they may reflect a convergence from the SC encoding all potential targets to preferentially encoding only a specific target on a given trial. Together, these results demonstrate that non-human primates (like humans) perform an idealized version of Bayesian causal inference, that this inference may depend on separate sub-populations of neurons maintaining either integrated or segregated stimulus representations, and that these responses then evolve over time to reflect more complex encoding rules.

Item Open Access Principled Deep Learning for Healthcare Applications (2023) Assaad, Serge

Healthcare stands to benefit from the advent of deep learning on account of (i) the massive amounts of data generated by the health system and (ii) the ability of deep models to make predictions from complex inputs. This dissertation centers on two applications of deep learning to challenging problems in healthcare.

First, we discuss deep learning for treatment effect/counterfactual estimation in the observational setting, i.e., where the treatment assignment is not randomized (Chapters 2 and 3). For example, we may want to know the causal effect of a drug on a patient's blood pressure. We combine deep learning with classical weighting techniques to estimate average and conditional average treatment effects from observational data. We show theoretical properties of our method, including guarantees about when "balance" can be achieved between treatment groups. We then weaken the typical "ignorability" assumption and generate treatment effect intervals (instead of point-estimates).

Second, we explore the use of deep learning applied to a difficult problem in medical imaging: classifying malignancy from thyroid cytopathology slides (Chapters 4, 5, and 6). The difficulty of this problem arises from the image size, which is typically on the order of tens of gigabytes (i.e., around 3 to 4 orders of magnitude larger than image sizes in popular deep learning architectures). Our approach is a two-step process: (i) automatically finding image regions containing follicular cell groups, (ii) classifying each region and aggregating the predictions. We show that our system works well for mobile phone images of thyroid biopsy slides, and that our system compares favorably with state-of-the-art genetic testing for malignancy.

Finally, after my Ph.D. I plan to enter a career in autonomous driving. As an "epilogue" of this dissertation (Chapter 7), we present a method to make deep learning point-cloud models for autonomous driving which are invariant (or equivariant) to rotations. Intuitively, this is an important requirement -- a rotated bicycle should still be classified as a bicycle, and driving behavior should be independent of direction of travel. However, most deep learning models used in autonomous driving today do not satisfy these properties exactly. We propose a practical model (based on the Transformer architecture) to address this pitfall, and we showcase its performance on point-cloud classification and trajectory forecasting tasks.

Item Open Access Rethinking Nonlinear Instrumental Variables (2019) Li, Chunxiao

Instrumental variable (IV) models are widely used in the social and health sciences in situations where a researcher would like to measure a causal effect but cannot perform an experiment. Formally checking the assumptions of an IV model with a given dataset is impossible, leading many researchers to take as given a linear functional form and two-stage least squares fitting procedure. In this paper, we propose a method for evaluating the validity of IV models using observed data and show that, in some cases, a more flexible nonlinear model can address violations of the IV conditions. We also develop a test that detects violations in the instrument that are present in the observed data. We introduce a new version of the validity check that is suitable for machine learning and provides optimization-based techniques to answer these questions. We demonstrate the method using both simulated data and a real-world dataset.
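
For orientation, the linear two-stage least squares baseline mentioned in the abstract above can be sketched in its simplest case. With one instrument `z`, one treatment `x`, and an intercept, two-stage least squares reduces to the ratio `cov(z, y) / cov(z, x)`. The function names and toy data below are hypothetical, and the sketch illustrates only the linear baseline, not the nonlinear model or the validity test proposed in the paper.

```python
def covariance(u, v):
    """Population covariance of two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n


def iv_estimate(z, x, y):
    """Single-instrument IV estimate of the effect of x on y: cov(z, y) / cov(z, x)."""
    cov_zx = covariance(z, x)
    if cov_zx == 0:
        raise ValueError("instrument is uncorrelated with the treatment")
    return covariance(z, y) / cov_zx


# Toy data (hypothetical): the instrument shifts x, and y responds to x with slope 2.
z = [0, 0, 1, 1]
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0 * xi for xi in x]
effect = iv_estimate(z, x, y)
```

The ratio is undefined when the instrument does not move the treatment, which is why the sketch raises an error in that case.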

Item Open Access Topics and Applications of Weighting Methods in Case-Control and Observational Studies (2019) Li, Fan

Weighting methods have been widely used in statistics and related applications. For example, the inverse probability weighting is a standard approach to correct for survey non-response. The case-control design, frequently seen in epidemiologic or genetic studies, can be regarded as a special type of survey design; analogous inverse probability weighting approaches have been explored when the interest is the association between exposures and the disease (primary analysis) as well as when the interest is the association among exposures (secondary analysis). Meanwhile, in observational comparative effectiveness research, inverse probability weighting has been suggested as a valid approach to correct for confounding bias. This dissertation develops and extends weighting methods for case-control and observational studies.

The first part of this dissertation extends the inverse probability weighting approach for secondary analysis of case-control data. We revisit an inverse probability weighting estimator to offer new insights and extensions. Specifically, we construct its more general form by generalized least squares (GLS). Such a construction allows us to connect the GLS estimator with the generalized method of moments and motivates a new specification test designed to assess the adequacy of the inverse probability weights. The specification test statistic measures the weighted discrepancy between the case and control subsample estimators, and asymptotically follows a Chi-squared distribution under correct model specification. We illustrate the GLS estimator and specification test using a case-control sample of peripheral arterial disease, and use simulations to shed light on the operating characteristics of the specification test. The second part develops a robust difference-in-differences (DID) estimator for estimating causal effect with observational before-after data. Within the DID framework, two common estimation strategies are outcome regression and propensity score weighting. Motivated by a real application in traffic safety research, we propose a new double-robust DID estimator that hybridizes outcome regression and propensity score weighting. We show that the proposed estimator possesses the desirable large-sample robustness property, namely the consistency only requires either one of the outcome model or the propensity score model to be correctly specified. We illustrate the new estimator to study the causal effect of rumble strips in reducing vehicle crashes, and conduct a simulation study to examine its finite-sample performance. The third part discusses a unified framework, the balancing weights, for estimating causal effects in observational studies with multiple treatments. 
These weights incorporate the generalized propensity scores to balance the weighted covariate distribution of each treatment group, all weighted toward a common pre-specified target population. Within this framework, we further develop the generalized overlap weights, constructed as the product of the inverse probability weights and the harmonic mean of the generalized propensity scores. The generalized overlap weights correspond to the target population with the most overlap in covariates between treatments, similar to the population in equipoise in clinical trials. We show that the generalized overlap weights minimize the total asymptotic variance of the nonparametric estimators for the pairwise contrasts within the class of balancing weights. We apply the new weighting method to study the racial disparities in medical expenditure and further examine its operating characteristics by simulations.
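
The construction of the generalized overlap weights described above can be written out directly. The following is a minimal sketch that follows the abstract's wording: for a unit with generalized propensity scores `e_1, ..., e_J`, the weight for treatment `j` is `(1 / e_j)` times the harmonic mean of all `J` scores. The function names and example scores are hypothetical, and any constant factor in the weights is assumed to drop out once the weights are normalized within each treatment group.

```python
def harmonic_mean(values):
    """Harmonic mean of strictly positive values."""
    return len(values) / sum(1.0 / v for v in values)


def generalized_overlap_weight(propensities, treatment):
    """Weight for a unit that received `treatment` (an index into `propensities`).

    propensities: generalized propensity scores e_1..e_J for this unit,
    assumed strictly positive and summing to 1.
    """
    if any(p <= 0.0 for p in propensities):
        raise ValueError("propensity scores must be strictly positive")
    inverse_probability_weight = 1.0 / propensities[treatment]
    return inverse_probability_weight * harmonic_mean(propensities)


# Two-treatment check: with scores (e, 1 - e) the weights are proportional to the
# opposite group's score, which is the binary overlap weight up to a factor of 2.
scores = [0.2, 0.8]
weight_if_first = generalized_overlap_weight(scores, 0)
weight_if_second = generalized_overlap_weight(scores, 1)
```

Units whose scores are near 0 or 1 for some treatment receive a small harmonic mean, so they are down-weighted in favor of units that could plausibly have received any treatment.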