dame-flame: A Python Library Providing Fast Interpretable Matching for Causal Inference.
dame-flame is a Python package for performing matching for observational causal inference on datasets containing discrete covariates. This package implements the Dynamic Almost Matching Exactly (DAME) and Fast Large-Scale Almost Matching Exactly (FLAME) algorithms, which match treatment and control units on subsets of the covariates. The resulting matched groups are interpretable, because the matches are made on covariates (rather than, for instance, propensity scores), and high-quality, because machine learning is used to determine which covariates are important to match on. DAME solves an optimization problem that matches units on as many covariates as possible, prioritizing matches on important covariates. FLAME approximates the solution found by DAME via a much faster backward feature selection procedure. The package provides several adjustable parameters to adapt the algorithms to specific applications, and can calculate treatment effects after matching. Descriptions of these parameters, details on estimating treatment effects, and further examples, can be found in the documentation at https://almost-matching-exactly.github.io/DAME-FLAME-Python-Package/
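The idea of matching treatment and control units on subsets of the covariates can be illustrated with a small, self-contained sketch. This is not the dame-flame API or the authors' implementation. It is a simplified, FLAME-like backward procedure in which the order for dropping covariates is supplied by hand (the `drop_order` argument is an assumption standing in for the machine-learned covariate importance the abstract describes). The names `match_on_subsets` and `group_effect`, the `treated` and `outcome` keys, and the toy data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def match_on_subsets(units, covariates, drop_order):
    """Exact-match units on progressively smaller covariate subsets.

    units: list of dicts holding discrete covariates plus 'treated' (0 or 1).
    drop_order: covariates from least to most important. This fixed ranking
    is an assumption; the real algorithms learn importance from data.
    """
    remaining = list(range(len(units)))
    active = list(covariates)
    to_drop = list(drop_order)
    matched_groups = []

    while remaining and active:
        # Bucket the still-unmatched units by their values on the active covariates.
        buckets = defaultdict(list)
        for i in remaining:
            buckets[tuple(units[i][c] for c in active)].append(i)

        # A bucket is a matched group if it holds both treated and control units.
        newly_matched = set()
        for values, members in buckets.items():
            if {units[i]["treated"] for i in members} == {0, 1}:
                matched_groups.append((tuple(active), values, members))
                newly_matched.update(members)

        # Matched units leave the pool; then the least important covariate is dropped.
        remaining = [i for i in remaining if i not in newly_matched]
        if not to_drop:
            break
        active.remove(to_drop.pop(0))

    return matched_groups, remaining


def group_effect(units, members):
    """Difference in mean outcome between treated and control units in one group."""
    treated = [units[i]["outcome"] for i in members if units[i]["treated"] == 1]
    control = [units[i]["outcome"] for i in members if units[i]["treated"] == 0]
    return mean(treated) - mean(control)


# Toy data (fabricated purely for illustration).
units = [
    {"age": 1, "sex": 0, "treated": 1, "outcome": 5.0},
    {"age": 1, "sex": 0, "treated": 0, "outcome": 3.0},
    {"age": 2, "sex": 1, "treated": 1, "outcome": 4.0},
    {"age": 2, "sex": 0, "treated": 0, "outcome": 3.0},
    {"age": 3, "sex": 1, "treated": 1, "outcome": 6.0},
]

groups, unmatched = match_on_subsets(units, ["age", "sex"], ["sex", "age"])
effects = [group_effect(units, members) for _, _, members in groups]

for (matched_on, values, members), effect in zip(groups, effects):
    print(f"matched on {matched_on} = {values}: units {members}, effect {effect}")
print("unmatched units:", unmatched)
```

Each matched group records exactly which covariates its units agree on, which is what makes the groups easy to inspect. In this toy run the first group matches on both covariates and the second only on `age`, after `sex` has been dropped.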
I joined the Department of Computer Science at Duke University in Fall 2015.
I graduated from the University of Pennsylvania with a Ph.D. in Computer and Information Science, where I was advised by Prof. Susan Davidson and Prof. Sanjeev Khanna. During my Ph.D., I did two internships at IBM Research, Almaden, and received a Google PhD fellowship in Structured Data in 2011. I obtained my master's and bachelor's degrees in Computer Science from Indian Institute of Technology, Kanpur and Jadavpur University respectively.
Research Interests: I am broadly interested in data and information management with a focus on foundational aspects of big data analysis. My research objective is to help users with heterogeneous backgrounds and interests get the maximum benefit from the available data. My ongoing work on explanations in databases aims to help users gain deep insights into data by providing rich explanations for their questions. My work in the areas of data and workflow provenance, probabilistic databases, and crowd-sourcing probes compelling, fundamental questions that need to be answered to enable end-to-end processing and analysis of unstructured, noisy, and unreliable data in today's world while preserving its entire context.
Cynthia Rudin is a professor of computer science, electrical and computer engineering, statistical science, and biostatistics & bioinformatics at Duke University, and directs the Interpretable Machine Learning Lab. Previously, Prof. Rudin held positions at MIT, Columbia, and NYU. She holds an undergraduate degree from the University at Buffalo, and a PhD from Princeton University. She is the recipient of the 2022 Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity from the Association for the Advancement of Artificial Intelligence (AAAI). This award, comparable to world-renowned recognitions such as the Nobel Prize and the Turing Award, carries a monetary reward at the million-dollar level. She is also a three-time winner of the INFORMS Innovative Applications in Analytics Award, was named as one of the "Top 40 Under 40" by Poets and Quants in 2015, and was named by Businessinsider.com as one of the 12 most impressive professors at MIT in 2015. She is a fellow of the American Statistical Association and a fellow of the Institute of Mathematical Statistics.
She is past chair of both the INFORMS Data Mining Section and the Statistical Learning and Data Science Section of the American Statistical Association. She has also served on committees for DARPA, the National Institute of Justice, AAAI, and ACM SIGKDD. She has served on three committees for the National Academies of Sciences, Engineering and Medicine, including the Committee on Applied and Theoretical Statistics, the Committee on Law and Justice, and the Committee on Analytic Research Foundations for the Next-Generation Electric Grid. She has given keynote/invited talks at several conferences including KDD (twice), AISTATS, CODE, Machine Learning in Healthcare (MLHC), Fairness, Accountability and Transparency in Machine Learning (FAT-ML), ECML-PKDD, and the Nobel Conference. Her work has been featured in news outlets including the NY Times, Washington Post, Wall Street Journal, the Boston Globe, Businessweek, and NPR.
I am interested in theory and methodology for network analysis, causal inference, and statistical/computational tradeoffs, and in applications in the social sciences. Modern data streams frequently do not follow the traditional paradigm of n independent observations on p quantities of interest. They can include complex dependencies among the observations (e.g., interference in the study of causal effects) or among the quantities of interest (e.g., probabilities of edge formation in a network). My research is concerned with developing theory and methodological tools for approaching such modern data structures by better understanding these underlying dependence structures. My work concentrates on Kronecker covariance structures as they relate to network analysis and high-dimensional unbalanced factorial designs. I work on theory and methodology for high-dimensional data as it relates to network analysis, causal inference, and computational and statistical tradeoffs. My primary applied interest is in the health and social sciences, with past and ongoing collaborations studying friendship formation in high schools, employment outcomes for college graduates, and job mobility as a function of an underlying social network.
Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.