This paper studies the robustness of estimated policy effects to changes in the distribution of covariates, a key exercise for evaluating the external validity of (quasi-)experimental results. I propose a novel scalar robustness metric: the magnitude of the smallest covariate shift needed to invalidate a claim about the policy effect (say, that the average treatment effect (ATE) is greater than 0) supported by the (quasi-)experimental evidence. I estimate the metric using de-biased GMM, which guarantees a parametric convergence rate while allowing for machine-learning-based estimators of policy effect heterogeneity (including LASSO, random forests, boosting, and neural nets). I apply the procedure to study the robustness of policy effect estimates for health-care utilization and financial strain outcomes in the Oregon Health Insurance Experiment. I find that, among all outcomes, the effect of the insurance policy on outpatient visits is the most robust to shifts in the distribution of context-specific covariates.
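To make the idea of a "smallest covariate shift" concrete, the following is a minimal plug-in sketch and not the de-biased GMM estimator described above. It rests on assumptions that the abstract does not state: the size of a shift is measured by the Kullback-Leibler divergence from the observed covariate distribution, the shifted distribution is restricted to reweightings of the observed sample, and the conditional effect estimates `tau` are taken as given. The function name `smallest_kl_shift` and its parameters are hypothetical. Under these assumptions the least-divergent reweighting that moves the average effect to the threshold is an exponential tilt of the sample.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import logsumexp


def smallest_kl_shift(tau, threshold=0.0):
    """Plug-in sketch: smallest KL reweighting of the sample that moves the
    average effect down to `threshold` (assumed claim: ATE > threshold).

    tau: conditional effect estimates, one per observation (assumed given).
    Returns (kl, weights); kl is np.inf if no reweighting of the observed
    sample can reach the threshold.
    """
    d = np.asarray(tau, dtype=float) - threshold
    n = d.size
    if d.mean() <= 0:
        raise ValueError("the sample does not support the claim ATE > threshold")
    if d.min() >= 0:
        # Every unit has an effect at or above the threshold: no finite tilt works.
        return np.inf, None

    def tilted_weights(lam):
        z = lam * d
        z -= z.max()  # stabilize the exponentials
        w = np.exp(z)
        return w / w.sum()

    def tilted_mean(lam):
        return float(tilted_weights(lam) @ d)

    # The tilted mean is increasing in lam and positive at lam = 0,
    # so the root lies at some negative lam; expand the bracket until found.
    lo = -1.0
    while tilted_mean(lo) > 0:
        lo *= 2.0
    lam = brentq(tilted_mean, lo, 0.0)

    # At the root the tilted mean is zero, so
    # KL(w || uniform) = -log(mean(exp(lam * d))).
    kl = -(logsumexp(lam * d) - np.log(n))
    return float(kl), tilted_weights(lam)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tau_hat = rng.normal(loc=0.5, scale=1.0, size=1000)  # synthetic effects
    kl, _ = smallest_kl_shift(tau_hat)
    print(f"smallest KL shift that sets the ATE to 0: {kl:.4f}")
```

A larger returned value means that a larger departure from the observed covariate distribution is needed before the claim fails, which is the sense in which one outcome can be called more robust than another.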