Statistical Science

Comment: Will Competition-Winning Methods for Causal Inference Also Succeed in Practice?

Qingyuan Zhao, Luke J. Keele, and Dylan S. Small


First, we would like to congratulate the authors for successfully hosting the causal inference data competition (referred to as Competition henceforth) and contributing a unique and thought-provoking article to the literature. The authors have provided a comprehensive and timely platform to evaluate the ever-growing number of methods used for covariate adjustment in observational studies. In our comment, we don’t generally question the results of the competition, but we do wish to emphasize several other key elements about the role statistics plays in causal inference and observational studies.

Article information

Statist. Sci., Volume 34, Number 1 (2019), 72-76.

First available in Project Euclid: 12 April 2019

Keywords: Observational studies; machine learning; study design


Zhao, Qingyuan; Keele, Luke J.; Small, Dylan S. Comment: Will Competition-Winning Methods for Causal Inference Also Succeed in Practice? Statist. Sci. 34 (2019), no. 1, 72–76. doi:10.1214/18-STS680.




See also

  • Main article: Automated versus Do-It-Yourself Methods for Causal Inference: Lessons Learned from a Data Analysis Competition.