The Annals of Applied Statistics

For objective causal inference, design trumps analysis

Donald B. Rubin
Source: Ann. Appl. Stat. Volume 2, Number 3 (2008), 808-840.

Abstract

For obtaining causal inferences that are objective, and therefore have the best chance of revealing scientific truths, carefully designed and executed randomized experiments are generally considered to be the gold standard. Observational studies, in contrast, are generally fraught with problems that compromise any claim for objectivity of the resulting causal inferences. The thesis here is that observational studies have to be carefully designed to approximate randomized experiments, in particular, without examining any final outcome data. Often a candidate data set will have to be rejected as inadequate because of lack of data on key covariates, or because of lack of overlap in the distributions of key covariates between treatment and control groups, often revealed by careful propensity score analyses. Sometimes the template for the approximating randomized experiment will have to be altered, and the use of principal stratification can be helpful in doing this. These issues are discussed and illustrated using the framework of potential outcomes to define causal effects, which greatly clarifies critical issues.
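
The overlap check mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's procedure: the propensity scores below are made-up values, and the function names `common_support` and `units_off_support` are hypothetical. The sketch assumes the scores have already been estimated, and it uses only treatment indicators and scores, never outcome data.

```python
from typing import List, Sequence, Tuple


def common_support(scores: Sequence[float],
                   treated: Sequence[bool]) -> Tuple[float, float]:
    """Return the score interval bounded by both groups' ranges."""
    t = [s for s, w in zip(scores, treated) if w]
    c = [s for s, w in zip(scores, treated) if not w]
    if not t or not c:
        raise ValueError("need at least one treated and one control unit")
    low = max(min(t), min(c))    # larger of the two group minima
    high = min(max(t), max(c))   # smaller of the two group maxima
    return low, high


def units_off_support(scores: Sequence[float],
                      treated: Sequence[bool]) -> List[int]:
    """Return indices of units whose score lies outside the common interval."""
    low, high = common_support(scores, treated)
    return [i for i, s in enumerate(scores) if s < low or s > high]


# Hypothetical estimated propensity scores (illustrative values only).
scores = [0.05, 0.20, 0.35, 0.50, 0.40, 0.60, 0.80, 0.95]
treated = [False, False, False, False, True, True, True, True]

low, high = common_support(scores, treated)
off = units_off_support(scores, treated)
print(f"common support: [{low}, {high}]; units outside it: {off}")
```

In this toy example most units fall outside the common interval, which is the kind of situation in which a candidate data set might be judged inadequate for approximating a randomized experiment.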

Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aoas/1223908042
Digital Object Identifier: doi:10.1214/08-AOAS187
Zentralblatt MATH identifier: 1149.62089
Mathematical Reviews number (MathSciNet): MR2516795

2013 © Institute of Mathematical Statistics
