For obtaining causal inferences that are objective, and therefore
have the best chance of revealing scientific truths, carefully
designed and executed randomized experiments are generally
considered to be the gold standard. Observational studies, in
contrast, are generally fraught with problems that compromise
any claim for objectivity of the resulting causal inferences.
The thesis here is that observational studies have to be
carefully designed to approximate randomized experiments, in
particular, without examining any final outcome data. Often a
candidate data set will have to be rejected as inadequate
because of lack of data on key covariates, or because of lack of
overlap in the distributions of key covariates between treatment
and control groups, often revealed by careful propensity score
analyses. Sometimes the template for the approximating
randomized experiment will have to be altered, and the use of
principal stratification can be helpful in doing this. These
issues are discussed and illustrated using the framework of
potential outcomes to define causal effects, which greatly
clarifies critical issues.
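As a rough illustration of the outcome-free design step described above, the following sketch estimates propensity scores from covariates and the treatment indicator alone, then reports how many units fall outside the region where treated and control scores overlap. Everything in it is an assumption made for illustration: the data are simulated, the function names `fit_propensity` and `overlap_report` are hypothetical, and the plain logistic regression fitted by Newton's method stands in for whatever propensity model a real study would justify. No outcome variable appears anywhere in the code.

```python
import numpy as np

def fit_propensity(X, w, iters=50, ridge=1e-6):
    """Estimate propensity scores by logistic regression (Newton's method).

    X: (n, k) covariates; w: (n,) treatment indicator in {0, 1}.
    The tiny ridge term only keeps the Hessian numerically invertible.
    """
    Xd = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        grad = Xd.T @ (w - p) - ridge * beta
        hess = (Xd * (p * (1 - p))[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        beta += np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-Xd @ beta))

def overlap_report(e, w):
    """Summarize overlap of estimated propensity scores between groups."""
    lo = max(e[w == 1].min(), e[w == 0].min())
    hi = min(e[w == 1].max(), e[w == 0].max())
    inside = (e >= lo) & (e <= hi)
    return {
        "common_support": (float(lo), float(hi)),
        "treated_outside": int(np.sum(~inside & (w == 1))),
        "control_outside": int(np.sum(~inside & (w == 0))),
    }

# Simulated covariates and treatment assignment (illustrative only).
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 2))
true_e = 1.0 / (1.0 + np.exp(-(0.8 * X[:, 0] - 0.5 * X[:, 1])))
w = rng.binomial(1, true_e)

e_hat = fit_propensity(X, w)
report = overlap_report(e_hat, w)
print(report)
```

If many units fall outside the common-support interval, that is one sign that the candidate data set may not support the intended comparison, or that the target population needs to be restricted before any outcome data are examined.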