Statistical Science

External Validity: From Do-Calculus to Transportability Across Populations

Judea Pearl and Elias Bareinboim

Full-text: Open access


The generalizability of empirical findings to new environments, settings or populations, often called “external validity,” is essential in most scientific explorations. This paper treats a particular problem of generalizability, called “transportability,” defined as a license to transfer causal effects learned in experimental studies to a new population, in which only observational studies can be conducted. We introduce a formal representation called “selection diagrams” for expressing knowledge about differences and commonalities between populations of interest and, using this representation, we reduce questions of transportability to symbolic derivations in the do-calculus. This reduction yields graph-based procedures for deciding, prior to observing any data, whether causal effects in the target population can be inferred from experimental findings in the study population. When the answer is affirmative, the procedures identify what experimental and observational findings need be obtained from the two populations, and how they can be combined to ensure bias-free transport.
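As a minimal sketch of how experimental and observational findings might be combined, consider the simplest special case. Assume the two populations differ only in the distribution of a pre-treatment covariate Z (age group, say), and that the Z-specific causal effects are the same in both. Under that assumption the effect in the target population is the study's stratum-specific effects reweighted by the target's covariate distribution: P*(y | do(x)) = sum over z of P(y | do(x), z) P*(z). The function name, stratum labels and numbers below are hypothetical and chosen only for illustration; the abstract itself does not specify them.

```python
def transport_effect(effect_by_stratum, target_dist, tol=1e-9):
    """Reweight stratum-specific experimental effects by a target distribution.

    effect_by_stratum: dict mapping z -> P(y | do(x), z), estimated from the
        experiment in the study population.
    target_dist: dict mapping z -> P*(z), estimated from observational data
        in the target population.

    Assumption (not checked here): Z accounts for every relevant difference
    between the two populations, so the z-specific effects carry over.
    """
    if set(effect_by_stratum) != set(target_dist):
        raise ValueError("both inputs must cover the same strata")
    if abs(sum(target_dist.values()) - 1.0) > tol:
        raise ValueError("target distribution must sum to 1")
    return sum(effect_by_stratum[z] * target_dist[z] for z in effect_by_stratum)


# Hypothetical inputs: the effect is stronger among older subjects, and the
# target population is older than the study population.
study_effect = {"young": 0.30, "old": 0.60}   # P(y | do(x), z) from the experiment
target_age = {"young": 0.25, "old": 0.75}     # P*(z) observed in the target

transported = transport_effect(study_effect, target_age)
print(round(transported, 3))
```

Whether such a reweighting is licensed in a given problem is exactly what the graphical procedures described above are meant to decide; the sketch simply takes the license for granted.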

Article information

Statist. Sci., Volume 29, Number 4 (2014), 579-595.

First available in Project Euclid: 15 January 2015


Keywords: Experimental design, generalizability, causal effects, external validity


Pearl, Judea; Bareinboim, Elias. External Validity: From Do-Calculus to Transportability Across Populations. Statist. Sci. 29 (2014), no. 4, 579–595. doi:10.1214/14-STS486.


