The Annals of Statistics

On the definition of a confounder

Tyler J. VanderWeele and Ilya Shpitser


The causal inference literature has provided a clear formal definition of confounding expressed in terms of counterfactual independence. The literature has not, however, come to any consensus on a formal definition of a confounder, as it has given priority to the concept of confounding over that of a confounder. We consider a number of candidate definitions arising from various more informal statements made in the literature. We consider the properties satisfied by each candidate definition, principally focusing on (i) whether, under the candidate definition, control for all “confounders” suffices to control for “confounding” and (ii) whether each confounder in some context helps eliminate or reduce confounding bias. Several of the candidate definitions do not have these two properties. Only one candidate definition of those considered satisfies both properties. We propose that a “confounder” be defined as a pre-exposure covariate $C$ for which there exists a set of other covariates $X$ such that the effect of the exposure on the outcome is unconfounded conditional on $(X,C)$ but such that for no proper subset of $(X,C)$ is the effect of the exposure on the outcome unconfounded given the subset. We also provide a conditional analogue of the above definition, and we propose that a variable that helps reduce but not eliminate bias be referred to as a “surrogate confounder.” These definitions are closely related to those given by Robins and Morgenstern [Comput. Math. Appl. 14 (1987) 869–916]. The implications that hold among the various candidate definitions are discussed.
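
The proposed definition can be written out symbolically. The following is a sketch only, under notation that the abstract itself does not fix and that is assumed here for illustration: $A$ denotes the exposure, $Y$ the outcome, $Y_a$ the counterfactual outcome under exposure level $a$, and "unconfounded conditional on $S$" is read as the counterfactual independence of $Y_a$ and $A$ given $S$ for every $a$. The set $X$ is taken here to allow the empty set, which is also an assumption.

```latex
% Assumed notation: A = exposure, Y = outcome, Y_a = counterfactual outcome
% under A = a, and S = a set of pre-exposure covariates.

% Unconfounded conditional on S:
\[
  Y_a \perp\!\!\!\perp A \mid S \quad \text{for all } a .
\]

% Proposed definition: a pre-exposure covariate C is a confounder if there
% exists a set X of other pre-exposure covariates (assumed possibly empty)
% such that
\[
  Y_a \perp\!\!\!\perp A \mid (X, C) \quad \text{for all } a ,
\]
% and (X, C) is minimal in the sense that no proper subset suffices:
\[
  \text{there is no } T \subsetneq (X, C) \text{ with }
  Y_a \perp\!\!\!\perp A \mid T \text{ for all } a .
\]
```

Read this way, the first display expresses sufficiency of $(X,C)$ for confounding control and the last expresses minimality. Together they correspond to the "minimal sufficiency" listed among the article's keywords.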

Article information

Ann. Statist. Volume 41, Number 1 (2013), 196–220.

First available in Project Euclid: 26 March 2013

Primary: 62A01: Foundations and philosophical topics
Secondary: 68T30: Knowledge representation; 62J99: None of the above, but in this section

Causal inference; causal diagrams; counterfactual; confounder; minimal sufficiency


VanderWeele, Tyler J.; Shpitser, Ilya. On the definition of a confounder. Ann. Statist. 41 (2013), no. 1, 196–220. doi:10.1214/12-AOS1058.


  • Barnow, B. S., Cain, G. G. and Goldberger, A. S. (1980). Issues in the analysis of selectivity bias. In Evaluation Studies (E. Stromsdorfer and G. Farkas, eds.) 5. Sage, San Francisco.
  • Breslow, N. E. and Day, N. E. (1980). Statistical Methods in Cancer Research, Vol. 1: The Analysis of Case–Control Studies. International Agency for Research on Cancer, Lyon, France.
  • Cox, D. R. (1958). Planning of Experiments. Wiley, New York.
  • Dawid, A. P. (2002). Influence diagrams for causal modeling and inference. Int. Statist. Rev. 70 161–189.
  • Geng, Z., Guo, J. and Fung, W.-K. (2002). Criteria for confounders in epidemiological studies. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 3–15.
  • Geng, Z. and Li, G. (2002). Conditions for non-confounding and collapsibility without knowledge of completely constructed causal diagrams. Scand. J. Stat. 29 169–181.
  • Geng, Z., Guo, J., Lau, T. S. and Fung, W.-K. (2001). Confounding, homogeneity and collapsibility for causal effects in epidemiologic studies. Statist. Sinica 11 63–75.
  • Glymour, M. M. and Greenland, S. (2008). Causal diagrams. In Modern Epidemiology, 3rd ed. (K. J. Rothman, S. Greenland and T. L. Lash, eds.) 12. Lippincott Williams and Wilkins, Philadelphia, PA.
  • Greenland, S. (2003). Quantifying biases in causal models: Classical confounding versus collider-stratification bias. Epidemiology 14 300–306.
  • Greenland, S. and Morgenstern, H. (2001). Confounding in health research. Annual Rev. Public Health 22 189–212.
  • Greenland, S., Pearl, J. and Robins, J. M. (1999). Causal diagrams for epidemiologic research. Epidemiology 10 37–48.
  • Greenland, S. and Pearl, J. (2007). Causal diagrams. In Encyclopedia of Epidemiology (S. Boslaugh, ed.) 149–156. Sage, Thousand Oaks, CA.
  • Greenland, S. and Pearl, J. (2011). Adjustments and their consequences—collapsibility analysis using graphical models. International Statistical Review 79 401–426.
  • Greenland, S. and Robins, J. M. (1986). Identifiability, exchangeability, and epidemiological confounding. Int. J. Epidemiol. 15 413–419.
  • Greenland, S., Robins, J. M. and Pearl, J. (1999). Confounding and collapsibility in causal inference. Statist. Sci. 14 29–46.
  • Greenland, S. and Robins, J. M. (2009). Identifiability, exchangeability and confounding revisited. Epidemiol. Perspect. Innov. 6 4.
  • Hernán, M. A. (2008). Confounding. In Encyclopedia of Quantitative Risk Assessment and Analysis (B. Everitt and E. Melnick, eds.) 353–362. Wiley, Chichester, UK.
  • Hernán, M. A., Hernández-Díaz, S., Werler, M. M. and Mitchell, A. A. (2002). Causal knowledge as a prerequisite for confounding evaluation: An application to birth defects epidemiology. American Journal of Epidemiology 155 176–184.
  • Imbens, G. W. (2004). Nonparametric estimation of average treatment effects under exogeneity: A review. Rev. Econom. Statist. 86 4–29.
  • Kleinbaum, D. G., Kupper, L. L. and Morgenstern, H. (1982). Epidemiologic Research: Principles and Quantitative Methods. Lifetime Learning Publications [Wadsworth], Belmont, CA.
  • Lauritzen, S. L. (1996). Graphical Models. Oxford Univ. Press, New York.
  • Miettinen, O. S. (1974). Confounding and effect modification. Am. J. Epidemiol. 100 350–353.
  • Miettinen, O. S. (1976). Stratification by a multivariate confounder score. Am. J. Epidemiol. 104 609–620.
  • Miettinen, O. S. and Cook, E. F. (1981). Confounding: Essence and detection. Am. J. Epidemiol. 114 593–603.
  • Morabia, A. (2011). History of the modern epidemiological concept of confounding. J. Epidemiol. Community Health 65 297–300.
  • Neyman, J. (1923). Sur les applications de la théorie des probabilités aux expériences agricoles: Essai des principes. Excerpts reprinted (1990) in English (D. Dabrowska and T. Speed, trans.). Statist. Sci. 5 463–472.
  • Ogburn, E. L. and VanderWeele, T. J. (2012). On the nondifferential misclassification of a binary confounder. Epidemiology 23 433–439.
  • Pearl, J. (1995). Causal diagrams for empirical research. Biometrika 82 669–710.
  • Pearl, J. (2009). Causality: Models, Reasoning, and Inference, 2nd ed. Cambridge Univ. Press, Cambridge.
  • Robins, J. (1992). Estimation of the time-dependent accelerated failure time model in the presence of confounding factors. Biometrika 79 321–334.
  • Robins, J. M. (1997). Causal inference from complex longitudinal data. In Latent Variable Modeling and Applications to Causality (Los Angeles, CA, 1994) (M. Berkane, ed.). Lecture Notes in Statistics 120 69–117. Springer, New York.
  • Robins, J. M. and Greenland, S. (1986). The role of model selection in causal inference from nonexperimental data. Am. J. Epidemiol. 123 392–402.
  • Robins, J. M. and Morgenstern, H. (1987). The foundations of confounding in epidemiology. Comput. Math. Appl. 14 869–916.
  • Robins, J. M. and Richardson, T. S. (2010). Alternative graphical causal models and the identification of direct effects. In Causality and Psychopathology: Finding the Determinants of Disorders and Their Cures (P. E. Shrout, K. M. Keyes and K. Ornstein, eds.) 103–158. Oxford Univ. Press, New York.
  • Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika 70 41–55.
  • Rubin, D. B. (1978). Bayesian inference for causal effects: The role of randomization. Ann. Statist. 6 34–58.
  • Rubin, D. B. (1990). Formal modes of statistical inference for causal effects. J. Statist. Plann. Inference 25 279–292.
  • Shpitser, I., VanderWeele, T. J. and Robins, J. M. (2010). On the validity of covariate adjustment for estimating causal effects. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence 527–536. AUAI Press, Corvallis, OR.
  • Spirtes, P., Glymour, C. and Scheines, R. (1993). Causation, Prediction, and Search. Lecture Notes in Statistics 81. Springer, New York.
  • VanderWeele, T. J. (2012). Confounding and effect modification: Distribution and measure. Epidemiologic Methods 1 55–82.
  • VanderWeele, T. J. and Shpitser, I. (2011). A new criterion for confounder selection. Biometrics 67 1406–1413.