The causal inference literature has provided a clear formal definition of confounding expressed in terms of counterfactual independence. The literature has not, however, come to any consensus on a formal definition of a confounder, as it has given priority to the concept of confounding over that of a confounder. We consider a number of candidate definitions arising from various more informal statements made in the literature. For each candidate definition we examine the properties it satisfies, focusing principally on (i) whether, under the candidate definition, control for all “confounders” suffices to control for “confounding” and (ii) whether each confounder in some context helps eliminate or reduce confounding bias. Several of the candidate definitions do not have these two properties. Only one candidate definition of those considered satisfies both properties. We propose that a “confounder” be defined as a pre-exposure covariate $C$ for which there exists a set of other covariates $X$ such that the effect of the exposure on the outcome is unconfounded conditional on $(X,C)$ but such that for no proper subset of $(X,C)$ is the effect of the exposure on the outcome unconfounded given the subset. We also provide a conditional analogue of the above definition, and we propose that a variable that helps reduce bias but does not eliminate it be referred to as a “surrogate confounder.” These definitions are closely related to those given by Robins and Morgenstern [Comput. Math. Appl. 14 (1987) 869–916]. The implications that hold among the various candidate definitions are discussed.
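As a sketch of how the proposed definition might be written symbolically, suppose $A$ denotes the exposure and $Y_a$ the counterfactual outcome under exposure level $a$. Both symbols are notation assumed here for illustration and do not appear in the abstract. Reading “unconfounded conditional on $S$” as the counterfactual independence $Y_a \perp\!\!\!\perp A \mid S$ for all $a$, a pre-exposure covariate $C$ would be a confounder if there exists a set of other covariates $X$ such that

$$
Y_a \perp\!\!\!\perp A \mid (X,C) \ \text{ for all } a,
\qquad\text{and}\qquad
\text{for every proper subset } T \subsetneq (X,C):\quad Y_a \not\!\perp\!\!\!\perp A \mid T \ \text{ for some } a.
$$

The first condition says that $(X,C)$ suffices to control for confounding. The second says that $(X,C)$ is minimal: dropping any of its elements, $C$ included, leaves a set that no longer suffices.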
"On the definition of a confounder." Ann. Statist. 41 (1) 196–220, February 2013. https://doi.org/10.1214/12-AOS1058