In causal inference, principal stratification is a framework for dealing with a posttreatment intermediate variable between a treatment and an outcome. In this framework, the principal strata are defined by the joint potential values of the intermediate variable. Because the principal strata are not fully observable, the causal effects within them, also known as the principal causal effects, are not identifiable without additional assumptions. Several previous empirical studies have leveraged auxiliary variables to improve inference on principal causal effects. We establish a general theory for the identification and estimation of principal causal effects with auxiliary variables, which provides a solid foundation for statistical inference and further insight for model building in empirical research. In particular, we consider two commonly used assumptions for principal stratification problems: principal ignorability, and the conditional independence between the auxiliary variable and the outcome given principal strata and covariates. Under each assumption, we give nonparametric and semiparametric identification results without modeling the outcome. When neither assumption is plausible, we propose a large class of flexible parametric and semiparametric models for identifying principal causal effects. Our theory not only establishes formal identification results for several models that have been used in previous empirical studies but also generalizes them to allow for different types of outcomes and intermediate variables.
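As a minimal sketch of the setup described above, the principal strata and principal causal effects can be written as follows. The symbols are assumptions introduced here for illustration and do not appear in the abstract: Z is a binary treatment, S the intermediate variable, Y the outcome, and S(z), Y(z) their potential values under treatment level z.

```latex
% Assumed notation (not given in the abstract):
%   Z in {0,1}: treatment;  S: intermediate variable;  Y: outcome;
%   S(z), Y(z): potential values under treatment level z.
\begin{align*}
  U &= \bigl(S(1),\, S(0)\bigr)
    && \text{principal stratum: joint potential values of } S \\
  \tau_u &= E\bigl[\, Y(1) - Y(0) \mid U = u \,\bigr]
    && \text{principal causal effect within stratum } u \\
  S &= Z\,S(1) + (1 - Z)\,S(0)
    && \text{only one component of } U \text{ is observed per unit}
\end{align*}
```

For a binary S, for instance, U takes the four values (1,1), (1,0), (0,1), and (0,0). A unit observed with Z = 1 and S = 1 could belong to either (1,1) or (1,0), which is why the observed data alone do not determine the stratum-specific effects.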
"Identification of Causal Effects Within Principal Strata Using Auxiliary Variables." Statist. Sci. 36 (4) 493-508, November 2021. https://doi.org/10.1214/20-STS810