Annals of Applied Statistics

Propensity score weighting for causal inference with multiple treatments

Fan Li and Fan Li

Abstract

Causal or unconfounded descriptive comparisons between multiple groups are common in observational studies. Motivated by a racial disparity study in health services research, we propose a unified propensity score weighting framework, the balancing weights, for estimating causal effects with multiple treatments. These weights incorporate the generalized propensity scores to balance the weighted covariate distribution of each treatment group, all weighted toward a common prespecified target population. The class of balancing weights includes several existing approaches, such as the inverse probability weights and trimming weights, as special cases. Within this framework, we propose a set of target estimands based on linear contrasts. We further develop the generalized overlap weights, constructed as the product of the inverse probability weights and the harmonic mean of the generalized propensity scores. The generalized overlap weighting scheme corresponds to the target population with the most overlap in covariates across the multiple treatments. These weights are bounded and thus bypass the problem of extreme propensities. We show that the generalized overlap weights minimize the total asymptotic variance of the moment weighting estimators for the pairwise contrasts within the class of balancing weights. We consider two balance check criteria and propose a new sandwich variance estimator for estimating the causal effects with generalized overlap weights. We apply these methods to study racial disparities in medical expenditure among several racial groups using the 2009 Medical Expenditure Panel Survey (MEPS) data. We also carry out simulations to compare the proposed method with existing methods.
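As a rough illustration of the construction described in the abstract, the sketch below computes generalized overlap weights for a handful of units and uses them in a normalized weighted mean for each treatment group. It is not the authors' implementation. The function names, the toy propensity scores, and the outcomes are all made up for illustration, and the tilting function is taken here as the reciprocal of the sum of inverse propensities, which equals the harmonic mean up to the constant factor J (the number of treatments). That constant cancels in a normalized estimator.

```python
# Illustrative sketch only: names and data are hypothetical, not from the paper.

def generalized_overlap_weights(propensities, treatments):
    """Weight for each unit: (1 / e_z(x)) * h(x), with h(x) = 1 / sum_k (1 / e_k(x)).

    propensities: list of rows, one per unit, each row holding the J
                  generalized propensity scores (assumed to sum to 1).
    treatments:   list of observed treatment labels in 0..J-1.
    """
    weights = []
    for scores, z in zip(propensities, treatments):
        if any(p <= 0.0 for p in scores):
            raise ValueError("propensity scores must be strictly positive")
        # Tilting function, proportional to the harmonic mean of the scores.
        tilting = 1.0 / sum(1.0 / p for p in scores)
        # Inverse probability weight for the observed treatment, times tilting.
        weights.append(tilting / scores[z])
    return weights


def weighted_group_means(outcomes, treatments, weights, n_groups):
    """Normalized weighted mean of the outcome within each treatment group.

    Assumes every group has at least one unit; otherwise the division fails.
    """
    totals = [0.0] * n_groups
    norms = [0.0] * n_groups
    for y, z, w in zip(outcomes, treatments, weights):
        totals[z] += w * y
        norms[z] += w
    return [t / n for t, n in zip(totals, norms)]


# Toy data: six units, three treatment groups (all values invented).
propensities = [
    [0.98, 0.01, 0.01],  # unit with an extreme propensity profile
    [0.30, 0.30, 0.40],
    [0.10, 0.60, 0.30],
    [0.50, 0.25, 0.25],
    [0.20, 0.20, 0.60],
    [0.35, 0.45, 0.20],
]
treatments = [1, 0, 2, 0, 2, 1]
outcomes = [4.0, 2.5, 3.0, 2.0, 3.5, 4.5]

weights = generalized_overlap_weights(propensities, treatments)
inverse_probability_weights = [1.0 / row[z] for row, z in zip(propensities, treatments)]
means = weighted_group_means(outcomes, treatments, weights, n_groups=3)

# One pairwise contrast between group 0 and group 1 on the overlap population.
contrast_0_vs_1 = means[0] - means[1]
```

For the first unit, the inverse probability weight is 100 because its observed treatment has propensity 0.01, whereas its generalized overlap weight stays below 1. This is the boundedness property the abstract refers to: each weight is a ratio of one inverse propensity to the sum of all inverse propensities.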

Article information

Source
Ann. Appl. Stat., Volume 13, Number 4 (2019), 2389-2415.

Dates
Received: September 2018
Revised: June 2019
First available in Project Euclid: 28 November 2019

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1574910049

Digital Object Identifier
doi:10.1214/19-AOAS1282

Mathematical Reviews number (MathSciNet)
MR4037435

Zentralblatt MATH identifier
07160944

Keywords
Balancing weights; generalized propensity score; generalized overlap weights; health services research; pairwise comparison; racial disparity

Citation

Li, Fan; Li, Fan. Propensity score weighting for causal inference with multiple treatments. Ann. Appl. Stat. 13 (2019), no. 4, 2389–2415. doi:10.1214/19-AOAS1282. https://projecteuclid.org/euclid.aoas/1574910049


Supplemental materials

  • Supplement to “Propensity score weighting for causal inference with multiple treatments”. Supplement A: On Transitivity. We provide a detailed discussion on transitivity of the target estimands for pairwise comparisons. Supplement B: Proof of Propositions. We present detailed proofs of Propositions 1 to 3 in Section 2.3. Supplement C: Proof of Theorem 1. We provide the derivation and related discussions of the variance estimator for the generalized overlap weighting. Supplement D: Additional Simulation Results. We present additional figures and numerical results for the simulation study in Section 5.