Electronic Journal of Statistics

Estimating multiple treatment effects using two-phase semiparametric regression estimators

Cindy Yu, Jason Legg, and Bin Liu

Full-text: Open access


We propose a semiparametric two-phase regression estimator with a semiparametric generalized propensity score estimator for estimating average treatment effects in the presence of first-phase sampling. The proposed estimator extends easily to any number of treatments and does not rely on a prespecified form of the response or outcome functions. It is shown to reduce the bias found in standard estimators that ignore the first-phase sample design, and it can be more efficient than inverse propensity weighted estimators. Results from simulation studies and from an empirical study of NHANES are presented.
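To make the comparison in the abstract concrete, the following is a minimal sketch of the simpler inverse propensity weighted comparator, not of the proposed semiparametric regression estimator. It assumes that the generalized propensity scores are already known (in practice they would be estimated) and that first-phase sampling weights are available as inverse inclusion probabilities. The function name `weighted_ipw_means` and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def weighted_ipw_means(y, t, gps, design_w, n_treatments):
    """Hajek-type (ratio) IPW estimate of the mean potential outcome
    for each treatment level.

    y            : observed outcomes, shape (n,)
    t            : integer treatment labels in 0..n_treatments-1, shape (n,)
    gps          : generalized propensity scores P(T = k | x), shape (n, K)
                   (assumed known here; normally estimated)
    design_w     : first-phase sampling weights (inverse inclusion
                   probabilities), shape (n,)
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(t)
    gps = np.asarray(gps, dtype=float)
    design_w = np.asarray(design_w, dtype=float)

    means = np.empty(n_treatments)
    for k in range(n_treatments):
        mask = t == k
        # Combine the design weight with the inverse propensity weight.
        w = design_w[mask] / gps[mask, k]
        means[k] = np.sum(w * y[mask]) / np.sum(w)
    return means

# Toy data: two treatments, constant propensity 0.5, unequal design weights.
y = np.array([1.0, 2.0, 3.0, 4.0])
t = np.array([0, 0, 1, 1])
gps = np.full((4, 2), 0.5)
design_w = np.array([1.0, 3.0, 1.0, 1.0])

est = weighted_ipw_means(y, t, gps, design_w, n_treatments=2)
naive = weighted_ipw_means(y, t, gps, np.ones(4), n_treatments=2)
effect = est[1] - est[0]              # design-weighted treatment contrast
naive_effect = naive[1] - naive[0]    # contrast ignoring the first phase
```

On this toy data the design-weighted treatment means are 1.75 and 3.5, while the version that ignores the first-phase weights gives 1.5 and 3.5. The estimated contrast therefore changes from 2.0 to 1.75 once the sampling design is taken into account.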

Article information

Electron. J. Statist., Volume 7 (2013), 2737-2761.

First available in Project Euclid: 18 November 2013

Permanent link to this document

https://projecteuclid.org/euclid.ejs/1384783409

Digital Object Identifier

doi:10.1214/13-EJS856

Keywords

Propensity score; semiparametric; treatment effects; two-phase regression estimator


Yu, Cindy; Legg, Jason; Liu, Bin. Estimating multiple treatment effects using two-phase semiparametric regression estimators. Electron. J. Statist. 7 (2013), 2737–2761. doi:10.1214/13-EJS856. https://projecteuclid.org/euclid.ejs/1384783409


