Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 7, Number 1 (2013), 443-470.
Estimating treatment effect heterogeneity in randomized program evaluation
Abstract
When evaluating the efficacy of social programs and medical treatments using randomized experiments, the estimated overall average causal effect alone is often of limited value and the researchers must investigate when the treatments do and do not work. Indeed, the estimation of treatment effect heterogeneity plays an essential role in (1) selecting the most effective treatment from a large number of available treatments, (2) ascertaining subpopulations for which a treatment is effective or harmful, (3) designing individualized optimal treatment regimes, (4) testing for the existence or lack of heterogeneous treatment effects, and (5) generalizing causal effect estimates obtained from an experimental sample to a target population. In this paper, we formulate the estimation of heterogeneous treatment effects as a variable selection problem. We propose a method that adapts the Support Vector Machine classifier by placing separate sparsity constraints over the pre-treatment parameters and causal heterogeneity parameters of interest. The proposed method is motivated by and applied to two well-known randomized evaluation studies in the social sciences. Our method selects the most effective voter mobilization strategies from a large number of alternative strategies, and it also identifies the characteristics of workers who greatly benefit from (or are negatively affected by) a job training program. In our simulation studies, we find that the proposed method often outperforms some commonly used alternatives.
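To make the variable selection formulation concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a squared hinge loss (an L2-SVM variant), a binary outcome recoded to -1/+1, a single binary treatment, and a plain proximal gradient solver. The function name `fit_two_penalty_svm`, the penalty names `lam_pre` and `lam_het`, and the choice to group the treatment indicator with its covariate interactions are all hypothetical. The sketch illustrates only the core idea stated in the abstract: one sparsity penalty on the pre-treatment parameters and a separate one on the causal heterogeneity parameters.

```python
import numpy as np


def soft_threshold(x, t):
    """Proximal operator of the L1 norm (elementwise soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def fit_two_penalty_svm(X, T, y, lam_pre, lam_het, n_iter=2000):
    """Squared-hinge SVM with two separate L1 penalties (illustrative sketch).

    X : (n, p) pre-treatment covariates
    T : (n,) binary treatment indicator coded 0/1
    y : (n,) outcome coded -1/+1
    lam_pre : L1 penalty on pre-treatment (main effect) coefficients
    lam_het : L1 penalty on treatment and treatment-by-covariate coefficients
    Returns (intercept, pre-treatment coefficients, heterogeneity coefficients).
    """
    n, p = X.shape
    # Heterogeneity block: treatment indicator plus its interactions with X
    # (this grouping is an assumption of the sketch).
    Z = np.column_stack([T, X * T[:, None]])
    D = np.column_stack([np.ones(n), X, Z])
    # Per-coefficient penalty vector; the intercept is left unpenalized.
    lam = np.concatenate([[0.0], np.full(p, lam_pre), np.full(p + 1, lam_het)])
    # The gradient of the mean squared hinge loss is Lipschitz with
    # constant at most 2 * sigma_max(D)^2 / n, which fixes the step size.
    step = n / (2.0 * np.linalg.norm(D, 2) ** 2)
    w = np.zeros(D.shape[1])
    for _ in range(n_iter):
        slack = np.maximum(1.0 - y * (D @ w), 0.0)  # hinge part, zero if margin met
        grad = -2.0 * (D.T @ (y * slack)) / n       # gradient of mean squared hinge
        w = soft_threshold(w - step * grad, step * lam)
    return w[0], w[1:1 + p], w[1 + p:]
```

Raising `lam_het` relative to `lam_pre` drives more of the heterogeneity coefficients to exactly zero while leaving the pre-treatment block less constrained. How the two penalties would be tuned is outside the scope of this sketch.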
Article information
Source
Ann. Appl. Stat., Volume 7, Number 1 (2013), 443-470.
Dates
First available in Project Euclid: 9 April 2013
Permanent link to this document
https://projecteuclid.org/euclid.aoas/1365527206
Digital Object Identifier
doi:10.1214/12-AOAS593
Mathematical Reviews number (MathSciNet)
MR3086426
Zentralblatt MATH identifier
1376.62036
Keywords
Causal inference; individualized treatment rules; LASSO; moderation; variable selection
Citation
Imai, Kosuke; Ratkovic, Marc. Estimating treatment effect heterogeneity in randomized program evaluation. Ann. Appl. Stat. 7 (2013), no. 1, 443–470. doi:10.1214/12-AOAS593. https://projecteuclid.org/euclid.aoas/1365527206

