The Annals of Statistics

Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy

Abstract

We consider challenges that arise in the estimation of the mean outcome under an optimal individualized treatment strategy defined as the treatment rule that maximizes the population mean outcome, where the candidate treatment rules are restricted to depend on baseline covariates. We prove a necessary and sufficient condition for the pathwise differentiability of the optimal value, a key condition needed to develop a regular and asymptotically linear (RAL) estimator of the optimal value. The stated condition is slightly more general than the previous condition implied in the literature. We then describe an approach to obtain root-$n$ rate confidence intervals for the optimal value even when the parameter is not pathwise differentiable. We provide conditions under which our estimator is RAL and asymptotically efficient when the mean outcome is pathwise differentiable. We also outline an extension of our approach to a multiple time point problem. All of our results are supported by simulations.

Article information

Source
Ann. Statist., Volume 44, Number 2 (2016), 713-742.

Dates
Revised: September 2015
First available in Project Euclid: 17 March 2016

https://projecteuclid.org/euclid.aos/1458245733

Digital Object Identifier
doi:10.1214/15-AOS1384

Mathematical Reviews number (MathSciNet)
MR3476615

Zentralblatt MATH identifier
1338.62089

Subjects
Primary: 62G05: Estimation
Secondary: 62N99: None of the above, but in this section

Citation

Luedtke, Alexander R.; van der Laan, Mark J. Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy. Ann. Statist. 44 (2016), no. 2, 713–742. doi:10.1214/15-AOS1384. https://projecteuclid.org/euclid.aos/1458245733

References

• Athreya, K. B. and Lahiri, S. N. (2006). Measure Theory and Probability Theory. Springer, New York.
• Audibert, J.-Y. and Tsybakov, A. B. (2007). Fast learning rates for plug-in classifiers. Ann. Statist. 35 608–633.
• Bickel, P. J., Klaassen, C. A. J., Ritov, Y. and Wellner, J. A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins Univ. Press, Baltimore, MD.
• Chakraborty, B., Laber, E. B. and Zhao, Y.-Q. (2014). Inference about the expected performance of a data-driven dynamic treatment regime. Clin. Trials 11 408–417.
• Chakraborty, B. and Moodie, E. E. M. (2013). Statistical Methods for Dynamic Treatment Regimes. Springer, New York.
• Chen, J. (2004). Notes on the bias–variance trade-off phenomenon. In A Festschrift for Herman Rubin. Institute of Mathematical Statistics Lecture Notes—Monograph Series 45 207–217. IMS, Beachwood, OH.
• Gaenssler, P., Strobel, J. and Stute, W. (1978). On central limit theorems for martingale triangular arrays. Acta Math. Acad. Sci. Hungar. 31 205–216.
• Goldberg, Y., Song, R., Zeng, D. and Kosorok, M. R. (2014). Comment on “Dynamic treatment regimes: Technical challenges and applications” [MR3263118]. Electron. J. Stat. 8 1290–1300.
• Hirano, K. and Porter, J. R. (2012). Impossibility results for nondifferentiable functionals. Econometrica 80 1769–1790.
• Laber, E. B. and Murphy, S. A. (2011). Adaptive confidence intervals for the test error in classification. J. Amer. Statist. Assoc. 106 904–913.
• Laber, E. B., Lizotte, D. J., Qian, M., Pelham, W. E. and Murphy, S. A. (2014a). Dynamic treatment regimes: Technical challenges and applications. Electron. J. Stat. 8 1225–1272.
• Laber, E. B., Lizotte, D. J., Qian, M., Pelham, W. E. and Murphy, S. A. (2014b). Rejoinder of “Dynamic treatment regimes: Technical challenges and applications.” Electron. J. Stat. 8 1312–1321.
• Langford, J., Li, L. and Zhang, T. (2009). Sparse online learning via truncated gradient. In Advances in Neural Information Processing Systems 21 908–915. Curran Associates, Red Hook, NY.
• Liu, R. C. and Brown, L. D. (1993). Nonexistence of informative unbiased estimators in singular problems. Ann. Statist. 21 1–13.
• Luedtke, A. R. and van der Laan, M. J. (2014). Super-learning of an optimal dynamic treatment rule. Technical Report 326, Division of Biostatistics, Univ. California, Berkeley. Available at http://www.bepress.com/ucbbiostat/.
• Luedtke, A. R. and van der Laan, M. J. (2015). Supplement to “Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy.” DOI:10.1214/15-AOS1384SUPP.
• Luts, J., Broderick, T. and Wand, M. P. (2014). Real-time semiparametric regression. J. Comput. Graph. Statist. 23 589–615.
• Qian, M. and Murphy, S. A. (2011). Performance guarantees for individualized treatment rules. Ann. Statist. 39 1180–1210.
• R Core Team (2014). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. Available at http://www.r-project.org/.
• Robins, J. M. (2004). Optimal structural nested models for optimal sequential decisions. In Proceedings of the Second Seattle Symposium in Biostatistics. Lecture Notes in Statist. 179 189–326. Springer, New York.
• Robins, J. and Rotnitzky, A. (2014). Discussion of “Dynamic treatment regimes: Technical challenges and applications” [MR3263118]. Electron. J. Stat. 8 1273–1289.
• Rubin, D. B. and van der Laan, M. J. (2012). Statistical issues and limitations in personalized medicine research with clinical trials. Int. J. Biostat. 8 Article 18.
• Tsybakov, A. B. (2004). Optimal aggregation of classifiers in statistical learning. Ann. Statist. 32 135–166.
• van der Laan, M. J. and Lendle, S. D. (2014). Online targeted learning. Technical Report 330, Division of Biostatistics, Univ. California, Berkeley. Available at http://www.bepress.com/ucbbiostat/.
• van der Laan, M. J. and Luedtke, A. R. (2014a). Targeted learning of an optimal dynamic treatment, and statistical inference for its mean outcome. Technical Report 329, Division of Biostatistics, Univ. California, Berkeley. Available at http://www.bepress.com/ucbbiostat/.
• van der Laan, M. J. and Luedtke, A. R. (2014b). Targeted learning of the mean outcome under an optimal dynamic treatment rule. J. Causal Inference 3 61–95.
• van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes. Springer, New York.
• Zhang, T. (2004). Solving large scale linear prediction problems using stochastic gradient descent algorithms. In ICML’04 Proceedings of the Twenty-First International Conference on Machine Learning 116. ACM, New York.
• Zhang, B., Tsiatis, A. A., Davidian, M., Zhang, M. and Laber, E. (2012a). A robust method for estimating optimal treatment regimes. Biometrics 68 1010–1018.
• Zhang, B., Tsiatis, A. A., Davidian, M., Zhang, M. and Laber, E. (2012b). Estimating optimal treatment regimes from a classification perspective. Stat 1 103–114.
• Zhao, Y., Zeng, D., Rush, A. J. and Kosorok, M. R. (2012). Estimating individualized treatment rules using outcome weighted learning. J. Amer. Statist. Assoc. 107 1106–1118.

Supplemental materials

• Supplementary appendices: Proofs and extension to multiple time point case. Supplementary Appendix A contains the proofs of all results in the main text. Supplementary Appendix B outlines the extension to the multiple time point case. Supplementary Appendix C contains additional figures referenced in the main text.