The Annals of Applied Statistics

The use of covariates and random effects in evaluating predictive biomarkers under a potential outcome framework

Zhiwei Zhang, Lei Nie, Guoxing Soon, and Aiyi Liu



Predictive or treatment selection biomarkers are usually evaluated in a subgroup or regression analysis with focus on the treatment-by-marker interaction. Under a potential outcome framework (Huang, Gilbert and Janes [Biometrics 68 (2012) 687–696]), a predictive biomarker is considered a predictor for a desirable treatment benefit (defined by comparing potential outcomes for different treatments) and evaluated using familiar concepts in prediction and classification. However, the desired treatment benefit is unobservable because each patient can receive only one treatment in a typical study. Huang et al. overcome this problem by assuming monotonicity of potential outcomes, with one treatment dominating the other in all patients. Motivated by an HIV example that appears to violate the monotonicity assumption, we propose a different approach based on covariates and random effects for evaluating predictive biomarkers under the potential outcome framework. Under the proposed approach, the parameters of interest can be identified by assuming conditional independence of potential outcomes given observed covariates, and a sensitivity analysis can be performed by incorporating an unobserved random effect that accounts for any residual dependence. Application of this approach to the motivating example shows that baseline viral load and CD4 cell count are both useful as predictive biomarkers for choosing antiretroviral drugs for treatment-naive patients.
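The identification argument in the abstract can be illustrated with a minimal sketch. It assumes a binary outcome (1 = desirable) and defines "benefit" as responding under the new treatment but not under the comparator; that definition, the function names, and the toy numbers are illustrative assumptions and are not taken from the paper. Under conditional independence of the two potential outcomes given covariates, the benefit probability for a patient factors into a product of two quantities that can each be estimated from one arm of a randomized trial, and the true- and false-positive rates of a marker-based rule follow by averaging over patients.

```python
def benefit_probability(p_treat, p_control):
    """P(Y(1)=1, Y(0)=0 | X=x), assuming Y(1) and Y(0) are independent given x.

    p_treat   : P(Y(1)=1 | X=x), estimable from the treated arm
    p_control : P(Y(0)=1 | X=x), estimable from the comparator arm
    """
    return p_treat * (1.0 - p_control)


def roc_point(patients, threshold):
    """TPR and FPR of the rule 'marker > threshold' for predicting latent benefit.

    patients : list of (marker, p_treat, p_control) tuples, one per patient;
               the marker is assumed to be among the covariates x.
    """
    benefit = [benefit_probability(p1, p0) for _, p1, p0 in patients]
    flagged = [1.0 if marker > threshold else 0.0 for marker, _, _ in patients]

    expected_benefit = sum(benefit)                     # expected number who benefit
    expected_no_benefit = len(patients) - expected_benefit

    tpr = sum(f * b for f, b in zip(flagged, benefit)) / expected_benefit
    fpr = sum(f * (1.0 - b) for f, b in zip(flagged, benefit)) / expected_no_benefit
    return tpr, fpr


# Hypothetical fitted probabilities for four patients (not real data).
toy_patients = [
    (1.0, 0.9, 0.8),
    (2.0, 0.8, 0.5),
    (3.0, 0.7, 0.2),
    (4.0, 0.6, 0.1),
]
tpr, fpr = roc_point(toy_patients, threshold=2.5)
```

Sweeping `threshold` over the observed marker values would trace out an ROC curve for the marker. The sensitivity analysis described in the abstract would replace the simple product in `benefit_probability` with an average over an unobserved random effect, which this sketch does not attempt.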

Article information

Ann. Appl. Stat., Volume 8, Number 4 (2014), 2336–2355.

First available in Project Euclid: 19 December 2014

Keywords: Conditional independence; counterfactual; ROC regression; sensitivity analysis; treatment effect heterogeneity; treatment selection


Zhang, Zhiwei; Nie, Lei; Soon, Guoxing; Liu, Aiyi. The use of covariates and random effects in evaluating predictive biomarkers under a potential outcome framework. Ann. Appl. Stat. 8 (2014), no. 4, 2336–2355. doi:10.1214/14-AOAS773.



  • Cohen, C. J., Andrade-Villanueva, J., Clotet, B., Fourie, J., Johnson, M. A., Ruxrungtham, K., Wu, H., Zorrilla, C., Crauwels, H., Rimsky, L. T., Vanveggel, S., Boven, K. and THRIVE study group (2011). Rilpivirine versus efavirenz with two background nucleoside or nucleotide reverse transcriptase inhibitors in treatment-naive adults infected with HIV-1 (THRIVE): A phase 3, randomised, non-inferiority trial. Lancet 378 229–237.
  • Dodd, L. E. and Pepe, M. S. (2003). Semiparametric regression for the area under the receiver operating characteristic curve. J. Amer. Statist. Assoc. 98 409–417.
  • Foster, J. C., Taylor, J. M. G. and Ruberg, S. J. (2011). Subgroup identification from randomized clinical trial data. Stat. Med. 30 2867–2880.
  • Gadbury, G. L. and Iyer, H. K. (2000). Unit-treatment interaction and its practical consequences. Biometrics 56 882–885.
  • Gail, M. and Simon, R. (1985). Testing for qualitative interactions between treatment effects and patient subsets. Biometrics 41 361–372.
  • Holland, P. W. (1986). Statistics and causal inference (with discussion). J. Amer. Statist. Assoc. 81 945–970.
  • Huang, Y., Gilbert, P. B. and Janes, H. (2012). Assessing treatment-selection markers using a potential outcomes framework. Biometrics 68 687–696.
  • Karapetis, C. S., Khambata-Ford, S., Jonker, D. J., O’Callaghan, C. J., Tu, D., Tebbutt, N. C., Simes, R. J., Chalchal, H., Shapiro, J. D., Robitaille, S., Price, T. J., Shepherd, L., Au, H.-J., Langer, C., Moore, M. J. and Zalcberg, J. R. (2008). K-ras mutations and benefit from cetuximab in advanced colorectal cancer. N. Engl. J. Med. 359 1757–1765.
  • Manski, C. F. (2003). Partial Identification of Probability Distributions. Springer, New York.
  • Paik, S., Shak, S., Tang, G., Kim, C., Baker, J., Cronin, M., Baehner, F. L., Walker, M. G., Watson, D., Park, T., Hiller, W., Fisher, E. R., Wickerham, D. L., Bryant, J. and Wolmark, N. (2004). A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N. Engl. J. Med. 351 2817–2826.
  • Pepe, M. S. (2003). The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford Statistical Science Series 28. Oxford Univ. Press, Oxford.
  • Pocock, S. J., Assmann, S. E., Enos, L. E. and Kasten, L. E. (2002). Subgroup analysis, covariate adjustment and baseline comparisons in clinical trial reporting: Current practice and problems. Stat. Med. 21 2917–2930.
  • Poulson, R. S., Gadbury, G. L. and Allison, D. B. (2012). Treatment heterogeneity and individual qualitative interaction. Amer. Statist. 66 16–24.
  • Qian, M. and Murphy, S. A. (2011). Performance guarantees for individualized treatment rules. Ann. Statist. 39 1180–1210.
  • Russek-Cohen, E. and Simon, R. M. (1998). Evaluating treatments when a gender by treatment interaction may exist. Stat. Med. 16 455–464.
  • Simon, R. (2008). Development and validation of biomarker classifiers for treatment selection. J. Statist. Plann. Inference 138 308–320.
  • Simon, R. (2010). Clinical trials for predictive medicine: New challenges and paradigms. Clin. Trials 7 516–524.
  • Su, X., Zhou, T., Yan, X., Fan, J. and Yang, S. (2008). Interaction trees with censored survival data. Int. J. Biostat. 4 Art. 2, 28.
  • Tian, L., Alizadeh, A. A., Gentles, A. J. and Tibshirani, R. (2012). A simple method for detecting interactions between a treatment and a large number of covariates. Available at arXiv:1212.2995.
  • van der Laan, M. J. and Robins, J. M. (2003). Unified Methods for Censored Longitudinal Data and Causality. Springer, New York.
  • Zhang, B., Tsiatis, A. A., Laber, E. B. and Davidian, M. (2012). A robust method for estimating optimal treatment regimes. Biometrics 68 1010–1018.
  • Zhang, Z., Wang, C., Nie, L. and Soon, G. (2013). Assessing the heterogeneity of treatment effects via potential outcomes of individual patients. J. R. Stat. Soc. Ser. C. Appl. Stat. 62 687–704.
  • Zhang, Z., Nie, L., Soon, G. and Liu, A. (2014). Supplement to “The use of covariates and random effects in evaluating predictive biomarkers under a potential outcome framework.” DOI:10.1214/14-AOAS773SUPP.
  • Zhou, X.-H., Obuchowski, N. A. and McClish, D. K. (2002). Statistical Methods in Diagnostic Medicine. Wiley, New York.
  • Zou, K. H., Liu, A., Bandos, A. I., Ohno-Machado, L. and Rockette, H. E. (2011). Statistical Evaluation of Diagnostic Performance: Topics in ROC Analysis. CRC Press, Boca Raton, FL.
