The Annals of Applied Statistics

Robust mixed effects model for clustered failure time data: Application to Huntington’s disease event measures

Tanya P. Garcia, Yanyuan Ma, Karen Marder, and Yuanjia Wang


An important goal in clinical and statistical research is properly modeling the distribution of clustered failure times, which have a natural intra-class dependency and are subject to censoring. We handle these challenges with a novel approach that does not impose restrictive modeling or distributional assumptions. Using a logit transformation, we relate the distribution for clustered failure times to covariates and a random, subject-specific effect. The covariates are modeled with unknown functional forms, and the random effect may depend on the covariates and have an unknown and unspecified distribution. We introduce pseudovalues to handle censoring and splines for functional covariate effects, and frame the problem as fitting an additive logistic mixed effects model. Unlike existing approaches for fitting such models, we develop semiparametric techniques that estimate the functional model parameters without specifying or estimating the random effect distribution. We show both theoretically and empirically that the resulting estimators are consistent for any choice of random effect distribution and any dependency structure between the random effect and covariates. Last, we illustrate the method’s utility in an application to a Huntington’s disease study where our method provides new insights into differences between motor and cognitive impairment event times in at-risk subjects.
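To make the role of pseudovalues concrete, the following is a minimal sketch of the standard jackknife pseudo-observation construction applied to a Kaplan–Meier estimate of F(t) = P(T ≤ t). It is an illustration under stated assumptions, not the authors' implementation: the function names `kaplan_meier_cdf` and `pseudovalues` and the toy data are hypothetical, and the sketch ignores clustering, covariates, and the spline and random-effect components described in the abstract.

```python
def kaplan_meier_cdf(times, events, t):
    """Kaplan-Meier estimate of F(t) = P(T <= t) from right-censored data.

    times  : observed times (event or censoring), one per subject
    events : 1 if the event was observed, 0 if the time is censored
    """
    surv = 1.0
    # Distinct observed event times up to t, in increasing order.
    event_times = sorted({x for x, d in zip(times, events) if d == 1 and x <= t})
    for u in event_times:
        at_risk = sum(1 for x in times if x >= u)
        failures = sum(1 for x, d in zip(times, events) if x == u and d == 1)
        surv *= 1.0 - failures / at_risk
    return 1.0 - surv


def pseudovalues(times, events, t):
    """Jackknife pseudo-observations for F(t): n*F_hat - (n-1)*F_hat^(-i)."""
    n = len(times)
    full = kaplan_meier_cdf(times, events, t)
    values = []
    for i in range(n):
        # Leave-one-out estimate with subject i removed.
        loo = kaplan_meier_cdf(times[:i] + times[i + 1:],
                               events[:i] + events[i + 1:], t)
        values.append(n * full - (n - 1) * loo)
    return values


# Hypothetical toy data: the second subject is censored at time 3.
toy_times = [2.0, 3.0, 5.0, 7.0]
toy_events = [1, 0, 1, 1]
toy_pseudo = pseudovalues(toy_times, toy_events, t=6.0)
```

Without censoring, each pseudovalue reduces to the indicator that the subject's event occurred by time t; with censoring, the values are no longer restricted to 0 and 1 and can fall outside [0, 1]. They then stand in for the unobserved indicators as responses in a regression model.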

Article information

Ann. Appl. Stat., Volume 11, Number 2 (2017), 1085-1116.

Received: May 2016
Revised: February 2017
First available in Project Euclid: 20 July 2017

Keywords: Additive model; clustered failure times; logistic mixed model; varying coefficient model; semiparametric estimator; splines


Garcia, Tanya P.; Ma, Yanyuan; Marder, Karen; Wang, Yuanjia. Robust mixed effects model for clustered failure time data: Application to Huntington’s disease event measures. Ann. Appl. Stat. 11 (2017), no. 2, 1085–1116. doi:10.1214/17-AOAS1038.


Supplemental materials

  • Technical proofs and empirical results. The supplementary material contains theoretical derivations, additional simulation study results, and additional results for the Huntington’s disease application.