Annals of Applied Statistics

A two-state mixed hidden Markov model for risky teenage driving behavior

John C. Jackson, Paul S. Albert, and Zhiwei Zhang


This paper proposes a joint model for longitudinal binary and count outcomes. We apply the model to a unique longitudinal study of teen driving in which risky driving behavior and the occurrence of crashes or near crashes are measured prospectively over the first 18 months of licensure. Of scientific interest is relating the two processes and predicting crash and near-crash outcomes. We propose a two-state mixed hidden Markov model in which the hidden state characterizes the mean for the joint longitudinal crash/near-crash outcomes and elevated g-force events, which are a proxy for risky driving. Heterogeneity is introduced in both the conditional model for the count outcomes and the hidden process using a shared random effect. An estimation procedure is presented that uses the forward–backward algorithm along with adaptive Gaussian quadrature to perform numerical integration. The estimation procedure readily yields hidden state probabilities and accommodates a broad class of predictors.

Article information

Ann. Appl. Stat., Volume 9, Number 2 (2015), 849-865.

Received: February 2013
Revised: May 2014
First available in Project Euclid: 20 July 2015

Keywords: adaptive quadrature; hidden Markov model; joint model; random effects


Jackson, John C.; Albert, Paul S.; Zhang, Zhiwei. A two-state mixed hidden Markov model for risky teenage driving behavior. Ann. Appl. Stat. 9 (2015), no. 2, 849–865. doi:10.1214/14-AOAS765.


Supplemental materials

  • Adaptive quadrature for the three-state mixed hidden Markov model. We provide details on the adaptive quadrature routine for the MHMM with bivariate normal random effects in the hidden process, as well as expressions for the three-state hidden Markov model.