We propose a Bayesian hidden Markov model for analyzing time series and sequential data, in which a special structure of the transition probability matrix is embedded to model explicit-duration semi-Markovian dynamics. Our formulation allows for the development of highly flexible and interpretable models that can integrate available prior information on state durations while keeping the computational cost moderate enough for efficient posterior inference. We show the benefits of a Bayesian approach to hidden semi-Markov model (HSMM) estimation over its frequentist counterpart in terms of model selection and out-of-sample forecasting, and we highlight the computational feasibility of our inference procedure, which incurs negligible statistical error. The use of our methodology is illustrated in an application relevant to e-Health, where we investigate rest-activity rhythms using telemetric activity data collected via a wearable sensing device. For the first time, this analysis considers Bayesian model selection for the form of the explicit state dwell distribution. We further investigate the inclusion of a circadian covariate in the emission density and estimate it in a data-driven manner.
Bayesian Anal. 18(2): 547-577 (June 2023). DOI: 10.1214/22-BA1318
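The abstract does not spell out how the transition probability matrix is structured. As a rough illustration only, the sketch below assumes the state-aggregate construction commonly used to approximate an HSMM by an HMM: each semi-Markov state is expanded into a chain of sub-states, and the dwell-time hazard governs whether the chain advances or exits. The function names (`hazard`, `expanded_transition_matrix`), the embedded transition matrix `omega`, and the example dwell distributions are all hypothetical and are not taken from the paper.

```python
def hazard(pmf):
    """Hazard c(r) = p(r) / P(dwell >= r) for r = 1..m (index 0 is r = 1)."""
    hazards, survival = [], 1.0
    for p in pmf:
        # Guard against division by a vanishing survival probability.
        hazards.append(min(1.0, p / survival) if survival > 1e-12 else 1.0)
        survival -= p
    return hazards


def expanded_transition_matrix(omega, dwell_pmfs):
    """Build the HMM transition matrix over state aggregates.

    omega      : K x K between-state transition matrix (zero diagonal, rows sum to 1)
    dwell_pmfs : list of K lists; dwell_pmfs[i][r-1] = P(dwell in state i = r)
    """
    sizes = [len(p) for p in dwell_pmfs]
    offsets = [sum(sizes[:i]) for i in range(len(sizes))]
    total = sum(sizes)
    B = [[0.0] * total for _ in range(total)]
    for i, pmf in enumerate(dwell_pmfs):
        m = sizes[i]
        for r, c in enumerate(hazard(pmf)):
            row = offsets[i] + r
            # Stay in state i: advance to the next sub-state, or loop on the
            # last sub-state (which gives a geometric tail beyond m).
            B[row][offsets[i] + min(r + 1, m - 1)] += 1.0 - c
            # Leave state i: enter the first sub-state of another state j.
            for j in range(len(sizes)):
                if j != i:
                    B[row][offsets[j]] += omega[i][j] * c
    return B


# Hypothetical two-state example with short, fully specified dwell distributions.
omega = [[0.0, 1.0], [1.0, 0.0]]
dwell_pmfs = [[0.2, 0.3, 0.5], [0.5, 0.5]]
B = expanded_transition_matrix(omega, dwell_pmfs)
```

Under this construction the dwell-time distribution is reproduced exactly for durations up to the aggregate size, so larger aggregates trade computational cost for a closer approximation to the semi-Markov model.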