Bayesian Analysis

EM versus Markov chain Monte Carlo for estimation of hidden Markov models: a computational perspective

Tobias Rydén


Hidden Markov models (HMMs) and related models have become standard in statistics during the last 15-20 years, with applications in diverse areas such as speech and other statistical signal processing, hydrology, financial statistics and econometrics, and bioinformatics. Inference in HMMs is traditionally carried out using the EM algorithm, but examples of Bayesian estimation, generally implemented through Markov chain Monte Carlo (MCMC) sampling, are also frequent in the HMM literature. The purpose of this paper is to compare the EM and MCMC approaches in three cases of different complexity: model order selection, continuous-time HMMs, and variants of HMMs in which the observed data depend on many hidden variables in an overlapping fashion. All these examples originate, in one way or another, from real-data applications. Neither EM nor MCMC analysis of HMMs is a black-box methodology that works without user interaction, and we illustrate some of the problems one may expect to encounter, such as poor mixing and long computation times.
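For readers unfamiliar with the EM side of the comparison, the following is a minimal sketch of EM (Baum-Welch) for a small HMM with discrete emissions. It is not taken from the paper and is not one of its three examples: the two-state model, the toy observation sequence, the starting values, and the function names `forward_backward` and `em_step` are all assumptions chosen for illustration. It only shows the basic mechanics, namely a scaled forward-backward pass (E-step) followed by closed-form re-estimation (M-step).

```python
import math


def forward_backward(obs, pi, A, B):
    """Scaled forward-backward pass for a discrete-emission HMM (illustrative).

    Returns (log_likelihood, gamma, xi_sum), where gamma[t][i] is the
    smoothing probability P(state_t = i | obs) and xi_sum[i][j] is the
    expected number of i -> j transitions given obs.
    """
    n, K = len(obs), len(pi)
    alpha = [[0.0] * K for _ in range(n)]
    scale = [0.0] * n
    # Forward recursion, normalised at every step to avoid underflow.
    for t in range(n):
        for j in range(K):
            prior = pi[j] if t == 0 else sum(
                alpha[t - 1][i] * A[i][j] for i in range(K))
            alpha[t][j] = prior * B[j][obs[t]]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]
    # Backward recursion, reusing the forward scaling constants.
    beta = [[1.0] * K for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for i in range(K):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                for j in range(K)) / scale[t + 1]
    gamma = [[alpha[t][i] * beta[t][i] for i in range(K)] for t in range(n)]
    xi_sum = [[0.0] * K for _ in range(K)]
    for t in range(n - 1):
        for i in range(K):
            for j in range(K):
                xi_sum[i][j] += (alpha[t][i] * A[i][j] * B[j][obs[t + 1]]
                                 * beta[t + 1][j] / scale[t + 1])
    # The log-likelihood is the sum of the log scaling constants.
    return sum(math.log(c) for c in scale), gamma, xi_sum


def em_step(obs, pi, A, B):
    """One EM iteration; returns the log-likelihood of the *input* parameters
    together with the updated parameters. Assumes every state has positive
    posterior mass, so no denominator below is zero."""
    loglik, gamma, xi_sum = forward_backward(obs, pi, A, B)
    K, M = len(pi), len(B[0])
    new_pi = gamma[0][:]
    new_A = [[xi_sum[i][j] / sum(xi_sum[i]) for j in range(K)]
             for i in range(K)]
    new_B = [[sum(g[i] for g, y in zip(gamma, obs) if y == m)
              / sum(g[i] for g in gamma) for m in range(M)]
             for i in range(K)]
    return loglik, new_pi, new_A, new_B


# Toy data and starting values (assumptions, not from the paper).
obs = [0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0]
pi = [0.5, 0.5]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.8, 0.2], [0.3, 0.7]]

logliks = []
for _ in range(10):
    ll, pi, A, B = em_step(obs, pi, A, B)
    logliks.append(ll)

print("log-likelihood, first and last iteration:", logliks[0], logliks[-1])
```

Each iteration cannot decrease the log-likelihood, which is the defining property of EM, but the limit reached depends on the starting values. That sensitivity is one reason such an analysis still calls for user interaction.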

Article information

Bayesian Anal., Volume 3, Number 4 (2008), 659-688.

First available in Project Euclid: 22 June 2012

Digital Object Identifier: doi:10.1214/08-BA326

Keywords: hidden Markov model; incomplete data; missing data; EM; Markov chain Monte Carlo; trans-dimensional Monte Carlo; computational statistics


Rydén, Tobias. EM versus Markov chain Monte Carlo for estimation of hidden Markov models: a computational perspective. Bayesian Anal. 3 (2008), no. 4, 659-688. doi:10.1214/08-BA326.


See also

  • Related item: Sylvia Frühwirth-Schnatter. Comment on article by Rydén. Bayesian Anal., Vol. 3, Iss. 4 (2008), 689-697.
  • Related item: Sergey Kirshner, Padhraic Smyth. Comment on article by Rydén. Bayesian Anal., Vol. 3, Iss. 4 (2008), 699-705.
  • Related item: Tobias Rydén. Rejoinder. Bayesian Anal., Vol. 3, Iss. 4 (2008), 707-715.