The Annals of Statistics

Subspace estimation and prediction methods for hidden Markov models

Sofia Andersson and Tobias Rydén

Full-text: Open access


Hidden Markov models (HMMs) are probabilistic functions of finite Markov chains, or, in other words, state space models with finite state space. In this paper, we examine subspace estimation methods for HMMs whose output lies in a finite set as well. In particular, we study the geometric structure arising from the nonminimality of the linear state space representation of HMMs, and the consistency of a subspace algorithm arising from a certain factorization of the singular value decomposition of the estimated linear prediction matrix. For this algorithm, we show that the estimates of the transition and emission probability matrices are consistent up to a similarity transformation, and that the m-step linear predictor computed from the estimated system matrices is consistent, i.e., converges to the true optimal linear m-step predictor.
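As a rough illustration of why subspace (SVD-based) methods apply to finite-output HMMs, the sketch below builds a block Hankel matrix of lagged output moments for a small stationary HMM and inspects its singular values. It is not the algorithm analysed in the paper. The matrices `A` and `B`, the one-hot coding of the outputs, the block size `p` and all function names are assumptions made for this example only. It relies on the standard identity E[y_t^T y_{t+m}] = B^T diag(pi) A^m B for m >= 1, where y_t is the one-hot row vector of the output at time t and pi is the stationary distribution. This identity implies that the Hankel matrix has rank at most the number of hidden states.

```python
import numpy as np


def stationary_distribution(A):
    """Left eigenvector of A for eigenvalue 1, normalized to sum to 1."""
    eigvals, eigvecs = np.linalg.eig(A.T)
    idx = np.argmin(np.abs(eigvals - 1.0))
    pi = np.real(eigvecs[:, idx])
    return pi / pi.sum()


def output_moment(A, B, pi, m):
    """Lag-m moment E[y_t^T y_{t+m}] (m >= 1) for one-hot outputs y_t."""
    return B.T @ np.diag(pi) @ np.linalg.matrix_power(A, m) @ B


def block_hankel(A, B, pi, p):
    """p x p block matrix whose (i, j) block (0-based) is the lag-(i + j + 1) moment."""
    return np.block(
        [[output_moment(A, B, pi, i + j + 1) for j in range(p)] for i in range(p)]
    )


# Hypothetical two-state HMM with three output symbols (illustrative values only).
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # transition probabilities, rows sum to 1
B = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6]])     # emission probabilities, rows sum to 1

pi = stationary_distribution(A)
H = block_hankel(A, B, pi, p=3)

# Singular values above a small threshold give the numerical rank of H.
singular_values = np.linalg.svd(H, compute_uv=False)
numerical_rank = int(np.sum(singular_values > 1e-10))
print("singular values:", np.round(singular_values, 6))
print("numerical rank:", numerical_rank, "hidden states:", A.shape[0])
```

In practice the moments would be replaced by sample estimates computed from an observed output sequence, so the small singular values would be noisy rather than exactly zero. Handling that estimation error is where the consistency results described in the abstract come in.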

Article information

Ann. Statist., Volume 37, Number 6B (2009), 4131-4152.

First available in Project Euclid: 23 October 2009


Primary: 62M09: Non-Markovian processes: estimation
Secondary:
    62M10: Time series, auto-correlation, regression, etc. [See also 91B84]
    62M20: Prediction [See also 60G25]; filtering [See also 60G35, 93E10, 93E11]
    93B15: Realizations from input-output data
    93B30: System identification

Hidden Markov model; linear innovation representation; prediction error representation; subspace estimation; consistency


Andersson, Sofia; Rydén, Tobias. Subspace estimation and prediction methods for hidden Markov models. Ann. Statist. 37 (2009), no. 6B, 4131–4152. doi:10.1214/09-AOS711.


