The Annals of Statistics

Consistency of the maximum likelihood estimator for general hidden Markov models

Randal Douc, Eric Moulines, Jimmy Olsson, and Ramon van Handel


Abstract

Consider a parametrized family of general hidden Markov models, where both the observed and unobserved components take values in a complete separable metric space. We prove that the maximum likelihood estimator (MLE) of the parameter is strongly consistent under a rather minimal set of assumptions. As special cases of our main result, we obtain consistency in a large class of nonlinear state space models, as well as general results on linear Gaussian state space models and finite state models.

A novel aspect of our approach is an information-theoretic technique for proving identifiability, which does not require an explicit representation for the relative entropy rate. Our method of proof could therefore form a foundation for the investigation of MLE consistency in more general dependent and non-Markovian time series. Also of independent interest is a general concentration inequality for V-uniformly ergodic Markov chains.
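As a concrete illustration of the finite state special case mentioned above, the following minimal sketch computes the log-likelihood that an MLE would maximize, using the standard scaled forward recursion. It is not taken from the paper: the function name `hmm_log_likelihood`, the argument layout, and the restriction to discrete observations are assumptions made for illustration only.

```python
import math


def hmm_log_likelihood(obs, init, trans, emit):
    """Log-likelihood of a discrete observation sequence under a finite-state HMM.

    Illustrative (hypothetical) parametrization:
      init[i]     = P(X_0 = i)
      trans[i][j] = P(X_{k+1} = j | X_k = i)
      emit[i][y]  = P(Y_k = y | X_k = i)
    """
    if not obs:
        return 0.0  # an empty sequence has probability one
    n_states = len(init)
    # Unnormalized filter at time 0: prior times emission probability.
    alpha = [init[i] * emit[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for k in range(len(obs)):
        if k > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs[k]]
                for j in range(n_states)
            ]
        scale = sum(alpha)  # P(Y_k = obs[k] | Y_0, ..., Y_{k-1})
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under these parameters
        log_lik += math.log(scale)
        # Renormalize so the recursion stays numerically stable on long sequences.
        alpha = [a / scale for a in alpha]
    return log_lik
```

An MLE in this setting would maximize `hmm_log_likelihood` over the entries of `init`, `trans`, and `emit` (for example by a numerical optimizer or the EM algorithm); strong consistency means that such a maximizer converges almost surely to the true parameter as the number of observations grows.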

Article information

Source
Ann. Statist. Volume 39, Number 1 (2011), 474-513.

Dates
First available in Project Euclid: 15 February 2011

Permanent link to this document
http://projecteuclid.org/euclid.aos/1297779854

Digital Object Identifier
doi:10.1214/10-AOS834

Mathematical Reviews number (MathSciNet)
MR2797854

Zentralblatt MATH identifier
1209.62194

Subjects
Primary: 60F10: Large deviations; 62B10: Information-theoretic topics [See also 94A17]; 62F12: Asymptotic properties of estimators; 62M09: Non-Markovian processes: estimation
Secondary: 60J05: Discrete-time Markov processes on general state spaces; 62M05: Markov processes: estimation; 62M10: Time series, auto-correlation, regression, etc. [See also 91B84]; 94A17: Measures of information, entropy

Keywords
Hidden Markov models; maximum likelihood estimation; strong consistency; V-uniform ergodicity; concentration inequalities; state space models

Citation

Douc, Randal; Moulines, Eric; Olsson, Jimmy; van Handel, Ramon. Consistency of the maximum likelihood estimator for general hidden Markov models. Ann. Statist. 39 (2011), no. 1, 474–513. doi:10.1214/10-AOS834. http://projecteuclid.org/euclid.aos/1297779854.


