The Annals of Statistics

SPRT and CUSUM in hidden Markov models

Cheng-Der Fuh


In this paper, we study the problems of sequential probability ratio tests for parameterized hidden Markov models. We investigate in some detail the performance of the tests and derive corrected Brownian approximations for error probabilities and expected sample sizes. Asymptotic optimality of the sequential probability ratio test for testing simple hypotheses based on hidden Markov chain data is established. Next, we consider the cumulative sum (CUSUM) procedure for change point detection in this model. Based on the renewal property of the stopping rule, CUSUM can be regarded as a repeated one-sided sequential probability ratio test. Asymptotic optimality of the CUSUM procedure is proved in the sense of Lorden (1971). Motivated by the sequential analysis in hidden Markov models, Wald's likelihood ratio identity and Wald's equation for products of Markov random matrices are also given. We apply these results to several types of hidden Markov models: i.i.d. hidden Markov models, switch Gaussian regression and switch Gaussian autoregression, which are commonly used in digital communications, speech recognition, bioinformatics and economics.
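The two procedures named in the abstract can be illustrated with a minimal sketch. It is not the paper's construction, which works through products of Markov random matrices; it assumes a finite-state hidden Markov model with Gaussian emissions and computes the log-likelihood ratio with the standard normalized forward recursion. All names (`HmmFilter`, `sprt`, `cusum`, `simulate`) and all parameter values are hypothetical, and resetting the filter to its initial distribution at each CUSUM restart is a simplifying assumption.

```python
import math
import random


def gauss_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


class HmmFilter:
    """Normalized forward recursion for a finite-state HMM (Gaussian emissions assumed)."""

    def __init__(self, init, trans, means, sd=1.0):
        self.init, self.trans, self.means, self.sd = init, trans, means, sd
        self.reset()

    def reset(self):
        # Predictive distribution of the hidden state before the next observation.
        self.pred = list(self.init)

    def step(self, y):
        """Consume one observation; return log p(y_n | y_1, ..., y_{n-1})."""
        joint = [p * gauss_pdf(y, m, self.sd) for p, m in zip(self.pred, self.means)]
        like = sum(joint)
        post = [j / like for j in joint]
        k = len(post)
        self.pred = [sum(post[i] * self.trans[i][j] for i in range(k)) for j in range(k)]
        return math.log(like)


def sprt(ys, f0, f1, lower, upper):
    """Two-sided SPRT of f0 against f1; returns (decision, sample size)."""
    f0.reset()
    f1.reset()
    llr = 0.0
    for n, y in enumerate(ys, start=1):
        llr += f1.step(y) - f0.step(y)
        if llr >= upper:
            return 1, n
        if llr <= lower:
            return 0, n
    return None, len(ys)  # no boundary crossed


def cusum(ys, f0, f1, threshold):
    """CUSUM run as repeated one-sided SPRTs: restart whenever the LLR drops to <= 0."""
    f0.reset()
    f1.reset()
    llr = 0.0
    for n, y in enumerate(ys, start=1):
        llr += f1.step(y) - f0.step(y)
        if llr >= threshold:
            return n  # alarm time
        if llr <= 0.0:
            llr = 0.0
            f0.reset()  # assumption: each restart begins from the initial distribution
            f1.reset()
    return None


def simulate(n, init, trans, means, rng, sd=1.0):
    """Draw n observations from the HMM (hypothetical helper for the demo)."""
    states = range(len(init))
    state = rng.choices(states, weights=init)[0]
    ys = []
    for _ in range(n):
        ys.append(rng.gauss(means[state], sd))
        state = rng.choices(states, weights=trans[state])[0]
    return ys


# Demo with made-up parameters: the change occurs after observation 200.
rng = random.Random(1)
init, trans = [0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]]
pre, post = HmmFilter(init, trans, [0.0, 0.5]), HmmFilter(init, trans, [3.0, 3.5])
stream = simulate(200, init, trans, [0.0, 0.5], rng) + simulate(200, init, trans, [3.0, 3.5], rng)
print("CUSUM alarm at observation", cusum(stream, pre, post, threshold=15.0))
```

The restart rule is what makes the renewal structure visible: each excursion of the log-likelihood ratio above zero is a fresh one-sided test against the upper boundary.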

Article information

Ann. Statist. Volume 31, Number 3 (2003), 942-977.

First available in Project Euclid: 25 June 2003

Primary: 60B15: Probability measures on groups or semigroups, Fourier transforms, factorization
Secondary: 60F05: Central limit and other weak theorems; 60K15: Markov renewal processes, semi-Markov processes

Keywords: Brownian approximation; change point detection; CUSUM; first passage time; products of random matrices; renewal theory; Wald's identity; Wald's equation


Fuh, Cheng-Der. SPRT and CUSUM in hidden Markov models. Ann. Statist. 31 (2003), no. 3, 942--977. doi:10.1214/aos/1056562468.


  • ALSMEYER, G. (1994). On the Markov renewal theorem. Stochastic Process. Appl. 50 37-56.
  • ALSMEYER, G. (2000). The ladder variables of a Markov random walk. Probab. Math. Statist. 20 151-168.
  • ASMUSSEN, S. (1989). Risk theory in a Markov environment. Scand. Actuar. J. 1989(2) 69-100.
  • BALL, F. and RICE, J. A. (1992). Stochastic models for ion channels: Introduction and bibliography. Math. Biosci. 112 189-206.
  • BANSAL, R. K. and PAPANTONI-KAZAKOS, P. (1986). An algorithm for detecting a change in a stochastic process. IEEE Trans. Inform. Theory 32 227-235.
  • BASSEVILLE, M. and NIKIFOROV, I. V. (1993). Detection of Abrupt Changes: Theory and Application. Prentice-Hall, Englewood Cliffs, NJ.
  • BAUM, L. E. and PETRIE, T. (1966). Statistical inference for probabilistic functions of finite state Markov chains. Ann. Math. Statist. 37 1554-1563.
  • BICKEL, P. and RITOV, Y. (1996). Inference in hidden Markov models. I. Local asymptotic normality in the stationary case. Bernoulli 2 199-228.
  • BICKEL, P., RITOV, Y. and RYDÉN, T. (1998). Asymptotic normality of the maximum likelihood estimator for general hidden Markov models. Ann. Statist. 26 1614-1635.
  • BOUGEROL, P. (1988). Théorèmes limite pour les systèmes linéaires à coefficients markoviens. Probab. Theory Related Fields 78 193-221.
  • BOUGEROL, P. and LACROIX, J. (1985). Products of Random Matrices with Applications to Schrödinger Operators. Birkhäuser, Boston.
  • CHURCHILL, G. A. (1989). Stochastic models for heterogeneous DNA sequences. Bull. Math. Biol. 51 79-94.
  • COGBURN, R. (1980). Markov chains in random environments: The case of Markovian environments. Ann. Probab. 8 908-916.
  • CSISZÁR, I. and NARAYAN, P. (1988). Arbitrarily varying channels with constrained inputs and states. IEEE Trans. Inform. Theory 34 27-34.
  • ELLIOTT, R., AGGOUN, L. and MOORE, J. (1995). Hidden Markov Models: Estimation and Control. Springer, New York.
  • ENGEL, C. and HAMILTON, J. D. (1990). Long swings in the dollar: Are they in the data and do markets know it? American Economic Review 80 689-713.
  • ENGLE, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50 987-1008.
  • FUH, C. D. (1997). Corrected diffusion approximations for ruin probabilities in a Markov random walk. Adv. in Appl. Probab. 29 695-712.
  • FUH, C. D. (1998). Efficient likelihood estimation of hidden Markov models. Technical report, Institute of Statistical Science, Taipei, Taiwan, ROC.
  • FUH, C. D. and LAI, T. L. (1998). Wald's equations, first passage times and moments of ladder variables in Markov random walks. J. Appl. Probab. 35 566-580.
  • FUH, C. D. and LAI, T. L. (2001). Asymptotic expansions in multidimensional Markov renewal theory and first passage times for Markov random walks. Adv. in Appl. Probab. 33 652-673.
  • FUH, C. D. and ZHANG, C.-H. (2000). Poisson equation, maximal inequalities and r-quick convergence for Markov random walks. Stochastic Process. Appl. 87 53-67.
  • GOLDFELD, S. M. and QUANDT, R. E. (1973). A Markov model for switching regressions. J. Econometrics 1 3-16.
  • GUIVARCH, Y. and RAUGI, A. (1986). Products of random matrices: Convergence theorems. In Random Matrices and Their Applications (J. E. Cohen, H. Kesten and C. M. Newman, eds.) 31-54. Amer. Math. Soc., Providence, RI.
  • HAMILTON, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57 357-384.
  • HAMILTON, J. D. (1994). Time Series Analysis. Princeton Univ. Press.
  • HAMILTON, J. D. (1996). Specification testing in Markov-switching time series models. J. Econometrics 70 127-157.
  • HELGASON, S. (1962). Differential Geometry and Symmetric Spaces. Academic Press, New York.
  • ITÔ, H., AMARI, S.-I. and KOBAYASHI, K. (1992). Identifiability of hidden Markov information sources and their minimum degrees of freedom. IEEE Trans. Inform. Theory 38 324-333.
  • JENSEN, J. L. (1987). A note on asymptotic expansions for Markov chains using operator theory. Adv. in Appl. Math. 8 377-392.
  • JUANG, B.-H. and RABINER, L. R. (1985). A probabilistic distance measure for hidden Markov models. AT&T Tech. J. 64 391-408.
  • KESTEN, H. (1973). Random difference equations and renewal theory for products of random matrices. Acta Math. 131 207-248.
  • KESTEN, H. (1974). Renewal theory for functionals of a Markov chain with general state space. Ann. Probab. 2 355-386.
  • KROGH, A., BROWN, M., MIAN, I. S., SJOLANDER, K. and HAUSSLER, D. (1994). Hidden Markov models in computational biology: Applications to protein modeling. J. Molecular Biology 235 1501-1531.
  • LAI, T. L. (1995). Sequential change point detection in quality control and dynamical systems (with discussion). J. Roy. Statist. Soc. Ser. B 57 613-658.
  • LAI, T. L. (1998). Information bounds and quick detection of parameter changes in stochastic systems. IEEE Trans. Inform. Theory 44 2917-2929.
  • LEROUX, B. G. (1992). Maximum likelihood estimation for hidden Markov models. Stochastic Process. Appl. 40 127-143.
  • LIU, C. C. and NARAYAN, P. (1994). Order estimation and sequential universal data compression of a hidden Markov source by the method of mixtures. IEEE Trans. Inform. Theory 40 1167-1180.
  • LIU, J. S., NEUWALD, A. F. and LAWRENCE, C. E. (1999). Markovian structures in biological sequence alignments. J. Amer. Statist. Assoc. 94 1-15.
  • LORDEN, G. (1971). Procedures for reacting to a change in distribution. Ann. Math. Statist. 42 1897-1908.
  • MERHAV, N. (1991). Universal classification for hidden Markov models. IEEE Trans. Inform. Theory 37 1586-1594.
  • MEYN, S. P. and TWEEDIE, R. L. (1993). Markov Chains and Stochastic Stability. Springer, New York.
  • MOUSTAKIDES, G. V. (1986). Optimal stopping times for detecting changes in distributions. Ann. Statist. 14 1379-1387.
  • NAGAEV, S. V. (1957). Some limit theorems for stationary Markov chains. Theory Probab. Appl. 2 378-406.
  • PAGE, E. S. (1954). Continuous inspection schemes. Biometrika 41 100-114.
  • RABINER, L. R. and JUANG, B.-H. (1993). Fundamentals of Speech Recognition. Prentice-Hall, Englewood Cliffs, NJ.
  • RIESZ, F. and SZ.-NAGY, B. (1955). Functional Analysis. Ungar, New York.
  • RITOV, Y. (1990). Decision theoretic optimality of the CUSUM procedure. Ann. Statist. 18 1464-1469.
  • SADOWSKY, J. S. (1989). A dependent data extension of Wald's identity and its application to sequential test performance computation. IEEE Trans. Inform. Theory 35 834-842.
  • SIEGMUND, D. (1979). Corrected diffusion approximations in certain random walk problems. Adv. in Appl. Probab. 11 701-719.
  • SIEGMUND, D. (1985). Sequential Analysis. Tests and Confidence Intervals. Springer, New York.
  • STONE, C. (1965). On characteristic functions and renewal theory. Trans. Amer. Math. Soc. 120 327-342.
  • WALD, A. and WOLFOWITZ, J. (1948). Optimum character of the sequential probability ratio test. Ann. Math. Statist. 19 326-339.
  • WOODROOFE, M. (1982). Nonlinear Renewal Theory in Sequential Analysis. SIAM, Philadelphia.
  • YAKIR, B. (1994). Optimal detection of a change in distribution when the observations form a Markov chain with a finite state space. In Change-Point Problems (E. Carlstein, H. G. Müller and D. Siegmund, eds.) 346-358. IMS, Hayward, CA.
  • ZIV, J. (1985). Universal decoding for finite-state channels. IEEE Trans. Inform. Theory 31 453-460.