The Annals of Statistics

Nearest neighbor classification with dependent training sequences

M. Holst and A. Irle


The asymptotic classification risk for nearest neighbor procedures is well understood in the case of i.i.d. training sequences. In this article, we generalize these results to a class of dependent models including hidden Markov models. In the case where the observed patterns have Lebesgue densities, the asymptotic risk takes the same expression as in the i.i.d. case. For discrete distributions, we show that the asymptotic risk depends on the rule used for breaking ties of equal distances.
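The role of tie-breaking in the discrete case can be illustrated with a small Python sketch. It is not taken from the article: the function name `nn_classify`, the rule names `"first"` and `"random"`, and the toy training sequence are all hypothetical, chosen only to show that two reasonable rules can assign different labels when several training patterns are equally close to the query.

```python
import random


def nn_classify(train, x, tie_break="first", rng=None):
    """Label x with the class of its nearest training pattern.

    train     : sequence of (pattern, label) pairs, patterns are numbers
    tie_break : "first"  -> use the earliest pattern among equidistant ones
                "random" -> pick uniformly among equidistant ones
    rng       : optional random.Random instance (used only for "random")

    Both rule names are illustrative assumptions, not the paper's notation.
    """
    dists = [abs(xi - x) for xi, _ in train]
    d_min = min(dists)
    # Indices of all training patterns at the minimal distance.
    tied = [i for i, d in enumerate(dists) if d == d_min]

    if tie_break == "first":
        chosen = tied[0]
    elif tie_break == "random":
        chosen = (rng or random).choice(tied)
    else:
        raise ValueError(f"unknown tie-breaking rule: {tie_break!r}")
    return train[chosen][1]


# Toy discrete training sequence: the pattern 0 occurs three times,
# once with label 1 and twice with label 0, so a query at 0 has ties.
train = [(0, 1), (1, 0), (0, 0), (1, 1), (0, 0)]

label_first = nn_classify(train, 0, tie_break="first")
label_random = nn_classify(train, 0, tie_break="random", rng=random.Random(0))
```

With `"first"` the query at 0 always receives the label of the earliest matching pattern, while `"random"` returns label 0 or 1 depending on the draw. With continuous (Lebesgue-density) patterns, exact ties have probability zero, so the choice of rule does not arise.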

Article information

Ann. Statist. Volume 29, Number 5 (2001), 1424-1442.

First available in Project Euclid: 8 February 2002

Primary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]
Secondary: 62G20: Asymptotic properties

Nearest neighbor classification; asymptotic risk; dependent training samples


Holst, M.; Irle, A. Nearest neighbor classification with dependent training sequences. Ann. Statist. 29 (2001), no. 5, 1424--1442. doi:10.1214/aos/1013203460.

