Choice of neighbor order in nearest-neighbor classification



The Annals of Statistics

Peter Hall, Byeong U. Park, and Richard J. Samworth

Source: Ann. Statist. Volume 36, Number 5 (2008), 2135-2152.

Abstract

The kth-nearest neighbor rule is arguably the simplest and most intuitively appealing nonparametric classification procedure. However, application of this method is inhibited by lack of knowledge about its properties, in particular, about the manner in which it is influenced by the value of k; and by the absence of techniques for empirical choice of k. In the present paper we detail the way in which the value of k determines the misclassification error. We consider two models, Poisson and Binomial, for the training samples. Under the first model, data are recorded in a Poisson stream and are “assigned” to one or other of the two populations in accordance with the prior probabilities. In particular, the total number of data in both training samples is a Poisson-distributed random variable. Under the Binomial model, however, the total number of data in the training samples is fixed, although again each data value is assigned in a random way. Although the values of risk and regret associated with the Poisson and Binomial models are different, they are asymptotically equivalent to first order, and also to the risks associated with kernel-based classifiers that are tailored to the case of two derivatives. These properties motivate new methods for choosing the value of k.
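
The two sampling models in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method for choosing k. It only draws training data under each model and estimates the misclassification error of a majority-vote k-nearest-neighbor rule for a few values of k. Everything specific in it is an assumption made for illustration: the one-dimensional features, the two unit-variance Gaussian populations centred at -1 and +1, equal prior probabilities, and all function names.

```python
import random


def draw_point(rng, prior1=0.5):
    """Assign a point to population 1 with probability prior1 (else 0),
    then draw its feature from that population.
    The Gaussian populations are hypothetical, chosen only for illustration."""
    label = 1 if rng.random() < prior1 else 0
    mean = 1.0 if label == 1 else -1.0
    return rng.gauss(mean, 1.0), label


def binomial_sample(rng, n, prior1=0.5):
    """Binomial model: the total training size n is fixed,
    but each point is assigned to a population at random."""
    return [draw_point(rng, prior1) for _ in range(n)]


def poisson_sample(rng, n, prior1=0.5):
    """Poisson model: points arrive in a unit-rate Poisson stream on [0, n],
    so the total training size is a Poisson(n) random variable."""
    sample = []
    t = rng.expovariate(1.0)  # arrival time of the first point
    while t <= n:
        sample.append(draw_point(rng, prior1))
        t += rng.expovariate(1.0)  # exponential gap to the next arrival
    return sample


def knn_classify(train, x, k):
    """Majority vote among the k training points nearest to x.
    Ties (possible for even k) are broken in favour of population 0."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def error_rate(rng, sampler, n, k, trials=20, test_size=100):
    """Monte Carlo estimate of the misclassification error for neighbor order k."""
    wrong = total = 0
    for _ in range(trials):
        train = sampler(rng, n)
        if not train:  # an empty Poisson sample is possible, if very unlikely
            continue
        for _ in range(test_size):
            x, y = draw_point(rng)
            wrong += knn_classify(train, x, k) != y
            total += 1
    return wrong / total


if __name__ == "__main__":
    rng = random.Random(0)
    for k in (1, 5, 15, 45):
        e_bin = error_rate(rng, binomial_sample, 200, k)
        e_poi = error_rate(rng, poisson_sample, 200, k)
        print(f"k={k:2d}  binomial={e_bin:.3f}  poisson={e_poi:.3f}")
```

With populations like these the Bayes error is about 0.159, so the estimated errors should sit somewhat above that level and vary with k. The two models should give similar values for the same k, in line with the first-order asymptotic equivalence stated in the abstract, although a simulation of this size cannot confirm it.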

Primary Subjects: 62H30
Secondary Subjects: 62G20
Keywords: Bayes classifier; bootstrap resampling; Edgeworth expansion; error probability; misclassification error; nonparametric classification; Poisson distribution

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1223908087
Digital Object Identifier: doi:10.1214/07-AOS537


2009 © Institute of Mathematical Statistics