Source: Ann. Statist. Volume 26, Number 3
(1998), 850-878.
The finite-sample risk of the $k$ nearest neighbor classifier that
uses a weighted $L^p$-metric as a measure of class similarity is examined. For
a family of classification problems with smooth distributions in $mathbb{R}^n$,
an asymptotic expansion for the risk is obtained in decreasing fractional
powers of the reference sample size. An analysis of the leading expansion
coefficients reveals that the optimal weighted $L^p$-metric, that is, the
metric that minimizes the finite-sample risk, tends to a weighted Euclidean
(i.e., $L^2$) metric as the sample size is increased. Numerical simulations
corroborate this finding for a pattern recognition problem with normal
class-conditional densities.
References
1 BREIMAN, L., FRIEDMAN, J. H., OLSHEN, R. A. and STONE, C. J. 1984. Classification and Regression Trees. Wadsworth & Brooks Cole, Pacific Grove, CA.
2 COVER, T. M. 1968. Rates of convergence of nearest neighbor decision procedures. In Proceedings First Annual Hawaii Conference on Sy stems Theory 413 415.
3 COVER, T. M. and HART, P. E. 1967. Nearest neighbor pattern classification. IEEE Trans. Inform. Theory 13 21 27.
4 DEVROy E, L. 1982. Any discrimination rule can have an arbitrarily bad probability of error for finite sample size. IEEE Trans. Pattern Anal. Machine Intelligence 4 154 157.
5 ERDELy I, A. 1956. Asy mptotic Expansions. Dover, New York. ´
6 FIX, E. and HODGES, J. L., JR. 1951. Discriminatory analysis nonparametric discrimination: consistency properties. Project 21-49-004, Report No. 4. 261 279. USAF School of Aviation Medicine, Randolf Field, TX.
7 FRIEDMAN, J. H., BENTLEY, J. L. and FINKEL, R. A. 1977. An algorithm for finding best matches in logarithmic expected time. ACM Trans. Math. Software 3 209 226.
8 FUKUNAGA, K. and HUMMELS, D. M. 1987. Bias of nearest neighbor estimates. IEEE Trans. Pattern Anal. Machine Intelligence 9 103 112.
9 FUKUNAGA, K. and FLICK, T. E. 1984. An optimal global nearest neighbor metric. IEEE Trans. Pattern Anal. Machine Intelligence 6 314 318.
10 FULKS, W. and SATHER, J. O. 1961. Asy mptotics II: Laplace's method for multiple integrals. Pacific J. Math. 11 185 192.
Mathematical Reviews (MathSciNet):
MR138945
11 HELLMAN, M. E. 1970. The nearest-neighbor classification rule with a reject option. IEEE Trans. Sy stems Man Cy bernet. 6 179 185.
12 KNUTH, D. E. 1976. Big omicron and big omega and big theta. ACM SIGACT News 8 18 23.
13 PSALTIS, D., SNAPP, R. R. and VENKATESH, S. S. 1994. On the finite sample performance of the nearest neighbor classifier. IEEE Trans. Inform. Theory 40 820 837.
14 SMITH, S. J., BOURGOIN, M. O., SIMS, K. and VOORHEES, H. L. 1994. Handwritten character classification using nearest neighbor in large databases. IEEE Trans. Pattern Anal. Machine Intelligence 16 915 919, 1994.
15 SNAPP, R. R. and VENKATESH, S. S. 1994. Asy mptotic predictions of the finite-sample risk of the k-nearest-neighbor classifier. In Proceedings of the 12th International Conference on Pattern Recognition 2 1 7. IEEE Computer Society Press, Los Alamitos, CA.
16 SNAPP, R. R. and VENKATESH, S. S. 1998. Asy mptotic derivation of the finite-sample risk of the k nearest neighbor classifier. Technical Report UVM-CS-1998-0101, Dept. Computer Science, Univ. Vermont.
17 SNAPP, R. R. and XU, T. 1996. Estimating the Bay es risk from sample data. In Advances Z in Neural Information Processing Sy stems 8 D. S. Touretzky, M. C. Moser, and M. E.. Hasselmo, eds. MIT Press.
18 STONE, C. J. 1977. Consistent nonparametric regression. Ann. Statist. 5 595 645.
19 WATSON, G. N. 1918. The harmonic functions associated with the parabolic cy linder. Proc. London Math. Soc. 17 116 148.
BURLINGTON, VERMONT 05405 PHILADELPHIA, PENNSy LVANIA 19104 E-MAIL: snapp@cs.uvm.edu E-MAIL: venkatesh@ee.upenn.edu