Open Access
October 2012
Optimal weighted nearest neighbour classifiers
Richard J. Samworth
Ann. Statist. 40(5): 2733-2763 (October 2012). DOI: 10.1214/12-AOS1049

Abstract

We derive an asymptotic expansion for the excess risk (regret) of a weighted nearest neighbour classifier. This allows us to find the asymptotically optimal vector of nonnegative weights, which has a rather simple form. We show that the ratio of the regret of this classifier to that of an unweighted $k$-nearest neighbour classifier depends asymptotically only on the dimension $d$ of the feature vectors, and not on the underlying populations. The improvement is greatest when $d=4$, but thereafter decreases as $d\rightarrow\infty$. The popular bagged nearest neighbour classifier can also be regarded as a weighted nearest neighbour classifier, and we show that its corresponding weights are somewhat suboptimal when $d$ is small (in particular, worse than those of the unweighted $k$-nearest neighbour classifier when $d=1$), but are close to optimal when $d$ is large. Finally, we argue that improvements in the rate of convergence are possible under stronger smoothness assumptions, provided we allow negative weights. Our findings are supported by an empirical performance comparison on both simulated and real data sets.
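
As a concrete illustration, the sketch below implements a generic weighted nearest neighbour classifier in Python, together with two weight profiles: the asymptotically optimal nonnegative weights and the geometric weights associated with the bagged nearest neighbour classifier. The abstract states only that the optimal weights have "a rather simple form"; the explicit formulas used here, and all function names, are illustrative assumptions rather than quotations from the paper.

import numpy as np

def optimal_wnn_weights(n, d, k_star):
    # Assumed form of the asymptotically optimal nonnegative weights:
    #   w_i = (1/k*) * [1 + d/2 - (d / (2 k*^{2/d})) * (i^{1+2/d} - (i-1)^{1+2/d})]
    # for i = 1, ..., k*, and w_i = 0 for i > k*. These weights sum to 1.
    i = np.arange(1, k_star + 1)
    a = 1.0 + 2.0 / d
    core = (1.0 / k_star) * (
        1.0 + d / 2.0
        - (d / (2.0 * k_star ** (2.0 / d))) * (i ** a - (i - 1) ** a)
    )
    w = np.zeros(n)
    w[:k_star] = core
    return w

def bagged_nn_weights(n, q):
    # Assumed geometric weight profile induced by bagging the 1-nearest
    # neighbour classifier with resampling fraction q; these weights sum
    # to 1 - (1 - q)^n, which is close to 1 for moderate n.
    i = np.arange(1, n + 1)
    return q * (1.0 - q) ** (i - 1)

def wnn_classify(X_train, y_train, x, w):
    # Binary labels in {0, 1}: sort training points by Euclidean distance
    # to x, form the weighted vote for class 1, and classify as 1 if the
    # vote is at least 1/2.
    dists = np.linalg.norm(X_train - x, axis=1)
    order = np.argsort(dists)
    score = float(np.dot(w, (y_train[order] == 1)))
    return int(score >= 0.5)

# Toy usage: d = 4, the dimension at which the abstract reports the
# largest improvement over the unweighted k-nearest neighbour rule.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)
w = optimal_wnn_weights(n=200, d=4, k_star=20)
print(wnn_classify(X, y, np.zeros(4), w))

Setting w to the vector with k entries equal to 1/k (and the rest zero) recovers the unweighted $k$-nearest neighbour classifier, so the same routine covers all three classifiers compared in the paper.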

Citation

Richard J. Samworth. "Optimal weighted nearest neighbour classifiers." Ann. Statist. 40(5): 2733-2763, October 2012. https://doi.org/10.1214/12-AOS1049

Information

Published: October 2012
First available in Project Euclid: 4 February 2013

zbMATH: 1373.62317
MathSciNet: MR3097618
Digital Object Identifier: 10.1214/12-AOS1049

Subjects:
Primary: 62G20

Keywords: bagging, classification, nearest neighbours, weighted nearest neighbour classifiers

Rights: Copyright © 2012 Institute of Mathematical Statistics
