Open Access
February 2002
Empirical Margin Distributions and Bounding the Generalization Error of Combined Classifiers
V. Koltchinskii, D. Panchenko
Ann. Statist. 30(1): 1-50 (February 2002). DOI: 10.1214/aos/1015362183

Abstract

We prove new probabilistic upper bounds on the generalization error of complex classifiers that are combinations of simple classifiers. Such combinations can be implemented by neural networks or by voting methods for combining classifiers, such as boosting and bagging. The bounds are stated in terms of the empirical distribution of the margin of the combined classifier. They are based on methods from the theory of Gaussian and empirical processes (comparison inequalities, the symmetrization method, concentration inequalities), and they improve previous results of Bartlett (1998) on bounding the generalization error of neural networks in terms of the $\ell_1$-norms of the weights of neurons and of Schapire, Freund, Bartlett and Lee (1998) on bounding the generalization error of boosting. We also obtain rates of convergence in Lévy distance of the empirical margin distribution to the true margin distribution, uniformly over classes of classifiers, and prove the optimality of these rates.
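
As an illustration only, the empirical margin distribution mentioned in the abstract can be computed for a small voting classifier. The sketch below is not taken from the paper: the function names `margins` and `empirical_margin_cdf`, the three decision stumps, and the toy data are all hypothetical. It assumes binary labels in {-1, +1}, base classifiers with values in {-1, +1}, and a combined classifier f(x) = sum_j w_j h_j(x) whose weights are rescaled to unit $\ell_1$-norm, so that the margin y f(x) lies in [-1, 1].

```python
from typing import Callable, List, Sequence


def margins(
    weights: Sequence[float],
    classifiers: Sequence[Callable[[float], int]],
    xs: Sequence[float],
    ys: Sequence[int],
) -> List[float]:
    """Return the margins y_i * f(x_i) of the weighted vote f = sum_j w_j h_j.

    The weights are rescaled to unit l1-norm (an assumption of this sketch).
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("at least one weight must be nonzero")
    result = []
    for x, y in zip(xs, ys):
        f_x = sum(w * h(x) for w, h in zip(weights, classifiers)) / l1_norm
        result.append(y * f_x)
    return result


def empirical_margin_cdf(sample_margins: Sequence[float], delta: float) -> float:
    """Fraction of training examples whose margin is at most delta."""
    return sum(1 for m in sample_margins if m <= delta) / len(sample_margins)


# Hypothetical base classifiers: three decision stumps on the real line.
stumps = [
    lambda x: 1 if x > 0 else -1,
    lambda x: 1 if x > 1 else -1,
    lambda x: 1 if x > -1 else -1,
]
stump_weights = [0.5, 0.25, 0.25]

# Hypothetical toy training sample.
sample_x = [-2.0, -0.5, 0.5, 2.0]
sample_y = [-1, -1, 1, 1]

sample_margins = margins(stump_weights, stumps, sample_x, sample_y)
for delta in (0.0, 0.5, 1.0):
    print(delta, empirical_margin_cdf(sample_margins, delta))
```

At delta = 0 the function returns the training error of the vote; larger values of delta count the examples that are classified correctly but with a small margin, which is the quantity that margin-based bounds trade off against a complexity term.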

Citation

V. Koltchinskii, D. Panchenko. "Empirical Margin Distributions and Bounding the Generalization Error of Combined Classifiers." Ann. Statist. 30(1): 1-50, February 2002. https://doi.org/10.1214/aos/1015362183

Information

Published: February 2002
First available in Project Euclid: 5 March 2002

zbMATH: 1012.62004
MathSciNet: MR1892654
Digital Object Identifier: 10.1214/aos/1015362183

Subjects:
Primary: 62G05
Secondary: 60F15, 62G20

Keywords: boosting, combined classifier, concentration inequalities, empirical process, Gaussian process, generalization error, margin, neural network, Rademacher process

Rights: Copyright © 2002 Institute of Mathematical Statistics

Vol. 30 • No. 1 • February 2002