Open Access
Bounding the generalization error of convex combinations of classifiers: balancing the dimensionality and the margins
Vladimir Koltchinskii, Dmitriy Panchenko, Fernando Lozano
Ann. Appl. Probab. 13(1): 213-252 (January 2003). DOI: 10.1214/aoap/1042765667

Abstract

A problem of bounding the generalization error of a classifier $f \in \mathrm{conv}(\mathcal{H})$, where $\mathcal{H}$ is a "base" class of functions (classifiers), is considered. This problem frequently occurs in machine learning, where efficient algorithms that combine simple classifiers into a complex one (such as boosting and bagging) have attracted a lot of attention. Using Talagrand's concentration inequalities for empirical processes, we obtain new, sharper bounds on the generalization error of combined classifiers that take into account both the empirical distribution of "classification margins" and an "approximate dimension" of the classifiers, and we study the performance of these bounds in several experiments with learning algorithms.
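The abstract's two key ingredients, the empirical distribution of classification margins $y f(x)$ and a complexity term that trades off against the margin, can be made concrete with a short sketch. The Python example below is an illustration only, not the paper's actual bound: it fits AdaBoost, computes the empirical margins of the combined classifier, and evaluates a generic Schapire-style margin bound of the form $P_n(\text{margin} \le \delta) + C\sqrt{V/n}/\delta$. The complexity value V and the constant C are hypothetical placeholders standing in for the "approximate dimension" quantities developed in the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Illustrative sketch (not the paper's bound): the empirical margin
# distribution of a convex combination f in conv(H) built by boosting,
# compared against a generic margin-based bound of the form
#   P(y f(x) <= 0) <= P_n(y f(x) <= delta) + C * sqrt(V / n) / delta,
# where V plays the role of a complexity ("dimension") parameter.

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
y_pm = 2 * y - 1  # relabel {0, 1} -> {-1, +1}

clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)

# Rescale the combined score so f takes values in [-1, 1]; the raw range
# of decision_function depends on the boosting variant sklearn uses.
scores = clf.decision_function(X)
scores = scores / np.abs(scores).max()
margins = y_pm * scores  # "classification margins" y * f(x)

n = len(margins)
V, C = 10.0, 1.0  # placeholder complexity and constant (assumptions)
for delta in (0.05, 0.1, 0.2, 0.5):
    empirical = np.mean(margins <= delta)    # margin-distribution term
    complexity = C * np.sqrt(V / n) / delta  # dimension-vs-margin term
    print(f"delta={delta:.2f}  P_n(margin<=delta)={empirical:.3f}  "
          f"bound={empirical + complexity:.3f}")
```

Increasing δ shrinks the complexity term but inflates the empirical margin term; the bounds in the paper formalize how to balance these two effects, hence the "balancing the dimensionality and the margins" of the title.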

Citation


Vladimir Koltchinskii, Dmitriy Panchenko, Fernando Lozano. "Bounding the generalization error of convex combinations of classifiers: balancing the dimensionality and the margins." Ann. Appl. Probab. 13 (1): 213-252, January 2003. https://doi.org/10.1214/aoap/1042765667

Information

Published: January 2003
First available in Project Euclid: 16 January 2003

zbMATH: 1073.62535
MathSciNet: MR1951998
Digital Object Identifier: 10.1214/aoap/1042765667

Subjects:
Primary: 62G05
Secondary: 60F15, 62G20

Keywords: approximate dimension, bagging, boosting, combined classifier, concentration inequalities, empirical process, generalization error, margin, Rademacher process, random entropies

Rights: Copyright © 2003 Institute of Mathematical Statistics
