Open Access
February 2004
Statistical behavior and consistency of classification methods based on convex risk minimization
Tong Zhang
Ann. Statist. 32(1): 56-85 (February 2004). DOI: 10.1214/aos/1079120130

Abstract

We study how closely the optimal Bayes error rate can be approached using a classification algorithm that computes a classifier by minimizing a convex upper bound of the classification error function. The measurement of closeness is characterized by the loss function used in the estimation. We show that such a classification scheme can generally be regarded as a (non-maximum-likelihood) conditional in-class probability estimate, and we use this analysis to compare various convex loss functions that have appeared in the literature. Furthermore, the theoretical insight allows us to design good loss functions with desirable properties. Another aspect of our analysis is to demonstrate the consistency of certain classification methods using convex risk minimization. This study sheds light on the good performance of some recently proposed linear classification methods, including boosting and support vector machines. It also shows their limitations and suggests possible improvements.
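As a concrete illustration of the probability-estimation view described in the abstract (a minimal sketch, not code from the paper), the snippet below numerically minimizes the pointwise conditional risk eta*phi(f) + (1-eta)*phi(-f) for several convex surrogate losses and inverts the standard, well-known link functions to recover the conditional in-class probability eta. All names and the choice of losses are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Convex surrogates phi(f) for the 0-1 loss, evaluated at the margin y*f.
losses = {
    "logistic":    lambda f: np.log1p(np.exp(-f)),
    "exponential": lambda f: np.exp(-f),          # loss minimized by AdaBoost
    "squared":     lambda f: (1.0 - f) ** 2,      # least squares on the margin
}

def pointwise_minimizer(phi, eta):
    """Minimize the conditional phi-risk  eta*phi(f) + (1-eta)*phi(-f)  over f."""
    risk = lambda f: eta * phi(f) + (1.0 - eta) * phi(-f)
    return minimize_scalar(risk, bounds=(-20, 20), method="bounded").x

# Standard inverse links mapping the population minimizer f* back to eta
# (textbook results, stated here as assumptions rather than quoted from the paper):
#   logistic:    f* = logit(eta)        =>  eta = sigmoid(f*)
#   exponential: f* = 0.5 * logit(eta)  =>  eta = 1 / (1 + exp(-2 f*))
#   squared:     f* = 2*eta - 1         =>  eta = (f* + 1) / 2
inverse_links = {
    "logistic":    lambda f: 1.0 / (1.0 + np.exp(-f)),
    "exponential": lambda f: 1.0 / (1.0 + np.exp(-2.0 * f)),
    "squared":     lambda f: (f + 1.0) / 2.0,
}

for eta in (0.1, 0.3, 0.7, 0.9):
    for name, phi in losses.items():
        f_star = pointwise_minimizer(phi, eta)
        eta_hat = inverse_links[name](f_star)
        print(f"eta={eta:.1f}  {name:11s}  f*={f_star:+.3f}  recovered eta={eta_hat:.3f}")
```

For each loss, the recovered eta matches the true conditional probability, which is the sense in which minimizing a convex surrogate acts as a (non-maximum-likelihood) conditional probability estimator.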

Citation

Tong Zhang. "Statistical behavior and consistency of classification methods based on convex risk minimization." Ann. Statist. 32(1): 56-85, February 2004. https://doi.org/10.1214/aos/1079120130

Information

Published: February 2004
First available in Project Euclid: 12 March 2004

zbMATH: 1105.62323
MathSciNet: MR2051001
Digital Object Identifier: 10.1214/aos/1079120130

Subjects:
Primary: 62G05, 68T05, 62H30

Keywords: boosting, classification, consistency, kernel methods, large margin methods

Rights: Copyright © 2004 Institute of Mathematical Statistics

Vol. 32 • No. 1 • February 2004