Electronic Journal of Statistics

Improved classification rates under refined margin conditions

Ingrid Blaschzyk and Ingo Steinwart

Abstract

In this paper we present a simple partitioning-based technique to refine the statistical analysis of classification algorithms. The core of our idea is to split the input space into two parts such that the first part contains a suitable vicinity around the decision boundary, while the second part is sufficiently far away from it. Using a set of margin conditions we are then able to control the classification error on both parts separately. By balancing these two error terms we obtain a refined error analysis in a final step. We apply this general idea to the histogram rule and show that even for this simple method we obtain, under certain assumptions, better rates than those known for support vector machines, for certain plug-in classifiers, and for a recently analyzed tree-based adaptive-partitioning approach. Moreover, we show that a margin condition that relates the critical noise to the decision boundary makes it possible to improve upon the optimal rates proven for distributions that do not satisfy this condition.
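
The construction in the abstract is concrete enough for a small numerical illustration. The Python sketch below is a hypothetical toy example, not the authors' construction: it fits the classical histogram rule (majority vote over the cells of a cubic grid) on a synthetic distribution whose decision boundary is the line {x1 = 0.5}, and then reports the test error separately on a vicinity of the boundary and on its complement, mirroring the two-part split of the input space. The regression function eta, the cell width 0.05, and the vicinity radius 0.1 are illustrative choices.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy illustration (not from the paper): eta(x) = P(Y = 1 | X = x)
    # crosses 1/2 exactly on the decision boundary {x1 = 0.5}.
    def eta(X):
        return 0.5 + 0.5 * np.tanh(8.0 * (X[:, 0] - 0.5))

    def sample(n):
        X = rng.random((n, 2))
        y = (rng.random(n) < eta(X)).astype(int)
        return X, y

    def histogram_rule(X_tr, y_tr, width):
        # Majority vote over the cells of a cubic grid with cell width `width`.
        cells = lambda X: [tuple(c) for c in np.floor(X / width).astype(int)]
        counts = {}
        for c, label in zip(cells(X_tr), y_tr):
            ones, total = counts.get(c, (0, 0))
            counts[c] = (ones + label, total + 1)
        def predict(X):
            # Cells without training points default to label 0.
            return np.array([int(2 * counts.get(c, (0, 1))[0] > counts.get(c, (0, 1))[1])
                             for c in cells(X)])
        return predict

    X_tr, y_tr = sample(20000)
    X_te, y_te = sample(20000)
    errors = histogram_rule(X_tr, y_tr, width=0.05)(X_te) != y_te

    # Evaluate the error separately near and far from the (here known)
    # decision boundary, echoing the paper's two-part error decomposition.
    near = np.abs(X_te[:, 0] - 0.5) < 0.1
    print("error near the boundary:", errors[near].mean())
    print("error far from the boundary:", errors[~near].mean())

In such an experiment the error far from the boundary decays quickly with the sample size, while the error near the boundary dominates; the margin conditions introduced in the paper are what control this trade-off.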

Article information

Source
Electron. J. Statist., Volume 12, Number 1 (2018), 793-823.

Dates
Received: October 2016
First available in Project Euclid: 3 March 2018

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1520046229

Digital Object Identifier
doi:10.1214/18-EJS1406

Mathematical Reviews number (MathSciNet)
MR3770888

Zentralblatt MATH identifier
06864477

Subjects
Primary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]
Secondary: 62G20: Asymptotic properties; 68T05: Learning and adaptive systems [See also 68Q32, 91E40]

Keywords
Statistical learning; classification; excess risk; fast rates of convergence; histogram rule

Rights
Creative Commons Attribution 4.0 International License.

Citation

Blaschzyk, Ingrid; Steinwart, Ingo. Improved classification rates under refined margin conditions. Electron. J. Statist. 12 (2018), no. 1, 793--823. doi:10.1214/18-EJS1406. https://projecteuclid.org/euclid.ejs/1520046229

