Bernoulli


Support vector machines with a reject option

Marten Wegkamp and Ming Yuan

Full-text: Open access

Abstract

This paper studies ℓ1 regularization with high-dimensional features for support vector machines with a built-in reject option (meaning that the decision of classifying an observation can be withheld at a cost lower than that of misclassification). The procedure can be conveniently implemented as a linear program and computed using standard software. We prove that the minimizer of the penalized population risk favors sparse solutions and show that the behavior of the empirical risk minimizer mimics that of the population risk minimizer. We also introduce a notion of classification complexity and prove that our minimizers adapt to the unknown complexity. Using a novel oracle inequality for the excess risk, we identify situations where fast rates of convergence occur.
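To illustrate the "linear program computed using standard software" claim, the following is a minimal sketch of an ℓ1-penalized empirical risk minimizer under the generalized hinge loss of Bartlett and Wegkamp [1], φ_d(z) = max(0, 1 − z, 1 − ((1−d)/d)z) with rejection cost d ∈ (0, 1/2]. Since the loss is a pointwise maximum of affine functions, the fit is a linear program. The function names, the SciPy-based encoding, and the rejection threshold τ = 1/2 are illustrative assumptions, not the paper's own implementation.

```python
import numpy as np
from scipy.optimize import linprog

def reject_svm_lp(X, y, d=0.3, lam=0.1):
    """Minimize (1/n) sum_i phi_d(y_i f(x_i)) + lam * ||w||_1 over
    f(x) = w.x + b, posed as an LP (illustrative sketch).
    Variables: w = u - v with u, v >= 0; intercept b; slacks xi,
    where xi_i >= max(0, 1 - y_i f_i, 1 - s * y_i f_i), s = (1-d)/d."""
    n, p = X.shape
    s = (1.0 - d) / d                      # slope of the loss on (-inf, 0)
    # variable order: [u (p), v (p), b (1), xi (n)]
    c = np.concatenate([lam * np.ones(2 * p), [0.0], np.ones(n) / n])
    Z = y[:, None] * X                     # rows y_i * x_i
    # xi_i >= 1 - y_i f_i   <=>  -Z u + Z v - y b - xi <= -1
    A1 = np.hstack([-Z, Z, -y[:, None], -np.eye(n)])
    # xi_i >= 1 - s * y_i f_i
    A2 = np.hstack([-s * Z, s * Z, -s * y[:, None], -np.eye(n)])
    A_ub = np.vstack([A1, A2])
    b_ub = -np.ones(2 * n)
    bounds = [(0, None)] * (2 * p) + [(None, None)] + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    u, v = res.x[:p], res.x[p:2 * p]
    return u - v, res.x[2 * p]             # sparse w (via u - v), intercept b

def predict_with_reject(X, w, b, tau=0.5):
    """Report sign(f(x)) when |f(x)| > tau; 0 encodes 'reject'."""
    f = X @ w + b
    out = np.sign(f)
    out[np.abs(f) <= tau] = 0
    return out
```

The ℓ1 penalty enters linearly through the split w = u − v, so any LP solver applies; rejecting when |f(x)| ≤ 1/2 follows the rule analyzed in [1].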

Article information

Source
Bernoulli, Volume 17, Number 4 (2011), 1368-1385.

Dates
First available in Project Euclid: 4 November 2011

Permanent link to this document
https://projecteuclid.org/euclid.bj/1320417508

Digital Object Identifier
doi:10.3150/10-BEJ320

Mathematical Reviews number (MathSciNet)
MR2854776

Zentralblatt MATH identifier
1243.68256

Keywords
adaptive prediction; classification with a reject option; lasso; oracle inequalities; sparsity; support vector machines; statistical learning

Citation

Wegkamp, Marten; Yuan, Ming. Support vector machines with a reject option. Bernoulli 17 (2011), no. 4, 1368--1385. doi:10.3150/10-BEJ320. https://projecteuclid.org/euclid.bj/1320417508


References

  • [1] Bartlett, P.L. and Wegkamp, M.H. (2008). Classification with a reject option using a hinge loss. J. Mach. Learn. Res. 9 1823–1840.
  • [2] Bickel, P.J., Ritov, Y. and Tsybakov, A.B. (2009). Simultaneous analysis of Lasso and Dantzig selector. Ann. Statist. 37 1705–1732.
  • [3] Devroye, L. and Lugosi, G. (2000). Combinatorial Methods in Density Estimation. New York: Springer.
  • [4] Hastie, T., Tibshirani, R. and Friedman, J. (2001). The Elements of Statistical Learning. New York: Springer.
  • [5] Herbei, R. and Wegkamp, M.H. (2006). Classification with reject option. Canad. J. Statist. 34 709–721.
  • [6] Koltchinskii, V. (2009). Sparsity in penalized empirical risk minimization. Ann. Inst. H. Poincaré Probab. Statist. 45 7–57.
  • [7] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces. New York: Springer.
  • [8] Tarigan, B. and van de Geer, S.A. (2006). Classifiers of support vector machine type with ℓ1 complexity regularization. Bernoulli 12 1045–1076.
  • [9] Tsybakov, A.B. (2004). Optimal aggregation of classifiers in statistical learning. Ann. Statist. 32 135–166.
  • [10] van de Geer, S.A. (2000). Empirical Processes in M-estimation. Cambridge: Cambridge Univ. Press.
  • [11] Wegkamp, M.H. (2007). Lasso type classifiers with a reject option. Electron. J. Statist. 1 155–168.
  • [12] Yuan, M. and Wegkamp, M.H. (2010). Classification methods with reject option based on convex risk minimization. J. Mach. Learn. Res. 11 111–130.