The Annals of Statistics

Square root penalty: Adaptation to the margin in classification and in edge estimation

A. B. Tsybakov and S. A. van de Geer


Abstract

We consider the problem of adaptation to the margin in binary classification. We suggest a penalized empirical risk minimization classifier that adaptively attains, up to a logarithmic factor, fast optimal rates of convergence for the excess risk, that is, rates that can be faster than $n^{-1/2}$, where $n$ is the sample size. We show that our method also gives adaptive estimators for the problem of edge estimation.

Article information

Source
Ann. Statist., Volume 33, Number 3 (2005), 1203-1224.

Dates
First available in Project Euclid: 1 July 2005

Permanent link to this document
https://projecteuclid.org/euclid.aos/1120224100

Digital Object Identifier
doi:10.1214/009053604000001066

Mathematical Reviews number (MathSciNet)
MR2195633

Zentralblatt MATH identifier
1080.62047

Subjects
Primary: 62G07: Density estimation
Secondary:
62G08: Nonparametric regression
62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]
68T10: Pattern recognition, speech recognition {For cluster analysis, see 62H30}

Keywords
Binary classification; edge estimation; adaptation; margin; penalized classification rule; square root penalty; sparsity; block thresholding

Citation

Tsybakov, A. B.; van de Geer, S. A. Square root penalty: Adaptation to the margin in classification and in edge estimation. Ann. Statist. 33 (2005), no. 3, 1203--1224. doi:10.1214/009053604000001066. https://projecteuclid.org/euclid.aos/1120224100


