Bernoulli

Sharper lower bounds on the performance of the empirical risk minimization algorithm

Guillaume Lecué and Shahar Mendelson

Full-text: Open access

Abstract

We present an argument based on the multidimensional and the uniform central limit theorems, proving that, under some geometrical assumptions between the target function T and the learning class F, the excess risk of the empirical risk minimization algorithm is lower bounded by

\[\frac{\mathbb{E}\sup_{q\in Q}G_{q}}{\sqrt{n}}\,\delta,\]

where \((G_{q})_{q\in Q}\) is a canonical Gaussian process associated with Q (a well-chosen subset of F) and δ is a parameter governing the oscillations of the empirical excess risk function over a small ball in F.
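
To give a feel for the size of the quantity in the bound, the following is a minimal numerical sketch, not the authors' method. It assumes Q is a finite set of vectors in R^d and takes the canonical Gaussian process to be G_q = ⟨g, q⟩ with g a standard Gaussian vector. The function names, the example set Q, and the values of n and δ are all hypothetical, and any absolute constants in the actual theorem are ignored.

```python
import math
import random


def expected_sup_gaussian(points, trials=20000, seed=0):
    """Monte Carlo estimate of E sup_{q in Q} G_q for a finite Q in R^d.

    Assumption: G_q = <g, q>, where g is a standard Gaussian vector in R^d.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    total = 0.0
    for _ in range(trials):
        # Draw one realization of the Gaussian vector g.
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Supremum of the process over the finite index set Q.
        total += max(sum(gi * qi for gi, qi in zip(g, q)) for q in points)
    return total / trials


def lower_bound(points, n, delta, **kwargs):
    """Evaluate (E sup_{q in Q} G_q / sqrt(n)) * delta for given n and delta."""
    return expected_sup_gaussian(points, **kwargs) / math.sqrt(n) * delta


# Hypothetical example: Q = {+1, -1} in R^1, so sup_q G_q = |g|
# and the exact value of E sup_q G_q is sqrt(2 / pi).
Q = [(1.0,), (-1.0,)]
estimate = expected_sup_gaussian(Q)
bound = lower_bound(Q, n=100, delta=0.5)
```

For this two-point Q the estimate can be checked against the closed form E|g| = sqrt(2/π); for richer sets Q no such closed form is generally available and the Monte Carlo error should be assessed separately.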

Article information

Source
Bernoulli, Volume 16, Number 3 (2010), 605-613.

Dates
First available in Project Euclid: 6 August 2010

Permanent link to this document
https://projecteuclid.org/euclid.bj/1281099877

Digital Object Identifier
doi:10.3150/09-BEJ225

Mathematical Reviews number (MathSciNet)
MR2730641

Zentralblatt MATH identifier
1220.62007

Keywords
empirical risk minimization; learning theory; lower bound; multidimensional central limit theorem; uniform central limit theorem

Citation

Lecué, Guillaume; Mendelson, Shahar. Sharper lower bounds on the performance of the empirical risk minimization algorithm. Bernoulli 16 (2010), no. 3, 605--613. doi:10.3150/09-BEJ225. https://projecteuclid.org/euclid.bj/1281099877


References

  • [1] Bartlett, P.L. and Mendelson, S. (2006). Empirical minimization. Probab. Theory Related Fields 135 311–334.
  • [2] Dudley, R.M. (1999). Uniform Central Limit Theorems. Cambridge Studies in Advanced Mathematics 63. Cambridge: Cambridge Univ. Press.
  • [3] Koltchinskii, V. (2006). Local Rademacher complexities and oracle inequalities in risk minimization. Ann. Statist. 34 2593–2656.
  • [4] Lecué, G. (2007). Suboptimality of penalized empirical risk minimization in classification. In 20th Annual Conference On Learning Theory, COLT07 (G. Bshouty, ed.). LNAI 4539 142–156. Berlin: Springer.
  • [5] Lee, W.S., Bartlett, P.L. and Williamson, R.C. (1998). The importance of convexity in learning with squared loss. IEEE Trans. Inform. Theory 44 1974–1980.
  • [6] Massart, P. and Nédélec, É. (2006). Risk bounds for statistical learning. Ann. Statist. 34 2326–2366.
  • [7] Mendelson, S. (2008). Lower bounds for the empirical minimization algorithm. IEEE Trans. Inform. Theory 54 3797–3803.
  • [8] Talagrand, M. (2005). The Generic Chaining. Springer Monographs in Mathematics. Berlin: Springer-Verlag.
  • [9] van der Vaart, A.W. and Wellner, J.A. (1996). Weak Convergence and Empirical Processes. Springer Series in Statistics. New York: Springer-Verlag.
  • [10] Vapnik, V.N. (1998). Statistical Learning Theory. Adaptive and Learning Systems for Signal Processing, Communications, and Control. New York: Wiley.