Source: Bernoulli, Volume 16, Number 3 (2010), 605–613.
We present an argument based on the multidimensional and the uniform central limit theorems, proving that, under some geometrical assumptions between the target function T and the learning class F, the excess risk of the empirical risk minimization algorithm is lower bounded by
,
where (G_q)_{q∈Q} is a canonical Gaussian process associated with Q (a well-chosen subset of F) and δ is a parameter governing the oscillations of the empirical excess risk function over a small ball in F.
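For readers unfamiliar with the terminology, the two objects the abstract refers to can be written out in the standard way. This is only a sketch under assumed notation: the loss ℓ, the risk R, the sample (X_i, Y_i) of size n, and the minimizer \hat{f} are not named in the abstract and may differ from the paper's own conventions.

```latex
% Assumed notation (not taken from the abstract): loss \ell, sample size n,
% i.i.d. sample (X_1, Y_1), ..., (X_n, Y_n), risk R.
\[
  \hat{f} \in \operatorname*{arg\,min}_{f \in F}
    \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f(X_i), Y_i\bigr)
  \qquad \text{(empirical risk minimizer over the class } F\text{)}
\]
\[
  R(f) = \mathbb{E}\,\ell\bigl(f(X), Y\bigr),
  \qquad
  \text{excess risk of } \hat{f} \;=\; R(\hat{f}) - \inf_{f \in F} R(f).
\]
```

A lower bound of the kind announced in the abstract says that the second quantity cannot be smaller than the stated expression, under the geometric assumptions relating T and F.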