The Annals of Statistics

Local Rademacher complexities

Peter L. Bartlett, Olivier Bousquet, and Shahar Mendelson
Source: Ann. Statist. Volume 33, Number 4 (2005), 1497-1537.

Abstract

We propose new bounds on the error of learning algorithms in terms of a data-dependent notion of complexity. The estimates we establish give optimal rates and are based on a local and empirical version of Rademacher averages, in the sense that the Rademacher averages are computed from the data, on a subset of functions with small empirical error. We present some applications to classification and prediction with convex function classes, and with kernel classes in particular.
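As a rough illustration of the quantity described in the abstract, the following minimal sketch estimates an empirical Rademacher average that is "local" in the sense that the supremum is taken only over functions with small empirical error. It is not the paper's construction: the finite candidate class, the input layout `losses[j][i]`, the threshold `radius`, the Monte Carlo averaging over random signs, and the function name `local_empirical_rademacher` are all assumptions made here for illustration. The paper's exact definitions and constants are not reproduced.

```python
import random


def local_empirical_rademacher(losses, radius, n_draws=2000, seed=0):
    """Monte Carlo estimate of a localized empirical Rademacher average.

    losses[j][i] is the loss of candidate function j on sample point i
    (a hypothetical layout for a finite function class).
    """
    n = len(losses[0])
    # Localization step: keep only functions with empirical error <= radius.
    local = [row for row in losses if sum(row) / n <= radius]
    if not local:
        return 0.0  # convention chosen here for an empty subset
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # Independent Rademacher signs, one per sample point.
        sigma = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        # Supremum over the retained functions of (1/n) * sum_i sigma_i * f(x_i).
        total += max(sum(s * v for s, v in zip(sigma, row)) / n for row in local)
    return total / n_draws


if __name__ == "__main__":
    # Toy loss matrix: three candidate functions evaluated on four points.
    toy = [[0.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
    print(local_empirical_rademacher(toy, radius=0.3))
    print(local_empirical_rademacher(toy, radius=1.0))
```

With a fixed seed, shrinking `radius` can only shrink the retained subset, so the estimate is non-increasing as the radius decreases. This is the intuition behind localization: the complexity is measured only over the functions an algorithm is likely to select.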

Primary Subjects: 62G08, 68Q32
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1123250221
Digital Object Identifier: doi:10.1214/009053605000000282
Mathematical Reviews number (MathSciNet): MR2166554
Zentralblatt MATH identifier: 1083.62034

2012 © Institute of Mathematical Statistics
