Concentration inequalities and asymptotic results for ratio type empirical processes



The Annals of Probability

Evarist Giné and Vladimir Koltchinskii

Source: Ann. Probab. Volume 34, Number 3 (2006), 1143-1216.

Abstract

Let $\mathcal{F}$ be a class of measurable functions on a measurable space $(S,\mathcal{S})$ with values in $[0,1]$ and let

\[P_{n}=n^{-1}\sum_{i=1}^{n}\delta_{X_{i}}\]

be the empirical measure based on an i.i.d. sample $(X_{1},\ldots,X_{n})$ from a probability distribution $P$ on $(S,\mathcal{S})$. We study the behavior of suprema of the following type:

\[\sup_{r_{n}<\sigma_{P}f\leq \delta_{n}}\frac{|P_{n}f-Pf|}{\phi(\sigma_{P}f)},\]

where $\sigma_{P}f\geq \mathrm{Var}_{P}^{1/2}f$ and $\phi$ is a continuous, strictly increasing function with $\phi(0)=0$. Using Talagrand’s concentration inequality for empirical processes, we establish concentration inequalities for such suprema and use them to derive several results about their asymptotic behavior, expressing the conditions in terms of expectations of localized suprema of empirical processes. We also prove new bounds for expected values of sup-norms of empirical processes in terms of the largest $\sigma_{P}f$ and the $L_{2}(P)$ norm of the envelope of the function class, which are especially suited for estimating localized suprema. With this technique, we extend to function classes most of the known results on ratio type suprema of empirical processes, including some of Alexander’s results for VC classes of sets. We also consider applications of these results to several important problems in nonparametric statistics and in learning theory (including general excess risk bounds in empirical risk minimization and their versions for $L_{2}$-regression and classification and ratio type bounds for margin distributions in classification).
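
As an illustration only (it is not taken from the paper), the ratio type supremum above can be approximated numerically in the simplest classical setting. The sketch below assumes that P is the uniform distribution on [0,1], that the class consists of the indicators of the intervals [0,t], so that Pf = t and σ_P f can be taken equal to (t(1−t))^{1/2}, and that ϕ(σ) = σ. The function name `ratio_sup`, the grid over t and the parameter values are hypothetical choices made for this example.

```python
import bisect
import math
import random


def ratio_sup(sample, r_n, delta_n, phi=lambda s: s, grid_size=2000):
    """Grid approximation of sup |P_n f - P f| / phi(sigma_P f).

    Assumptions (illustrative, not from the paper): P = Uniform[0,1],
    f = indicator of [0, t], sigma_P f = sqrt(t * (1 - t)), and the
    supremum is taken over grid points t with r_n < sigma_P f <= delta_n.
    """
    xs = sorted(sample)
    n = len(xs)
    best = None
    for k in range(1, grid_size + 1):
        t = k / (grid_size + 1)              # interior grid point, P f = t
        sigma = math.sqrt(t * (1.0 - t))     # standard deviation of f under P
        if not (r_n < sigma <= delta_n):
            continue                         # outside the localization window
        p_n = bisect.bisect_right(xs, t) / n  # P_n f: fraction of points <= t
        ratio = abs(p_n - t) / phi(sigma)
        best = ratio if best is None else max(best, ratio)
    if best is None:
        raise ValueError("no grid point satisfies r_n < sigma <= delta_n")
    return best


random.seed(0)
n = 1000
sample = [random.random() for _ in range(n)]
value = ratio_sup(sample, r_n=n ** -0.5, delta_n=0.5)
print(value)
```

Here the lower cutoff r_n = n^{-1/2} and the upper cutoff δ_n = 1/2 are arbitrary choices; varying them, or replacing ϕ by another continuous, strictly increasing function vanishing at 0, shows how the normalization changes the size of the supremum.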

Primary Subjects: 60E15
Secondary Subjects: 60F17, 60F15, 62G08, 68T10
Keywords: Ratio type empirical processes; concentration inequalities; ratio limit theorems; localized sup-norms; weighted central limit theorems; VC type classes; moment bounds for empirical processes; nonparametric regression; classification

Full-text: Open access

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aop/1151418495
Digital Object Identifier: doi:10.1214/009117906000000070

References

Alexander, K. S. (1985). Rates of growth for weighted empirical processes. In Proc. of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer II (L. Le Cam and R. Olshen, eds.) 475--493. Wadsworth, Belmont, CA.
Mathematical Reviews (MathSciNet): MR0822047
Alexander, K. S. (1987). Rates of growth and sample moduli for weighted empirical processes indexed by sets. Probab. Theory Related Fields 75 379--423.
Mathematical Reviews (MathSciNet): MR0890285
Digital Object Identifier: doi:10.1007/BF00318708
Alexander, K. S. (1987). The central limit theorem for weighted empirical processes indexed by sets. J. Multivariate Anal. 22 313--339.
Mathematical Reviews (MathSciNet): MR0899666
Digital Object Identifier: doi:10.1016/0047-259X(87)90093-5
Alexander, K. S. (1987). The central limit theorem for empirical processes on Vapnik--Červonenkis classes. Ann. Probab. 15 178--203.
Mathematical Reviews (MathSciNet): MR0877597
Digital Object Identifier: doi:10.1214/aop/1176992263
Project Euclid: euclid.aop/1176992263
Bartlett, P., Bousquet, O. and Mendelson, S. (2002). Localized Rademacher complexities. Computational Learning Theory. Lecture Notes in Comput. Sci. 2375 44--58. Springer, Berlin.
Mathematical Reviews (MathSciNet): MR2040404
Zentralblatt MATH: 1050.68054
Digital Object Identifier: doi:10.1007/3-540-45435-7_4
Bartlett, P. and Lugosi, G. (1999). An inequality for uniform deviations of sample averages from their means. Statist. Probab. Lett. 44 55--62.
Mathematical Reviews (MathSciNet): MR1706315
Bartlett, P. and Mendelson, S. (2003). Empirical risk minimization. Preprint.
Birgé, L. and Massart, P. (1998). Rates of convergence for minimum contrast estimators. Bernoulli 4 329--375.
Mathematical Reviews (MathSciNet): MR1653272
Digital Object Identifier: doi:10.2307/3318720
Project Euclid: euclid.bj/1174324984
Bousquet, O. (2002). Concentration inequalities and empirical processes theory applied to the analysis of learning algorithms. Ph.D. thesis, Ecole Polytechnique, Paris.
Bousquet, O. (2003). Concentration inequalities for sub-additive functions using the entropy method. In Stochastic Inequalities and Applications 213--247. Birkhäuser, Basel.
Mathematical Reviews (MathSciNet): MR2073435
Zentralblatt MATH: 1037.60015
Bousquet, O., Koltchinskii, V. and Panchenko, D. (2002). Some local measures of complexity of convex hulls and generalization bounds. Computational Learning Theory. Lecture Notes in Comput. Sci. 2375 59--73. Springer, Berlin.
Mathematical Reviews (MathSciNet): MR2040405
Zentralblatt MATH: 1050.68055
Digital Object Identifier: doi:10.1007/3-540-45435-7_5
Čibisov, D. M. (1964). Some limit theorems on the limiting behavior of empirical distribution functions. Selected Translations Math. Statist. Probab. 6 147--156.
Csáki, E. (1977). The law of the iterated logarithm for normalized empirical distribution functions. Z. Wahrsch. Verw. Gebiete 38 147--167.
Mathematical Reviews (MathSciNet): MR0431350
Digital Object Identifier: doi:10.1007/BF00533305
Csörgő, M., Csörgő, S., Horváth, L. and Mason, D. (1986). Weighted empirical and quantile processes. Ann. Probab. 14 31--85.
Mathematical Reviews (MathSciNet): MR0815960
Digital Object Identifier: doi:10.1214/aop/1176992617
Project Euclid: euclid.aop/1176992617
de la Peña, V. and Giné, E. (1999). Decoupling: From Dependence to Independence. Springer, New York.
Mathematical Reviews (MathSciNet): MR1666908
Dudley, R. M. (1987). Universal Donsker classes and metric entropy. Ann. Probab. 20 1968--1982.
Mathematical Reviews (MathSciNet): MR1188050
Digital Object Identifier: doi:10.1214/aop/1176989537
Project Euclid: euclid.aop/1176989537
Dudley, R. M. (1999). Uniform Central Limit Theorems. Cambridge Univ. Press.
Mathematical Reviews (MathSciNet): MR1720712
Zentralblatt MATH: 0951.60033
Eicker, F. (1979). The asymptotic distribution of suprema of the standardized empirical process. Ann. Statist. 7 116--138.
Mathematical Reviews (MathSciNet): MR0515688
Digital Object Identifier: doi:10.1214/aos/1176344559
Project Euclid: euclid.aos/1176344559
Einmahl, J. H. J. (1996). Extension to higher dimensions of the Jaeschke--Eicker result on the standardized empirical process. Comm. Statist. Theory Methods 25 813--822.
Mathematical Reviews (MathSciNet): MR1380620
Digital Object Identifier: doi:10.1080/03610929608831733
Einmahl, U. and Mason, D. (2000). An empirical process approach to the uniform consistency of kernel type function estimators. J. Theor. Probab. 13 1--37.
Mathematical Reviews (MathSciNet): MR1744994
Digital Object Identifier: doi:10.1023/A:1007769924157
Giné, E. and Guillou, A. (2001). On consistency of kernel density estimators for randomly censored data: Rates holding uniformly over adaptive intervals. Ann. Inst. H. Poincaré Probab. Statist. 37 503--522.
Mathematical Reviews (MathSciNet): MR1876841
Digital Object Identifier: doi:10.1016/S0246-0203(01)01081-0
Giné, E., Koltchinskii, V. and Wellner, J. (2003). Ratio limit theorems for empirical processes. In Stochastic Inequalities and Applications 249--278. Birkhäuser, Basel.
Mathematical Reviews (MathSciNet): MR2073436
Giné, E., Latała, R. and Zinn, J. (2000). Exponential and moment inequalities for U-statistics. In High Dimensional Probability II (E. Giné, D. M. Mason and J. Wellner, eds.) 13--38. Birkhäuser, Boston.
Mathematical Reviews (MathSciNet): MR1857312
Jaeschke, D. (1979). The asymptotic distribution of the supremum of the standardized empirical distribution function on subintervals. Ann. Statist. 7 108--115.
Mathematical Reviews (MathSciNet): MR0515687
Digital Object Identifier: doi:10.1214/aos/1176344558
Project Euclid: euclid.aos/1176344558
Koltchinskii, V. (2003). Bounds on margin distributions in learning problems. Ann. Inst. H. Poincaré Probab. Statist. 39 943--978.
Mathematical Reviews (MathSciNet): MR2010392
Digital Object Identifier: doi:10.1016/S0246-0203(03)00023-2
Koltchinskii, V. (2006). Local Rademacher complexities and oracle inequalities in risk minimization. Ann. Statist. To appear.
Mathematical Reviews (MathSciNet): MR2329442
Digital Object Identifier: doi:10.1214/009053606000001019
Project Euclid: euclid.aos/1179935055
Koltchinskii, V. and Panchenko, D. (2000). Rademacher processes and bounding the risk of function learning. In High Dimensional Probability II (E. Giné, D. Mason and J. Wellner, eds.) 443--459. Birkhäuser, Boston.
Mathematical Reviews (MathSciNet): MR1857339
Zentralblatt MATH: 01552503
Koltchinskii, V. and Panchenko, D. (2002). Empirical margin distributions and bounding the generalization error of combined classifiers. Ann. Statist. 30 1--50.
Mathematical Reviews (MathSciNet): MR1892654
Project Euclid: euclid.aos/1015362183
Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces. Springer, Berlin.
Mathematical Reviews (MathSciNet): MR1102015
Zentralblatt MATH: 0748.60004
Mason, D., Shorack, G. R. and Wellner, J. (1983). Strong limit theorems for oscillation moduli of the uniform empirical process. Z. Wahrsch. Verw. Gebiete 65 83--97.
Mathematical Reviews (MathSciNet): MR0717935
Digital Object Identifier: doi:10.1007/BF00534996
Massart, P. (2000). Some applications of concentration inequalities in statistics. Ann. Fac. Sci. Toulouse Math. (6) 9 245--303.
Mathematical Reviews (MathSciNet): MR1813803
Massart, P. (2000). About the constants in Talagrand's concentration inequalities for empirical processes. Ann. Probab. 28 863--884.
Mathematical Reviews (MathSciNet): MR1782276
Digital Object Identifier: doi:10.1214/aop/1019160263
Project Euclid: euclid.aop/1019160263
Massart, P. (2005). Concentration inequalities with applications to model selection and statistical learning. Lecture on Probability Theory and Statistics. Ecole d'Eté de Probabilités de Saint-Flour XXXIV-2003. Lecture Notes in Math. To appear. Available at www.math.u-psud.fr/~massart/.
Mathematical Reviews (MathSciNet): MR1813803
Massart, P. and Nédélec, E. (2006). Risk bounds for statistical learning. Ann. Statist. 34(5).
Mathematical Reviews (MathSciNet): MR2291502
Digital Object Identifier: doi:10.1214/009053606000000786
Project Euclid: euclid.aos/1169571799
Mendelson, S. (2001). Learning relatively small classes. Computational Learning Theory. Lecture Notes in Comput. Sci. 2111 273--288. Springer, Berlin.
Mathematical Reviews (MathSciNet): MR2042041
Zentralblatt MATH: 0998.68068
Digital Object Identifier: doi:10.1007/3-540-44581-1_18
O'Reilly, N. E. (1974). On the weak convergence of empirical processes in sup-norm metrics. Ann. Probab. 2 642--651.
Mathematical Reviews (MathSciNet): MR0383486
Digital Object Identifier: doi:10.1214/aop/1176996610
Panchenko, D. (2002). Concentration inequalities in product spaces and applications to statistical learning theory. Ph.D. dissertation, Univ. New Mexico, Albuquerque.
Panchenko, D. (2002). Some extensions of an inequality of Vapnik and Chervonenkis. Electron. Comm. Probab. 7 55--65.
Mathematical Reviews (MathSciNet): MR1887174
Panchenko, D. (2003). Symmetrization approach to concentration inequalities for empirical processes. Ann. Probab. 31 2068--2081.
Mathematical Reviews (MathSciNet): MR2016612
Digital Object Identifier: doi:10.1214/aop/1068646378
Project Euclid: euclid.aop/1068646378
Pollard, D. (1984). Convergence of Stochastic Processes. Springer, New York.
Mathematical Reviews (MathSciNet): MR0762984
Zentralblatt MATH: 0544.60045
Shorack, G. R. and Wellner, J. A. (1982). Limit theorems and inequalities for the uniform empirical process indexed by intervals. Ann. Probab. 10 639--652.
Mathematical Reviews (MathSciNet): MR0659534
Digital Object Identifier: doi:10.1214/aop/1176993773
Project Euclid: euclid.aop/1176993773
Talagrand, M. (1994). Sharper bounds for Gaussian and empirical processes. Ann. Probab. 22 28--76.
Mathematical Reviews (MathSciNet): MR1258865
Digital Object Identifier: doi:10.1214/aop/1176988847
Project Euclid: euclid.aop/1176988847
Talagrand, M. (1996). New concentration inequalities in product spaces. Invent. Math. 126 505--563.
Mathematical Reviews (MathSciNet): MR1419006
Digital Object Identifier: doi:10.1007/s002220050108
Tsybakov, A. (2004). Optimal aggregation of classifiers in statistical learning. Ann. Statist. 32 135--166.
Mathematical Reviews (MathSciNet): MR2051002
Digital Object Identifier: doi:10.1214/aos/1079120131
Project Euclid: euclid.aos/1079120131
van de Geer, S. A. (1993). Hellinger-consistency of certain nonparametric maximum likelihood estimators. Ann. Statist. 21 14--44.
Mathematical Reviews (MathSciNet): MR1212164
Digital Object Identifier: doi:10.1214/aos/1176349013
Project Euclid: euclid.aos/1176349013
van de Geer, S. A. (2000). Applications of Empirical Process Theory. Cambridge Univ. Press.
Mathematical Reviews (MathSciNet): MR1739079
Zentralblatt MATH: 0953.62049
van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes. Springer, New York.
Mathematical Reviews (MathSciNet): MR1385671
Zentralblatt MATH: 0862.60002
Wellner, J. A. (1978). Limit theorems for the ratio of the empirical distribution function to the true distribution function. Z. Wahrsch. Verw. Gebiete 45 108--123.
Mathematical Reviews (MathSciNet): MR0651392
Yukich, J. E. (1987). Some limit theorems for empirical processes indexed by functions. Probab. Theory Related Fields 74 71--90.
Mathematical Reviews (MathSciNet): MR0863719
Digital Object Identifier: doi:10.1007/BF01845640

2009 © Institute of Mathematical Statistics