## Annals of Probability

### Concentration inequalities and asymptotic results for ratio type empirical processes

#### Abstract

Let $\mathcal{F}$ be a class of measurable functions on a measurable space $(S,\mathscr{S})$ with values in $[0,1]$ and let $$P_n=n^{-1}\sum_{i=1}^n\delta_{X_i}$$ be the empirical measure based on an i.i.d. sample $(X_1,\ldots,X_n)$ from a probability distribution $P$ on $(S,\mathscr{S})$. We study the behavior of suprema of the following type: $$\sup_{r_{n}<\sigma_{P}f\leq \delta_{n}}\frac{|P_{n}f-Pf|}{\phi(\sigma_{P}f)},$$ where $\sigma_{P}f\geq \operatorname{Var}_{P}^{1/2}f$ and $\phi$ is a continuous, strictly increasing function with $\phi(0)=0$. Using Talagrand’s concentration inequality for empirical processes, we establish concentration inequalities for such suprema and use them to derive several results about their asymptotic behavior, expressing the conditions in terms of expectations of localized suprema of empirical processes. We also prove new bounds for expected values of sup-norms of empirical processes in terms of the largest $\sigma_{P}f$ and the $L_2(P)$ norm of the envelope of the function class, which are especially suited for estimating localized suprema. With this technique, we extend to function classes most of the known results on ratio type suprema of empirical processes, including some of Alexander’s results for VC classes of sets. We also consider applications of these results to several important problems in nonparametric statistics and in learning theory (including general excess risk bounds in empirical risk minimization and their versions for $L_2$-regression and classification and ratio type bounds for margin distributions in classification).
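To make the ratio type supremum concrete, the following is a minimal simulation sketch. Every specific choice in it is an illustrative assumption and is not taken from the paper: $P$ is the uniform distribution on $[0,1]$, the class consists of the indicators $f_t=1_{[0,t]}$, the bound on the standard deviation is $\sigma_P f_t=\sqrt{t}\geq\sqrt{t(1-t)}$, and $\phi(s)=s$. The function name `ratio_supremum` and the grid search are likewise hypothetical; the grid only approximates the supremum from below.

```python
import bisect
import math
import random


def ratio_supremum(sample, r_n, delta_n, grid_size=2000):
    """Grid approximation of sup_{r_n < sigma(t) <= delta_n} |P_n f_t - P f_t| / phi(sigma(t)).

    Illustrative assumptions (not from the paper): P is Uniform[0, 1],
    f_t is the indicator of [0, t], sigma(t) = sqrt(t), and phi(s) = s.
    """
    xs = sorted(sample)
    n = len(xs)
    best = 0.0
    for k in range(1, grid_size + 1):
        t = k / grid_size              # P f_t = t under the uniform law
        sigma = math.sqrt(t)           # upper bound on the std. dev. sqrt(t (1 - t))
        if not (r_n < sigma <= delta_n):
            continue                   # keep only the localized range of sigma
        empirical = bisect.bisect_right(xs, t) / n  # P_n f_t, the empirical CDF at t
        best = max(best, abs(empirical - t) / sigma)
    return best


# Example run: one simulated sample, with an assumed lower cutoff r_n.
random.seed(0)
n = 1000
sample = [random.random() for _ in range(n)]
r_n = math.sqrt(math.log(n) / n)
print(ratio_supremum(sample, r_n=r_n, delta_n=1.0))
```

Repeating the run over many independent samples and several values of $n$ gives an empirical picture of how such a supremum fluctuates, which is the kind of behavior the concentration inequalities in the abstract quantify for general function classes.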

#### Article information

Source
Ann. Probab., Volume 34, Number 3 (2006), 1143–1216.

Dates
First available in Project Euclid: 27 June 2006

Permanent link to this document
https://projecteuclid.org/euclid.aop/1151418495

Digital Object Identifier
doi:10.1214/009117906000000070

Mathematical Reviews number (MathSciNet)
MR2243881

Zentralblatt MATH identifier
1152.60021

#### Citation

Giné, Evarist; Koltchinskii, Vladimir. Concentration inequalities and asymptotic results for ratio type empirical processes. Ann. Probab. 34 (2006), no. 3, 1143–1216. doi:10.1214/009117906000000070. https://projecteuclid.org/euclid.aop/1151418495

#### References

• Alexander, K. S. (1985). Rates of growth for weighted empirical processes. In Proc. of the Berkeley Conference in Honor of Jerzy Neyman and Jack Kiefer II (L. Le Cam and R. Olshen, eds.) 475–493. Wadsworth, Belmont, CA.
• Alexander, K. S. (1987). Rates of growth and sample moduli for weighted empirical processes indexed by sets. Probab. Theory Related Fields 75 379–423.
• Alexander, K. S. (1987). The central limit theorem for weighted empirical processes indexed by sets. J. Multivariate Anal. 22 313–339.
• Alexander, K. S. (1987). The central limit theorem for empirical processes on Vapnik–Červonenkis classes. Ann. Probab. 15 178–203.
• Bartlett, P., Bousquet, O. and Mendelson, S. (2002). Localized Rademacher complexities. Computational Learning Theory. Lecture Notes in Comput. Sci. 2375 44–58. Springer, Berlin.
• Bartlett, P. and Lugosi, G. (1999). An inequality for uniform deviations of sample averages from their means. Statist. Probab. Lett. 44 55–62.
• Bartlett, P. and Mendelson, S. (2003). Empirical risk minimization. Preprint.
• Birgé, L. and Massart, P. (1998). Rates of convergence for minimum contrast estimators. Bernoulli 4 329–375.
• Bousquet, O. (2002). Concentration inequalities and empirical processes theory applied to the analysis of learning algorithms. Ph.D. thesis, Ecole Polytechnique, Paris.
• Bousquet, O. (2003). Concentration inequalities for sub-additive functions using the entropy method. In Stochastic Inequalities and Applications 213–247. Birkhäuser, Basel.
• Bousquet, O., Koltchinskii, V. and Panchenko, D. (2002). Some local measures of complexity of convex hulls and generalization bounds. Computational Learning Theory. Lecture Notes in Comput. Sci. 2375 59–73. Springer, Berlin.
• Čibisov, D. M. (1964). Some limit theorems on the limiting behavior of empirical distribution functions. Selected Translations Math. Statist. Probab. 6 147–156.
• Csáki, E. (1977). The law of the iterated logarithm for normalized empirical distribution functions. Z. Wahrsch. Verw. Gebiete 38 147–167.
• Csörgő, M., Csörgő, S., Horváth, L. and Mason, D. (1986). Weighted empirical and quantile processes. Ann. Probab. 14 31–85.
• de la Peña, V. and Giné, E. (1999). Decoupling: From Dependence to Independence. Springer, New York.
• Dudley, R. M. (1987). Universal Donsker classes and metric entropy. Ann. Probab. 20 1968–1982.
• Dudley, R. M. (1999). Uniform Central Limit Theorems. Cambridge Univ. Press.
• Eicker, F. (1979). The asymptotic distribution of suprema of the standardized empirical process. Ann. Statist. 7 116–138.
• Einmahl, J. H. J. (1996). Extension to higher dimensions of the Jaeschke–Eicker result on the standardized empirical process. Comm. Statist. Theory Methods 25 813–822.
• Einmahl, U. and Mason, D. (2000). An empirical process approach to the uniform consistency of kernel type function estimators. J. Theor. Probab. 13 1–37.
• Giné, E. and Guillou, A. (2001). On consistency of kernel density estimators for randomly censored data: Rates holding uniformly over adaptive intervals. Ann. Inst. H. Poincaré Probab. Statist. 37 503–522.
• Giné, E., Koltchinskii, V. and Wellner, J. (2003). Ratio limit theorems for empirical processes. In Stochastic Inequalities and Applications 249–278. Birkhäuser, Basel.
• Giné, E., Latała, R. and Zinn, J. (2000). Exponential and moment inequalities for U-statistics. In High Dimensional Probability II (E. Giné, D. M. Mason and J. Wellner, eds.) 13–38. Birkhäuser, Boston.
• Jaeschke, D. (1979). The asymptotic distribution of the supremum of the standardized empirical distribution function on subintervals. Ann. Statist. 7 108–115.
• Koltchinskii, V. (2003). Bounds on margin distributions in learning problems. Ann. Inst. H. Poincaré Probab. Statist. 39 943–978.
• Koltchinskii, V. (2006). Local Rademacher complexities and oracle inequalities in risk minimization. Ann. Statist. To appear.
• Koltchinskii, V. and Panchenko, D. (2000). Rademacher processes and bounding the risk of function learning. In High Dimensional Probability II (E. Giné, D. Mason and J. Wellner, eds.) 443–459. Birkhäuser, Boston.
• Koltchinskii, V. and Panchenko, D. (2002). Empirical margin distributions and bounding the generalization error of combined classifiers. Ann. Statist. 30 1–50.
• Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces. Springer, Berlin.
• Mason, D., Shorack, G. R. and Wellner, J. (1983). Strong limit theorems for oscillation moduli of the uniform empirical process. Z. Wahrsch. Verw. Gebiete 65 83–97.
• Massart, P. (2000). Some applications of concentration inequalities in statistics. Ann. Fac. Sci. Toulouse Math. (6) 9 245–303.
• Massart, P. (2000). About the constants in Talagrand's concentration inequalities for empirical processes. Ann. Probab. 28 863–884.
• Massart, P. (2005). Concentration inequalities with applications to model selection and statistical learning. Lecture on Probability Theory and Statistics. Ecole d'Eté de Probabilités de Saint-Flour XXXIV-2003. Lecture Notes in Math. To appear. Available at www.math.u-psud.fr/~massart/.
• Massart, P. and Nedelec, E. (2006). Risk bounds for statistical learning. Ann. Statist. 34(5).
• Mendelson, S. (2001). Learning relatively small classes. Computational Learning Theory. Lecture Notes in Comput. Sci. 2111 273–288. Springer, Berlin.
• O'Reilly, N. E. (1974). On the weak convergence of empirical processes in sup-norm metrics. Ann. Probab. 2 642–651.
• Panchenko, D. (2002). Concentration inequalities in product spaces and applications to statistical learning theory. Ph.D. dissertation, Univ. New Mexico, Albuquerque.
• Panchenko, D. (2002). Some extensions of an inequality of Vapnik and Chervonenkis. Electron. Comm. Probab. 7 55–65.
• Panchenko, D. (2003). Symmetrization approach to concentration inequalities for empirical processes. Ann. Probab. 31 2068–2081.
• Pollard, D. (1984). Convergence of Stochastic Processes. Springer, New York.
• Shorack, G. R. and Wellner, J. A. (1982). Limit theorems and inequalities for the uniform empirical process indexed by intervals. Ann. Probab. 10 639–652.
• Talagrand, M. (1994). Sharper bounds for Gaussian and empirical processes. Ann. Probab. 22 28–76.
• Talagrand, M. (1996). New concentration inequalities in product spaces. Invent. Math. 126 505–563.
• Tsybakov, A. (2004). Optimal aggregation of classifiers in statistical learning. Ann. Statist. 32 135–166.
• van de Geer, S. A. (1993). Hellinger-consistency of certain nonparametric maximum likelihood estimators. Ann. Statist. 21 14–44.
• van de Geer, S. A. (2000). Applications of Empirical Process Theory. Cambridge Univ. Press.
• van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes. Springer, New York.
• Wellner, J. A. (1978). Limit theorems for the ratio of the empirical distribution function to the true distribution function. Z. Wahrsch. Verw. Gebiete 45 108–123.
• Yukich, J. E. (1987). Some limit theorems for empirical processes indexed by functions. Probab. Theory Related Fields 74 71–90.