The Annals of Statistics

Gaussian approximation of suprema of empirical processes

Victor Chernozhukov, Denis Chetverikov, and Kengo Kato

Full-text: Open access


This paper develops a new direct approach to approximating suprema of general empirical processes by a sequence of suprema of Gaussian processes, without taking the route of approximating whole empirical processes in the sup-norm. We prove an abstract approximation theorem applicable to a wide variety of statistical problems, such as construction of uniform confidence bands for functions. Notably, the bound in the main approximation theorem is nonasymptotic and the theorem allows for functions that index the empirical process to be unbounded and have entropy divergent with the sample size. The proof of the approximation theorem builds on a new coupling inequality for maxima of sums of random vectors, the proof of which depends on an effective use of Stein’s method for normal approximation, and some new empirical process techniques. We study applications of this approximation theorem to local and series empirical processes arising in nonparametric estimation via kernel and series methods, where the classes of functions change with the sample size and are non-Donsker. Importantly, our new technique is able to prove the Gaussian approximation for the supremum type statistics under weak regularity conditions, especially concerning the bandwidth and the number of series functions, in those examples.
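As a rough numerical illustration of comparing the distribution of a supremum with that of its Gaussian counterpart, the sketch below uses the classical uniform empirical process on a finite grid and a Brownian bridge on the same grid. This is a textbook Donsker example chosen for simplicity. It is not the coupling construction of the paper and does not cover the non-Donsker classes the abstract refers to. The function names, grid size, sample size and number of replications are all hypothetical choices.

```python
import bisect
import math
import random


def sup_empirical_process(n, grid, rng):
    """Sup over the grid of sqrt(n) * (F_n(t) - t) for n Uniform(0, 1) draws."""
    xs = sorted(rng.random() for _ in range(n))
    # bisect_right counts the observations <= t, so count / n equals F_n(t).
    return max(math.sqrt(n) * (bisect.bisect_right(xs, t) / n - t) for t in grid)


def sup_brownian_bridge(grid, rng):
    """Sup over the grid of a Brownian bridge B(t) = W(t) - t * W(1)."""
    w, prev, path = 0.0, 0.0, []
    for t in grid:
        w += rng.gauss(0.0, math.sqrt(t - prev))  # independent Gaussian increment
        path.append(w)
        prev = t
    w_one = w + rng.gauss(0.0, math.sqrt(1.0 - prev))  # value of W at time 1
    return max(wt - t * w_one for wt, t in zip(path, grid))


def kolmogorov_distance(a, b):
    """Largest gap between the empirical distribution functions of two samples."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect.bisect_right(a, x) / len(a) - bisect.bisect_right(b, x) / len(b))
        for x in a + b
    )


def simulate(n=200, m=50, reps=1000, seed=0):
    """Monte Carlo comparison of the two suprema (illustrative settings only)."""
    rng = random.Random(seed)
    grid = [k / (m + 1) for k in range(1, m + 1)]  # interior points of (0, 1)
    emp = [sup_empirical_process(n, grid, rng) for _ in range(reps)]
    gauss = [sup_brownian_bridge(grid, rng) for _ in range(reps)]
    return kolmogorov_distance(emp, gauss)


distance = simulate()
print(f"Kolmogorov distance between the simulated suprema: {distance:.3f}")
```

The printed distance mixes the approximation error with Monte Carlo noise from the finite number of replications, so it should be read only as a qualitative check.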

Article information

Ann. Statist., Volume 42, Number 4 (2014), 1564-1597.

First available in Project Euclid: 7 August 2014

Primary:
  • 60F17: Functional limit theorems; invariance principles
  • 62E17: Approximations to distributions (nonasymptotic)
  • 62G20: Asymptotic properties

Keywords: coupling; empirical process; Gaussian approximation; kernel estimation; local empirical process; series estimation; supremum


Chernozhukov, Victor; Chetverikov, Denis; Kato, Kengo. Gaussian approximation of suprema of empirical processes. Ann. Statist. 42 (2014), no. 4, 1564–1597. doi:10.1214/14-AOS1230.

Supplemental materials

  • Supplement to “Gaussian approximation of suprema of empirical processes”. This supplemental file contains the additional technical proofs omitted in the main text, and some technical tools used in the proofs.