The Annals of Probability

Concentration inequalities using the entropy method

Stéphane Boucheron, Gábor Lugosi, and Pascal Massart

Full-text: Open access

Abstract

We investigate a new methodology, worked out by Ledoux and Massart, to prove concentration-of-measure inequalities. The method is based on certain modified logarithmic Sobolev inequalities. We provide some very simple and general ready-to-use inequalities. One of these inequalities may be considered as an exponential version of the Efron-Stein inequality. The main purpose of this paper is to point out the simplicity and the generality of the approach. We show how the new method can recover many of Talagrand's revolutionary inequalities and provide new applications in a variety of problems including Rademacher averages, Rademacher chaos, the number of certain small subgraphs in a random graph, and the minimum of the empirical risk in some statistical estimation problems.

Article information

Source
Ann. Probab., Volume 31, Number 3 (2003), 1583-1614.

Dates
First available in Project Euclid: 12 June 2003

Permanent link to this document
https://projecteuclid.org/euclid.aop/1055425791

Digital Object Identifier
doi:10.1214/aop/1055425791

Mathematical Reviews number (MathSciNet)
MR1989444

Zentralblatt MATH identifier
1051.60020

Subjects
Primary:
60E15: Inequalities; stochastic orderings
60C05: Combinatorial probability
28A35: Measures and integrals in product spaces
Secondary:
05C80: Random graphs [See also 60B20]

Keywords
Concentration inequalities; empirical processes; random graphs.

Citation

Boucheron, Stéphane; Lugosi, Gábor; Massart, Pascal. Concentration inequalities using the entropy method. Ann. Probab. 31 (2003), no. 3, 1583-1614. doi:10.1214/aop/1055425791. https://projecteuclid.org/euclid.aop/1055425791


References

  • [1] AHLSWEDE, R., GÁCS, P. and KÖRNER, J. (1976). Bounds on conditional probabilities with applications in multi-user communication. Z. Wahrsch. Verw. Gebiete 34 157-177. [Correction (1977) 39 353-354.]
  • [2] BARTLETT, P., BOUCHERON, S. and LUGOSI, G. (2002). Model selection and error estimation. Machine Learning 48 85-113.
  • [3] BOBKOV, S. G. and LEDOUX, M. (1998). On modified logarithmic Sobolev inequalities for Bernoulli and Poisson measures. J. Funct. Anal. 156 347-365.
  • [4] BOUCHERON, S., LUGOSI, G. and MASSART, P. (2000). A sharp concentration inequality with applications in random combinatorics and learning. Random Structures Algorithms 16 277-292.
  • [5] BOUSQUET, O. (2002). A Bennett concentration inequality and its application to suprema of empirical processes. C. R. Acad. Sci. Paris Ser. I 334 495-500.
  • [6] DE LA PEÑA, V. H. and GINÉ, E. (1999). Decoupling: From Dependence to Independence. Springer, New York.
  • [7] DEMBO, A. (1997). Information inequalities and concentration of measure. Ann. Probab. 25 927-939.
  • [8] EFRON, B. and STEIN, C. (1981). The jackknife estimate of variance. Ann. Statist. 9 586-596.
  • [9] HARDY, G. H., LITTLEWOOD, J. E. and PÓLYA, G. (1952). Inequalities. Cambridge Univ. Press.
  • [10] JANSON, S. (2000). Poisson approximation for large deviations. Random Structures Algorithms 1 221-230.
  • [11] JANSON, S., ŁUCZAK, T. and RUCIŃSKI, A. (2000). Random Graphs. Wiley, New York.
  • [12] JANSON, S. and RUCIŃSKI, A. (2000). The deletion method for upper tail estimates. Technical Report 20, Uppsala Univ.
  • [13] JANSON, S. and RUCIŃSKI, A. (2002). The infamous upper tail. Random Structures Algorithms 20 317-342.
  • [14] KIM, J. H. and VU, V. (2000). Concentration of multivariate polynomials and applications. Combinatorica 20 417-434.
  • [15] KOLTCHINSKII, V. and PANCHENKO, D. (2002). Empirical margin distributions and bounding the generalization error of combined classifiers. Ann. Statist. 30 1-50.
  • [16] LEDOUX, M. (1996). Isoperimetry and Gaussian Analysis. Lecture Notes in Math. 1648 165-294. Springer, New York.
  • [17] LEDOUX, M. (1996). On Talagrand's deviation inequalities for product measures. ESAIM: Probab. Statist. 1 63-87.
  • [18] LEDOUX, M. (2001). The Concentration of Measure Phenomenon. Amer. Math. Soc., Alexandria, VA.
  • [19] LEDOUX, M. and TALAGRAND, M. (1991). Probability in Banach Spaces. Springer, New York.
  • [20] MARTON, K. (1986). A simple proof of the blowing-up lemma. IEEE Trans. Inform. Theory 32 445-446.
  • [21] MARTON, K. (1996). Bounding d̄-distance by informational divergence: A way to prove measure concentration. Ann. Probab. 24 857-866.
  • [22] MARTON, K. (1996). A measure concentration inequality for contracting Markov chains. Geom. Funct. Anal. 6 556-571. [Correction (1997) 7 609-613.]
  • [23] MASSART, P. (1998). Optimal constants for Hoeffding type inequalities. Technical Report 98.86, Univ. Paris-Sud.
  • [24] MASSART, P. (2000). About the constants in Talagrand's concentration inequalities for empirical processes. Ann. Probab. 28 863-884.
  • [25] MASSART, P. (2000). Some applications of concentration inequalities to statistics. Ann. Fac. Sci. Toulouse Math. 9 245-303.
  • [26] MCDIARMID, C. (1989). On the method of bounded differences. In Surveys in Combinatorics (J. Siemons, ed.) 148-188. Cambridge Univ. Press.
  • [27] MCDIARMID, C. (1998). Concentration. In Probabilistic Methods for Algorithmic Discrete Mathematics (M. Habib, C. McDiarmid, J. Ramirez-Alfonsin and B. Reed, eds.) 195-248. Springer, New York.
  • [28] RIO, E. (2001). Inégalités de concentration pour les processus empiriques de classes de parties. Probab. Theory Related Fields 119 163-175.
  • [29] SION, M. (1958). On general minimax theorems. Pacific J. Math. 8 171-176.
  • [30] STEELE, J. M. (1986). An Efron-Stein inequality for nonsymmetric statistics. Ann. Statist. 14 753-758.
  • [31] STEELE, J. M. (1996). Probability Theory and Combinatorial Optimization. SIAM, Philadelphia.
  • [32] TALAGRAND, M. (1995). Concentration of measure and isoperimetric inequalities in product spaces. Publ. Math. I.H.E.S. 81 73-205.
  • [33] TALAGRAND, M. (1996). New concentration inequalities in product spaces. Invent. Math. 126 505-563.
  • [34] TALAGRAND, M. (1996). A new look at independence. Ann. Probab. 24 1-34.
  • [35] VAN DER VAART, A. W. and WELLNER, J. A. (1996). Weak Convergence and Empirical Processes. Springer, New York.
  • [36] VU, V. (2000). On the concentration of multivariate polynomials with small expectation. Random Structures Algorithms 16 344-363.
  • [37] VU, V. (2001). A large deviation result on the number of small subgraphs of a random graph. Combin. Probab. Comput. 1 79-94.

S. BOUCHERON
LRI, UMR 8623 CNRS
BÂTIMENT 490
CNRS-UNIVERSITÉ PARIS-SUD
91405 ORSAY-CEDEX
FRANCE
E-MAIL: Stephane.Boucheron@lri.fr

G. LUGOSI
DEPARTMENT OF ECONOMICS
POMPEU FABRA UNIVERSITY
RAMON TRIAS FARGAS 25-27
08005 BARCELONA
SPAIN
E-MAIL: lugosi@upf.es

P. MASSART
MATHÉMATIQUES
BÂTIMENT 425
UNIVERSITÉ PARIS-SUD
91405 ORSAY-CEDEX
FRANCE
E-MAIL: Pascal.Massart@math.u-psud.fr