Electronic Journal of Statistics

Optimal upper and lower bounds for the true and empirical excess risks in heteroscedastic least-squares regression

Adrien Saumard


We consider the estimation of a bounded regression function with nonparametric heteroscedastic noise and random design. We study the true and empirical excess risks of the least-squares estimator on finite-dimensional vector spaces. We give upper and lower bounds on these quantities that are nonasymptotic and optimal to first order, allowing the dimension to depend on sample size. These bounds show the equivalence between the true and empirical excess risks when, among other things, the least-squares estimator is consistent in sup-norm with the projection of the regression function onto the considered model. Consistency in the sup-norm is then proved for suitable histogram models and more general models of piecewise polynomials that are endowed with a localized basis structure.
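The two quantities compared in the abstract can be illustrated numerically. The sketch below is not the paper's construction: the regression function, noise level, constants, and variable names are all my own choices for a minimal example. It fits the least-squares estimator on a regular histogram model with heteroscedastic noise and uniform random design, then computes the empirical excess risk and a fine-grid approximation of the true excess risk, which the paper shows are equivalent to first order in this kind of setting.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Illustrative setup (all choices here are mine, not the paper's) ---
s = lambda x: np.sin(2 * np.pi * x)        # bounded regression function
sigma = lambda x: 0.2 + 0.3 * x            # heteroscedastic noise level

n, D = 5000, 20                            # sample size, histogram dimension
X = rng.uniform(0.0, 1.0, n)               # random (uniform) design
Y = s(X) + sigma(X) * rng.normal(size=n)

to_bin = lambda x: np.minimum((x * D).astype(int), D - 1)

# Least-squares estimator on the histogram model: bin-wise mean of Y.
b = to_bin(X)
shat = np.array([Y[b == k].mean() for k in range(D)])

# Projection s_m of s onto the model: bin-wise mean of s under the
# (uniform) design, approximated on a fine grid.
grid = np.linspace(0.0, 1.0, 200001)
g = to_bin(grid)
sm = np.array([s(grid[g == k]).mean() for k in range(D)])

# True excess risk P(gamma(shat)) - P(gamma(s_m)): the noise terms cancel,
# so it reduces to a difference of squared L2 distances to s.
true_excess = np.mean((shat[g] - s(grid)) ** 2) - np.mean((sm[g] - s(grid)) ** 2)

# Empirical excess risk P_n(gamma(s_m)) - P_n(gamma(shat)), nonnegative
# because shat minimizes the empirical risk over the model.
emp_excess = np.mean((Y - sm[b]) ** 2) - np.mean((Y - shat[b]) ** 2)

print(true_excess, emp_excess)
```

On a run like this, both quantities are small positive numbers of order D/n and close to each other, which is the first-order equivalence the paper establishes rigorously, with the dimension D allowed to grow with n.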

Article information

Electron. J. Statist., Volume 6 (2012), 579-655.

First available in Project Euclid: 18 April 2012

Keywords: Least-squares regression; heteroscedasticity; excess risk; lower bounds; sup-norm; localized basis; empirical process


Saumard, Adrien. Optimal upper and lower bounds for the true and empirical excess risks in heteroscedastic least-squares regression. Electron. J. Statist. 6 (2012), 579--655. doi:10.1214/12-EJS679. https://projecteuclid.org/euclid.ejs/1334754008
