The Annals of Statistics

Adaptive estimation of a quadratic functional by model selection

B. Laurent and P. Massart


We consider the problem of estimating $\|s\|^2$ when $s$ belongs to some separable Hilbert space $\mathbb{H}$ and one observes the Gaussian process $Y(t) = \langle s, t\rangle + \sigma L(t)$, for all $t \in \mathbb{H}$, where $L$ is some Gaussian isonormal process. This framework allows us in particular to consider the classical “Gaussian sequence model” for which $\mathbb{H} = l_2(\mathbb{N}^*)$ and $L(t) = \sum_{\lambda\geq1}t_{\lambda}\varepsilon_{\lambda}$, where $(\varepsilon_{\lambda})_{\lambda\geq1}$ is a sequence of i.i.d. standard normal variables. Our approach consists in considering some at most countable families of finite-dimensional linear subspaces of $\mathbb{H}$ (the models) and then using model selection via some conveniently penalized least squares criterion to build new estimators of $\|s\|^2$. We prove a general nonasymptotic risk bound which allows us to show that such penalized estimators are adaptive on a variety of collections of sets for the parameter $s$, depending on the family of models from which they are built. In particular, in the context of the Gaussian sequence model, a convenient choice of the family of models allows defining estimators which are adaptive over collections of hyperrectangles, ellipsoids, $l_p$-bodies or Besov bodies. We take special care to describe the conditions under which the penalized estimator is efficient when the level of noise $\sigma$ tends to zero. Our construction is an alternative to the one by Efroïmovich and Low for hyperrectangles and provides new results otherwise.
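
To make the setting concrete, the following is a minimal Python sketch of the Gaussian sequence model and of a penalized estimator of $\|s\|^2$ built from nested models (the first $D$ coordinates, $D = 1, \dots, n$). It is only an illustration under stated assumptions: the function names, the truncation to finitely many coordinates, the inclusion of the empty model, and in particular the penalty $c\,\sigma^2\sqrt{D\log(D+1)}$ are hypothetical choices made for this example, not the calibrated penalties or model collections studied in the paper.

```python
import math
import random


def simulate_sequence_model(s, sigma, rng):
    """Draw Y_lambda = s_lambda + sigma * eps_lambda with i.i.d. N(0, 1) noise."""
    return [s_lam + sigma * rng.gauss(0.0, 1.0) for s_lam in s]


def projection_estimates(y, sigma):
    """Bias-corrected estimates of the squared norm of s restricted to the
    first D coordinates, for D = 1..len(y).

    Since E[Y_lambda^2] = s_lambda^2 + sigma^2, the running sum of
    (Y_lambda^2 - sigma^2) is unbiased for sum_{lambda <= D} s_lambda^2.
    """
    estimates, running = [], 0.0
    for y_lam in y:
        running += y_lam ** 2 - sigma ** 2
        estimates.append(running)
    return estimates


def penalized_estimate(y, sigma, c=2.0):
    """Maximize (bias-corrected estimate - penalty) over the nested models.

    The penalty c * sigma^2 * sqrt(D * log(D + 1)) is an illustrative
    assumption, not the penalty from the paper. The empty model (value 0,
    dimension 0) is included so the result is never negative.
    """
    best_value, best_dim = 0.0, 0
    for d, est in enumerate(projection_estimates(y, sigma), start=1):
        value = est - c * sigma ** 2 * math.sqrt(d * math.log(d + 1))
        if value > best_value:
            best_value, best_dim = value, d
    return best_value, best_dim


if __name__ == "__main__":
    rng = random.Random(0)
    s = [1.0 / (k + 1) for k in range(200)]  # hypothetical decaying signal
    sigma = 0.05
    y = simulate_sequence_model(s, sigma, rng)
    theta_hat, dim = penalized_estimate(y, sigma)
    print("true ||s||^2:", sum(v * v for v in s))
    print("penalized estimate:", theta_hat, "selected dimension:", dim)
```

The selected dimension trades off the bias from ignoring coordinates beyond $D$ against the variance accumulated by summing $D$ noisy squares, which is the mechanism that model selection is meant to balance without prior knowledge of how fast the coefficients of $s$ decay.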

Article information

Ann. Statist. Volume 28, Number 5 (2000), 1302-1338.

First available: 12 March 2002

Primary: 62G05: Estimation
Secondary: 62G20: Asymptotic properties; 62J02: General nonlinear regression

Keywords: Adaptive estimation; quadratic functionals; model selection; Besov bodies; $l_p$-bodies; Gaussian sequence model; efficient estimation


Laurent, B.; Massart, P. Adaptive estimation of a quadratic functional by model selection. The Annals of Statistics 28 (2000), no. 5, 1302-1338. doi:10.1214/aos/1015957395.


  • Baraud, Y. (2000). Model selection for regression on a fixed design. Probab. Theory Related Fields 117 467-493.
  • Barron, A. R., Birgé, L. and Massart, P. (1999). Risk bound for model selection via penalization. Probab. Theory Related Fields 113 301-415.
  • Bickel, P. and Ritov, Y. (1988). Estimating integrated squared density derivatives: sharp best order of convergence estimates. Sankhyā Ser. A 50 381-393.
  • Birgé, L. (1983). Approximation dans les espaces métriques et théorie de l'estimation. Z. Wahrsch. Verw. Gebiete 65 181-237.
  • Birgé, L. and Massart, P. (1995). Estimation of integral functionals of a density. Ann. Statist. 23 11-29.
  • Birgé, L. and Massart, P. (1997). From model selection to adaptive estimation. In Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics (D. Pollard, E. Torgersen and G. Yang, eds.) 55-87. Springer, New York.
  • Birgé, L. and Massart, P. (1998). Minimum contrast estimators on sieves: exponential bounds and rates of convergence. Bernoulli 4 329-375.
  • Birgé, L. and Massart, P. (2000a). An adaptive compression algorithm in Besov spaces. Constr. Approx. 16 1-36.
  • Birgé, L. and Massart, P. (2000b). Gaussian model selection. Technical Report 2000.05, Univ. Paris Sud.
  • DeVore, R. A., Jawerth, B. and Popov, V. (1992). Compression of wavelet decompositions. Amer. J. Math. 114 737-785.
  • DeVore, R. A. and Lorentz, G. G. (1993). Constructive Approximation. Springer, New York.
  • DeVore, R. A., Kyriazis, G., Leviatan, D. and Tikhomirov, V. M. (1993). Wavelet compression and nonlinear n-widths. Adv. Comput. Math. 1 197-214.
  • Johnstone, I. (1999). Chi-square oracle inequalities. Preprint.
  • Donoho, D. and Johnstone, I. (1998). Minimax estimation via wavelet shrinkage. Ann. Statist. 26 879-921.
  • Donoho, D. and Liu, R. (1991). Geometrizing rates of convergence II. Ann. Statist. 19 633-668.
  • Donoho, D. and Nussbaum, M. (1990). Minimax quadratic estimation of a quadratic functional. J. Complexity 6 290-323.
  • Dudley, R. M. (1973). Sample functions of the Gaussian process. Ann. Probab. 1 66-103.
  • Efroïmovich, S. and Low, M. (1996). On optimal adaptive estimation of a quadratic functional. Ann. Statist. 24 1106-1125.
  • Gayraud, G. and Tribouley, K. (1999). Wavelet methods to estimate an integrated quadratic functional: adaptivity and asymptotic law. Statist. Probab. Lett. 44 109-122.
  • Laurent, B. (1996). Efficient estimation of integral functionals of a density. Ann. Statist. 24 659-681.
  • Lepskii, O. V. (1990). On a problem of adaptive estimation in Gaussian white noise. Theory Probab. Appl. 35 454-466.
  • Lepskii, O. V. (1992). On problems of adaptive estimation in Gaussian white noise. Adv. Soviet Math. 12 87-106.