Electronic Journal of Statistics

Sparse model selection under heterogeneous noise: Exact penalisation and data-driven thresholding

Laurent Cavalier and Markus Reiß

Full-text: Open access


We consider a Gaussian sequence space model $X_{\lambda}=f_{\lambda}+\xi_{\lambda}$, where the noise variables $(\xi_{\lambda})_{\lambda}$ are independent, but with heterogeneous variances $(\sigma_{\lambda}^{2})_{\lambda}$. Our goal is to estimate the unknown signal vector $(f_{\lambda})$ by a model selection approach. We focus on the situation where the non-zero entries $f_{\lambda}$ are sparse. The heterogeneous case is then much more involved than the homogeneous model where $\sigma_{\lambda}^{2}=\sigma^{2}$ is constant: we can no longer exploit the symmetry of the stochastic process that has to be controlled. Both the problem and the penalty depend not only on the number of selected coefficients, but also on their positions. This is reflected in the minimax bounds, where the worst-case coefficients are placed at the largest variances. With a careful and explicit choice of the penalty, however, we are able to select the correct coefficients and obtain a sharp non-asymptotic control of the risk of our procedure. Finite-sample simulation results are also provided.
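The sequence model and coordinatewise thresholding described above can be illustrated with a short simulation. The sketch below is not the paper's penalised procedure: it uses the classical per-coordinate universal threshold $t_{\lambda}=\sigma_{\lambda}\sqrt{2\log n}$ as a simple benchmark, and compares its risk to the ideal "keep-or-kill" oracle risk $\sum_{\lambda}\min(f_{\lambda}^{2},\sigma_{\lambda}^{2})$; all numerical choices (dimension, sparsity level, variance profile) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Heterogeneous noise levels, e.g. mildly increasing with the index,
# as would arise from a statistical inverse problem (assumed profile).
sigma = 0.1 * (1.0 + np.arange(n) / n)

# Sparse signal: a few large coefficients at arbitrary positions.
f = np.zeros(n)
support = rng.choice(n, size=20, replace=False)
f[support] = rng.normal(0.0, 2.0, size=20)

# Observations X_lambda = f_lambda + xi_lambda with xi_lambda ~ N(0, sigma_lambda^2).
X = f + sigma * rng.standard_normal(n)

# Coordinatewise hard thresholding at t_lambda = sigma_lambda * sqrt(2 log n):
# a standard benchmark threshold, used here for illustration only
# (the paper derives a sharper, explicitly tuned penalty).
t = sigma * np.sqrt(2.0 * np.log(n))
f_hat = np.where(np.abs(X) > t, X, 0.0)

# Empirical squared-error risk versus the ideal keep-or-kill oracle risk.
risk = np.sum((f_hat - f) ** 2)
oracle_risk = np.sum(np.minimum(f ** 2, sigma ** 2))
print(risk, oracle_risk)
```

Because the threshold scales with $\sigma_{\lambda}$, coefficients sitting at large-variance positions need a larger magnitude to survive, which is exactly where the position-dependence discussed in the abstract enters.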

Article information

Electron. J. Statist., Volume 8, Number 1 (2014), 432-455.

First available in Project Euclid: 18 April 2014


Primary: 62G05: Estimation
Secondary: 62J05: Linear regression

Keywords: sparse oracle inequality; optimal threshold; statistical inverse problem; risk hull; penalized empirical risk; full subset selection; heteroskedastic noise


Cavalier, Laurent; Reiß, Markus. Sparse model selection under heterogeneous noise: Exact penalisation and data-driven thresholding. Electron. J. Statist. 8 (2014), no. 1, 432--455. doi:10.1214/14-EJS889. https://projecteuclid.org/euclid.ejs/1397826707


