The Annals of Statistics

On optimality of Bayesian testimation in the normal means problem

Felix Abramovich, Vadim Grinshtein, and Marianna Pensky


We consider the problem of recovering a high-dimensional vector μ observed in white noise, where the unknown vector μ is assumed to be sparse. The objective of the paper is to develop a Bayesian formalism which gives rise to a family of ℓ0-type penalties. The penalties are associated with various choices of the prior distribution π_n(⋅) on the number of nonzero entries of μ and, hence, are easy to interpret. The resulting Bayesian estimators lead to a general thresholding rule which accommodates many of the known thresholding and model selection procedures as particular cases corresponding to specific choices of π_n(⋅). Furthermore, they achieve optimality in a rather general setting under very mild conditions on the prior. We also specify the class of priors π_n(⋅) for which the resulting estimator is adaptively optimal (in the minimax sense) for a wide range of sparse sequences and consider several examples of such priors.
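To make the link between an ℓ0-type penalty and a thresholding rule concrete, the following is a minimal sketch and not the paper's estimator. It assumes a generic penalized least-squares criterion, sum_i (y_i - μ_i)^2 + Pen(k), minimized over vectors μ with k nonzero entries. For a fixed k the minimizer keeps the k largest |y_i|, so only k has to be searched. The names `l0_penalized_estimate` and `linear_penalty`, and the choice Pen(k) = 2σ²k log n, are illustrative assumptions; the penalties derived in the paper depend on the prior π_n(⋅) and are not reproduced here.

```python
import numpy as np


def l0_penalized_estimate(y, penalty):
    """Minimize sum_i (y_i - mu_i)^2 + penalty(k) over mu with k nonzero entries.

    `penalty` is a hypothetical user-supplied function of k (0 <= k <= n).
    Returns the estimate and the selected number of nonzero entries.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    order = np.argsort(-np.abs(y))            # indices sorted by decreasing |y_i|
    sq_sorted = y[order] ** 2
    # rss[k] = residual sum of squares when the k largest |y_i| are kept
    cum = np.concatenate(([0.0], np.cumsum(sq_sorted)))
    rss = cum[-1] - cum
    crit = rss + np.array([penalty(k) for k in range(n + 1)])
    k_hat = int(np.argmin(crit))
    mu_hat = np.zeros(n)
    keep = order[:k_hat]
    mu_hat[keep] = y[keep]                    # hard thresholding: keep or kill
    return mu_hat, k_hat


def linear_penalty(n, sigma=1.0):
    """Illustrative penalty Pen(k) = 2 * sigma^2 * k * log(n) (an assumption)."""
    lam2 = 2.0 * sigma ** 2 * np.log(n)
    return lambda k: lam2 * k


y = [5.0, 0.1, -4.0, 0.2, 0.3, -0.1, 0.05, 0.0]
mu_hat, k_hat = l0_penalized_estimate(y, linear_penalty(len(y)))
print(k_hat, mu_hat)
```

With a penalty that is linear in k, the rule reduces to hard thresholding at a fixed level (here σ√(2 log n)); a penalty that is nonlinear in k makes the effective threshold depend on the data.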

Article information

Ann. Statist., Volume 35, Number 5 (2007), 2261-2286.

First available in Project Euclid: 7 November 2007

Primary: 62C10: Bayesian problems; characterization of Bayes procedures
Secondary: 62C20: Minimax procedures; 62G05: Estimation

Keywords: adaptivity; complexity penalty; maximum a posteriori rule; minimax estimation; sequence estimation; sparsity; thresholding


Abramovich, Felix; Grinshtein, Vadim; Pensky, Marianna. On optimality of Bayesian testimation in the normal means problem. Ann. Statist. 35 (2007), no. 5, 2261-2286. doi:10.1214/009053607000000226.

