The Annals of Statistics

Needles and Straw in a Haystack: Posterior concentration for possibly sparse sequences

Ismaël Castillo and Aad van der Vaart

Full-text: Open access


We consider full Bayesian inference in the multivariate normal mean model in the situation that the mean vector is sparse. The prior distribution on the vector of means is constructed hierarchically by first choosing a collection of nonzero means and next a prior on the nonzero values. We consider the posterior distribution in the frequentist set-up that the observations are generated according to a fixed mean vector, and are interested in the posterior distribution of the number of nonzero components and the contraction of the posterior distribution to the true mean vector. We find various combinations of priors on the number of nonzero coefficients and on these coefficients that give desirable performance. We also find priors that give suboptimal convergence, for instance, Gaussian priors on the nonzero coefficients. We illustrate the results by simulations.
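The hierarchical prior described in the abstract can be sketched in code: first draw the number of nonzero means, then a uniformly chosen support of that size, then i.i.d. values on the support. This is a minimal illustrative sketch, not the paper's exact construction — the binomial prior on the dimension, the value of `kappa`, and the Laplace slab scale are assumed choices for illustration; the heavy-tailed Laplace slab is used here because the abstract notes that Gaussian priors on the nonzero coefficients can give suboptimal contraction.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_prior_mean(n, kappa=0.1, slab_scale=1.0):
    """Draw a sparse mean vector from a hierarchical prior:
    (1) a prior on the number s of nonzero coordinates
        (binomial with small success probability kappa,
        an illustrative choice),
    (2) a uniformly chosen support of size s,
    (3) i.i.d. Laplace (heavy-tailed) values on the support."""
    s = rng.binomial(n, kappa)                      # number of nonzero means
    support = rng.choice(n, size=s, replace=False)  # uniform support of size s
    theta = np.zeros(n)
    theta[support] = rng.laplace(scale=slab_scale, size=s)
    return theta

n = 500
theta = sample_prior_mean(n)            # one draw from the prior
x = theta + rng.standard_normal(n)      # observations in the normal means model
print(int(np.count_nonzero(theta)), x.shape)
```

In the paper's frequentist analysis the data would instead be generated from a fixed sparse mean vector, and interest lies in how the posterior over the support size and over the mean vector concentrates around that truth.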

Article information

Ann. Statist., Volume 40, Number 4 (2012), 2069-2101.

First available in Project Euclid: 30 October 2012

Digital Object Identifier: doi:10.1214/12-AOS1029

Primary: 62G05 (Estimation); 62G20 (Asymptotic properties)

Keywords: Bayesian estimators; sparsity; Gaussian sequence model; mixture priors; asymptotics; contraction


Castillo, Ismaël; van der Vaart, Aad. Needles and Straw in a Haystack: Posterior concentration for possibly sparse sequences. Ann. Statist. 40 (2012), no. 4, 2069--2101. doi:10.1214/12-AOS1029.



  • [1] Abramovich, F., Benjamini, Y., Donoho, D. L. and Johnstone, I. M. (2006). Adapting to unknown sparsity by controlling the false discovery rate. Ann. Statist. 34 584–653.
  • [2] Abramovich, F., Grinshtein, V. and Pensky, M. (2007). On optimality of Bayesian testimation in the normal means problem. Ann. Statist. 35 2261–2286.
  • [3] Bickel, P. J., Ritov, Y. and Tsybakov, A. B. (2009). Simultaneous analysis of lasso and Dantzig selector. Ann. Statist. 37 1705–1732.
  • [4] Birgé, L. and Massart, P. (2001). Gaussian model selection. J. Eur. Math. Soc. (JEMS) 3 203–268.
  • [5] Brown, L. D. and Greenshtein, E. (2009). Nonparametric empirical Bayes and compound decision approaches to estimation of a high-dimensional vector of normal means. Ann. Statist. 37 1685–1704.
  • [6] Cai, T. T., Jin, J. and Low, M. G. (2007). Estimation and confidence sets for sparse normal mixtures. Ann. Statist. 35 2421–2449.
  • [7] Candes, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when $p$ is much larger than $n$. Ann. Statist. 35 2313–2351.
  • [8] Castillo, I. (2008). Lower bounds for posterior rates with Gaussian process priors. Electron. J. Stat. 2 1281–1299.
  • [9] Castillo, I. and van der Vaart, A. W. (2012). Supplement to “Needles and Straw in a Haystack: Posterior concentration for possibly sparse sequences.” DOI:10.1214/12-AOS1029SUPP.
  • [10] Donoho, D. L. and Johnstone, I. M. (1994). Minimax risk over $l_p$-balls for $l_q$-error. Probab. Theory Related Fields 99 277–303.
  • [11] Donoho, D. L., Johnstone, I. M., Hoch, J. C. and Stern, A. S. (1992). Maximum entropy and the nearly black object. J. R. Stat. Soc. Ser. B Stat. Methodol. 54 41–81. With discussion and a reply by the authors.
  • [12] George, E. I. and Foster, D. P. (2000). Calibration and empirical Bayes variable selection. Biometrika 87 731–747.
  • [13] Golubev, G. K. (2002). Reconstruction of sparse vectors in white Gaussian noise. Problemy Peredachi Informatsii 38 75–91.
  • [14] Huang, J., Ma, S. and Zhang, C.-H. (2008). Adaptive Lasso for sparse high-dimensional regression models. Statist. Sinica 18 1603–1618.
  • [15] Jiang, W. and Zhang, C.-H. (2009). General maximum likelihood empirical Bayes estimation of normal means. Ann. Statist. 37 1647–1684.
  • [16] Johnstone, I. M. and Silverman, B. W. (2004). Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences. Ann. Statist. 32 1594–1649.
  • [17] Johnstone, I. M. and Silverman, B. W. (2005). Empirical Bayes selection of wavelet thresholds. Ann. Statist. 33 1700–1752.
  • [18] Scott, J. G. and Berger, J. O. (2010). Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem. Ann. Statist. 38 2587–2619.
  • [19] Yuan, M. and Lin, Y. (2005). Efficient empirical Bayes variable selection and estimation in linear models. J. Amer. Statist. Assoc. 100 1215–1225.
  • [20] Zhang, C.-H. (2005). General empirical Bayes wavelet methods and exactly adaptive minimax estimation. Ann. Statist. 33 54–100.
  • [21] Zhang, C.-H. and Huang, J. (2008). The sparsity and bias of the LASSO selection in high-dimensional linear regression. Ann. Statist. 36 1567–1594.
  • [22] Zou, H. (2006). The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101 1418–1429.

Supplemental materials

  • Supplementary material: Supplement to “Needles and Straw in a Haystack: Posterior concentration for possibly sparse sequences”. This supplementary file contains the proofs of some technical results appearing in the paper.