The Annals of Statistics

Needles and Straw in a Haystack: Posterior concentration for possibly sparse sequences

Ismaël Castillo and Aad van der Vaart

Abstract

We consider full Bayesian inference in the multivariate normal mean model in the situation in which the mean vector is sparse. The prior distribution on the vector of means is constructed hierarchically by first choosing a collection of nonzero means and next a prior on the nonzero values. We consider the posterior distribution in the frequentist set-up in which the observations are generated according to a fixed mean vector, and are interested in the posterior distribution of the number of nonzero components and in the contraction of the posterior distribution to the true mean vector. We find various combinations of priors on the number of nonzero coefficients and on these coefficients that give desirable performance; we also find priors that give a suboptimal contraction rate, for instance, Gaussian priors on the nonzero coefficients. We illustrate the results by simulations.
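
The hierarchical prior described above is easy to simulate. The Python sketch below is illustrative only: the abstract does not fix the ingredients, so it assumes a binomial prior on the number of nonzero means, a uniformly chosen support, a Laplace (heavy-tailed) slab on the nonzero values (the abstract notes that Gaussian slabs behave suboptimally), and unit-variance Gaussian noise; the function name and parameter values are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_prior_and_data(n=500, p=0.02, scale=1.0):
        """Draw one mean vector from a hierarchical sparse prior, then noisy data.

        Hypothetical choices, for illustration only:
          - the number of nonzero means s has a Binomial(n, p) prior,
          - the support is a uniformly random subset of size s,
          - the nonzero values are i.i.d. Laplace(scale), a heavy-tailed slab
            (the abstract notes that Gaussian slabs give suboptimal behavior).
        """
        s = rng.binomial(n, p)                           # prior on the number of nonzero components
        support = rng.choice(n, size=s, replace=False)   # uniformly chosen support of size s
        theta = np.zeros(n)
        theta[support] = rng.laplace(scale=scale, size=s)  # slab prior on the nonzero values
        x = theta + rng.standard_normal(n)               # normal mean model: X_i = theta_i + eps_i
        return theta, x

    theta, x = sample_prior_and_data()
    print(f"nonzero components: {np.count_nonzero(theta)} of {theta.size}")

Full Bayesian inference then conditions on x under this prior; the paper studies how fast the resulting posterior contracts to the true mean vector.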

Article information

Source
Ann. Statist., Volume 40, Number 4 (2012), 2069-2101.

Dates
First available in Project Euclid: 30 October 2012

Permanent link to this document
https://projecteuclid.org/euclid.aos/1351602537

Digital Object Identifier
doi:10.1214/12-AOS1029

Mathematical Reviews number (MathSciNet)
MR3059077

Zentralblatt MATH identifier
1257.62025

Subjects
Primary: 62G05 (Estimation); 62G20 (Asymptotic properties)

Keywords
Bayesian estimators; sparsity; Gaussian sequence model; mixture priors; asymptotics; contraction

Citation

Castillo, Ismaël; van der Vaart, Aad. Needles and Straw in a Haystack: Posterior concentration for possibly sparse sequences. Ann. Statist. 40 (2012), no. 4, 2069–2101. doi:10.1214/12-AOS1029. https://projecteuclid.org/euclid.aos/1351602537


Supplemental materials

  • Supplementary material: Supplement to “Needles and Straw in a Haystack: Posterior concentration for possibly sparse sequences”. This supplementary file contains the proofs of some technical results appearing in the paper.