Bayesian Analysis

Asymptotic Optimality of One-Group Shrinkage Priors in Sparse High-dimensional Problems

Prasenjit Ghosh and Arijit Chakrabarti

Full-text: Open access

Abstract

We study asymptotic optimality of inference in a high-dimensional sparse normal means model using a broad class of one-group shrinkage priors. Assuming that the proportion of non-zero means is known, we show that the corresponding Bayes estimates asymptotically attain the minimax risk (up to a multiplicative constant) for estimation under squared error loss. The constant is shown to be 1 for the important sub-class of “horseshoe-type” priors, thereby proving an exact asymptotic minimaxity property for these priors, a result hitherto unknown in the literature. An empirical Bayes version of the estimator is shown to achieve the minimax rate when the level of sparsity is unknown. We prove that the resulting posterior distributions contract around the true mean vector at the minimax optimal rate, and we provide important insight into the possible rate of posterior contraction around the corresponding Bayes estimator. Our work shows that for rate optimality, a heavy-tailed prior with sufficient mass around zero is enough; a pole at zero, as in the horseshoe prior, is not necessary. This part of the work is inspired by van der Pas et al. (2014). We develop novel unifying arguments to extend their results to the general class of priors under study. Next we focus on simultaneous hypothesis testing for the means under the additive 0-1 loss, where the means are modeled through a two-groups mixture distribution. We study asymptotic risk properties of certain multiple testing procedures induced by the class of one-group priors under study, when applied in this set-up. Our key results show that tests based on the “horseshoe-type” priors asymptotically achieve the risk of the optimal solution in this two-groups framework up to the correct constant and are thus asymptotically Bayes optimal under sparsity (ABOS). This is the first result showing that, in a sparse problem, a class of one-group priors can exactly mimic the performance of an optimal two-groups solution asymptotically. Our work reveals an intrinsic technical connection between the theories of minimax estimation and simultaneous hypothesis testing for such one-group priors.
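To make the setting concrete, the sketch below (illustrative Python, not the authors' code; function names are ours) computes the horseshoe posterior mean for a single observation x ~ N(theta, 1) with the global shrinkage parameter tau treated as fixed, using the hierarchical form theta | lambda ~ N(0, lambda^2 tau^2), lambda ~ C+(0, 1) of Carvalho et al. (2010). It also shows the induced multiple-testing rule of the kind studied in this line of work: writing kappa = 1 / (1 + lambda^2 tau^2), declare a signal when the posterior shrinkage weight 1 - E[kappa | x] exceeds 1/2 (cf. Datta and Ghosh, 2013).

import numpy as np
from scipy import integrate

def horseshoe_shrinkage_weight(x, tau):
    """Posterior shrinkage weight 1 - E[kappa | x], by 1-D quadrature over lambda."""
    def unnorm_post(lam):
        # Marginally, x | lambda ~ N(0, 1 + (lambda*tau)^2); half-Cauchy prior on lambda.
        s2 = 1.0 + (lam * tau) ** 2
        lik = np.exp(-0.5 * x * x / s2) / np.sqrt(s2)
        prior = 2.0 / (np.pi * (1.0 + lam * lam))
        return lik * prior

    def weighted(lam):
        s2 = 1.0 + (lam * tau) ** 2
        return unnorm_post(lam) * ((lam * tau) ** 2 / s2)  # (1 - kappa) times posterior

    num, _ = integrate.quad(weighted, 0.0, np.inf)
    den, _ = integrate.quad(unnorm_post, 0.0, np.inf)
    return num / den

def horseshoe_posterior_mean(x, tau):
    # E[theta | x] = x * (1 - E[kappa | x])
    return x * horseshoe_shrinkage_weight(x, tau)

def classify_signal(x, tau):
    # Induced test: flag theta as non-zero when the shrinkage weight exceeds 1/2.
    return horseshoe_shrinkage_weight(x, tau) > 0.5

if __name__ == "__main__":
    tau = 0.05  # a small global parameter encodes sparsity
    for x in (0.5, 2.0, 5.0):
        print(x, horseshoe_posterior_mean(x, tau), classify_signal(x, tau))

Small observations are shrunk nearly to zero and not flagged, while large ones are left almost unshrunk and flagged; the paper's results concern exactly how well estimators and tests of this type track the minimax risk and the optimal two-groups risk as the dimension grows.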

Article information

Source
Bayesian Anal. Volume 12, Number 4 (2017), 1133–1161.

Dates
First available in Project Euclid: 30 September 2016

Permanent link to this document
https://projecteuclid.org/euclid.ba/1475266758

Digital Object Identifier
doi:10.1214/16-BA1029

Keywords
asymptotic minimaxity, posterior contraction, ABOS, sparsity, one-group shrinkage priors, horseshoe prior

Rights
Creative Commons Attribution 4.0 International License.

Citation

Ghosh, Prasenjit; Chakrabarti, Arijit. Asymptotic Optimality of One-Group Shrinkage Priors in Sparse High-dimensional Problems. Bayesian Anal. 12 (2017), no. 4, 1133–1161. doi:10.1214/16-BA1029. https://projecteuclid.org/euclid.ba/1475266758


References

  • Armagan, A., Dunson, D. B., and Clyde, M. (2011). “Generalized Beta Mixtures of Gaussians.” In Shawe-Taylor, J., Zemel, R. S., Bartlett, P. L., Pereira, F. C. N., and Weinberger, K. Q. (eds.), Advances in Neural Information Processing Systems, volume 24, 523–531.
  • Armagan, A., Dunson, D. B., and Lee, J. (2012). “Generalized Double Pareto Shrinkage.” Statistica Sinica, 23(1): 119–143.
  • Bhadra, A., Datta, J., Polson, N., and Willard, B. (2015). “The Horseshoe+ Estimator of Ultra-Sparse Signals.” arXiv:1502.00560v2.
  • Bhattacharya, A., Pati, D., Pillai, N., and Dunson, D. B. (2012). “Bayesian Shrinkage.” arXiv:1212.6088v1.
  • Bhattacharya, A., Pati, D., Pillai, N., and Dunson, D. B. (2014). “Dirichlet-Laplace priors for optimal shrinkage.” arXiv:1401.5398v1.
  • Bogdan, M., Chakrabarti, A., Frommlet, F., and Ghosh, J. K. (2011). “Asymptotic Bayes-Optimality under Sparsity of Some Multiple Testing Procedures.” The Annals of Statistics, 39(3): 1551–1579.
  • Carvalho, C., Polson, N., and Scott, J. (2009). “Handling Sparsity via the Horseshoe.” Journal of Machine Learning Research W&CP, 5: 73–80.
  • Carvalho, C., Polson, N., and Scott, J. (2010). “The Horseshoe Estimator for Sparse Signals.” Biometrika, 97(2): 465–480.
  • Castillo, I. and van der Vaart, A. W. (2012). “Needles and Straw in a Haystack: Posterior Concentration for Possibly Sparse Sequences.” The Annals of Statistics, 40(4): 2069–2101.
  • Datta, J. and Ghosh, J. K. (2013). “Asymptotic Properties of Bayes Risk for the Horseshoe Prior.” Bayesian Analysis, 8(1): 111–132.
  • Donoho, D. L., Johnstone, I. M., Hoch, J. C., and Stern, A. S. (1992). “Maximum Entropy and the Nearly Black Object (with Discussion).” Journal of the Royal Statistical Society. Series B (Methodological), 54: 41–81.
  • Ghosal, S., Ghosh, J. K., and van der Vaart, A. W. (2000). “Convergence Rates of Posterior Distributions.” The Annals of Statistics, 28(2): 500–531.
  • Ghosh, P. and Chakrabarti, A. (2015). “Posterior Concentration Properties of a General Class of Shrinkage Estimators around Nearly Black Vectors.” arXiv:1412.8161v2.
  • Ghosh, P. and Chakrabarti, A. (2016). “Supplementary Materials to the Article ‘Asymptotic Optimality of One-Group Shrinkage Priors in Sparse High-dimensional Problems’.” Bayesian Analysis.
  • Ghosh, P., Tang, X., Ghosh, M., and Chakrabarti, A. (2015). “Asymptotic Properties of Bayes Risk of a General Class of Shrinkage Priors in Multiple Hypothesis Testing Under Sparsity.” Bayesian Analysis (Advance Publication, doi:10.1214/15-BA973).
  • Griffin, J. E. and Brown, P. J. (2005). “Alternative prior distributions for variable selection with very many more variables than observations.” Technical report, University of Warwick.
  • Hans, C. (2009). “Bayesian Lasso Regression.” Biometrika, 96(4): 835–845.
  • Park, T. and Casella, G. (2008). “The Bayesian Lasso.” Journal of the American Statistical Association, 103(482): 681–686.
  • Polson, N. G. and Scott, J. G. (2011). “Shrink Globally, Act Locally: Sparse Bayesian Regularization and Prediction.” In Bayesian Statistics 9, Proceedings of the 9th Valencia International Meeting, 501–538. Oxford University Press.
  • Tipping, M. (2001). “Sparse Bayesian Learning and the Relevance Vector Machine.” Journal of Machine Learning Research, 1: 211–244.
  • van der Pas, S. L., Kleijn, B. J. K., and van der Vaart, A. W. (2014). “The Horseshoe Estimator: Posterior Concentration Around Nearly Black Vectors.” Electronic Journal of Statistics, 8: 2585–2618.
  • van der Pas, S. L., Salomond, J. B., and Schmidt-Hieber, J. (2016). “Conditions for Posterior Contraction in the Sparse Normal Means Problem.” Electronic Journal of Statistics, 10: 976–1000.
