Bayesian Analysis

The Horseshoe+ Estimator of Ultra-Sparse Signals

Anindya Bhadra, Jyotishka Datta, Nicholas G. Polson, and Brandon Willard

Advance publication

This article is in its final form and can be cited using the year of online publication and the DOI.

Full-text: Open access


We propose a new prior for ultra-sparse signal detection that we term the “horseshoe+ prior.” The horseshoe+ prior is a natural extension of the horseshoe prior, which has achieved success in the estimation and detection of sparse signals and has been shown to possess a number of desirable theoretical properties while remaining computationally feasible in high dimensions. The horseshoe+ prior builds upon these advantages. Our work proves that the horseshoe+ posterior concentrates at a rate faster than that of the horseshoe in the Kullback–Leibler (K-L) sense. We also establish theoretically that the proposed estimator has lower posterior mean squared error in estimating signals than the horseshoe and achieves the optimal Bayes risk in testing up to a constant. For one-group global–local scale mixture priors, we develop a new technique for analyzing the marginal sparse prior densities using the class of Meijer-G functions. In simulations, the horseshoe+ estimator demonstrates superior performance in a standard design setting against competing methods, including the horseshoe and Dirichlet–Laplace estimators. We conclude with an illustration on a prostate cancer data set and by pointing out some directions for future research.
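To make the comparison between the two priors concrete, the following is a minimal sketch of drawing from each. The abstract does not spell out the hierarchy, so the sketch assumes the form usually given for it: an extra half-Cauchy layer eta_i ~ C+(0, 1), a local scale lambda_i | eta_i ~ C+(0, tau * eta_i), and theta_i | lambda_i ~ N(0, lambda_i^2), with the horseshoe obtained by fixing eta_i = 1. The function names and the fixed global scale tau = 1 are illustrative assumptions, not taken from the article.

```python
import numpy as np


def sample_horseshoe_prior(n, tau=1.0, seed=0):
    """Draw n values from an assumed horseshoe prior (illustrative helper)."""
    rng = np.random.default_rng(seed)
    # lambda_i ~ C+(0, tau): absolute value of a Cauchy draw, scaled by tau
    lam = tau * np.abs(rng.standard_cauchy(n))
    # theta_i | lambda_i ~ N(0, lambda_i^2)
    theta = rng.normal(0.0, lam)
    return theta, lam


def sample_horseshoe_plus_prior(n, tau=1.0, seed=0):
    """Draw n values from an assumed horseshoe+ prior (illustrative helper)."""
    rng = np.random.default_rng(seed)
    # Extra mixing layer: eta_i ~ C+(0, 1)
    eta = np.abs(rng.standard_cauchy(n))
    # lambda_i | eta_i ~ C+(0, tau * eta_i)
    lam = tau * eta * np.abs(rng.standard_cauchy(n))
    # theta_i | lambda_i ~ N(0, lambda_i^2)
    theta = rng.normal(0.0, lam)
    return theta, lam


if __name__ == "__main__":
    _, lam_hs = sample_horseshoe_prior(100_000, seed=1)
    _, lam_hsp = sample_horseshoe_plus_prior(100_000, seed=2)
    # Share of local scales close to zero (strong shrinkage of noise)
    print("P(lambda < 0.1):", (lam_hs < 0.1).mean(), (lam_hsp < 0.1).mean())
    # Share of very large local scales (signals left nearly unshrunk)
    print("P(lambda > 10): ", (lam_hs > 10).mean(), (lam_hsp > 10).mean())
```

Under these assumptions, the horseshoe+ local scale is a product of two half-Cauchy variables, so it places more prior mass both near zero and in the far tails than the single half-Cauchy scale of the horseshoe. This is only a prior simulation; it does not reproduce the posterior results or the simulations reported in the article.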

Article information

Bayesian Anal. (2016), 27 pages.

First available in Project Euclid: 22 September 2016

Digital Object Identifier: doi:10.1214/16-BA1028

Primary: 62F15: Bayesian inference
Secondary: 62F12: Asymptotic properties of estimators; 62C10: Bayesian problems; characterization of Bayes procedures

Keywords: Bayesian; global–local shrinkage; horseshoe; horseshoe+; normal means; sparsity


Bhadra, Anindya; Datta, Jyotishka; Polson, Nicholas G.; Willard, Brandon. The Horseshoe+ Estimator of Ultra-Sparse Signals. Bayesian Anal. Advance Publication, 22 September 2016. doi: 10.1214/16-BA1028.


