## Bayesian Analysis

### The Horseshoe+ Estimator of Ultra-Sparse Signals

#### Abstract

We propose a new prior for ultra-sparse signal detection that we term the “horseshoe+ prior.” The horseshoe+ prior is a natural extension of the horseshoe prior, which has achieved success in the estimation and detection of sparse signals and has been shown to possess a number of desirable theoretical properties while remaining computationally feasible in high dimensions. The horseshoe+ prior builds upon these advantages. We prove that the horseshoe+ posterior concentrates at a rate faster than that of the horseshoe in the Kullback–Leibler (K-L) sense. We also establish theoretically that the proposed estimator has lower posterior mean squared error in estimating signals than the horseshoe and achieves the optimal Bayes risk in testing up to a constant. For one-group global–local scale mixture priors, we develop a new technique for analyzing the marginal sparse prior densities using the class of Meijer-G functions. In simulations, the horseshoe+ estimator demonstrates superior performance in a standard design setting against competing methods, including the horseshoe and Dirichlet–Laplace estimators. We conclude with an illustration on a prostate cancer data set and by pointing out some directions for future research.
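The abstract does not spell out the prior hierarchy, so the following is only a minimal sketch under a stated assumption: it uses the commonly cited global–local forms, where the horseshoe draws $\theta_i \sim N(0, \tau^2\lambda_i^2)$ with $\lambda_i \sim C^+(0, 1)$, and the horseshoe+ adds one more half-Cauchy layer, $\lambda_i \mid \eta_i \sim C^+(0, \tau\eta_i)$ with $\eta_i \sim C^+(0, 1)$. All function names are hypothetical, and the thresholds `near` and `far` are arbitrary illustration values. The sketch draws from both priors and compares how much mass each puts very close to zero and far out in the tails.

```python
import math
import random


def half_cauchy(rng, scale=1.0):
    """Draw from C+(0, scale) by inverting the half-Cauchy CDF."""
    return scale * math.tan(math.pi * rng.random() / 2.0)


def draw_horseshoe(rng, tau=1.0):
    """One prior draw of theta under the (assumed) horseshoe hierarchy."""
    lam = half_cauchy(rng)                    # lambda_i ~ C+(0, 1)
    return rng.gauss(0.0, tau * lam)          # theta_i ~ N(0, tau^2 * lambda_i^2)


def draw_horseshoe_plus(rng, tau=1.0):
    """One prior draw of theta under the (assumed) horseshoe+ hierarchy."""
    eta = half_cauchy(rng)                    # eta_i ~ C+(0, 1)
    lam = half_cauchy(rng, scale=tau * eta)   # lambda_i | eta_i ~ C+(0, tau * eta_i)
    return rng.gauss(0.0, lam)                # theta_i ~ N(0, lambda_i^2)


def spike_and_tail(draws, near=0.1, far=10.0):
    """Fraction of draws with |theta| < near and with |theta| > far."""
    n = len(draws)
    spike = sum(abs(t) < near for t in draws) / n
    tail = sum(abs(t) > far for t in draws) / n
    return spike, tail


rng = random.Random(0)  # fixed seed so the comparison is reproducible
n_draws = 100_000
hs = [draw_horseshoe(rng) for _ in range(n_draws)]
hsp = [draw_horseshoe_plus(rng) for _ in range(n_draws)]

hs_spike, hs_tail = spike_and_tail(hs)
hsp_spike, hsp_tail = spike_and_tail(hsp)
print(f"horseshoe : near-zero mass {hs_spike:.3f}, tail mass {hs_tail:.3f}")
print(f"horseshoe+: near-zero mass {hsp_spike:.3f}, tail mass {hsp_tail:.3f}")
```

Under these assumptions, the extra half-Cauchy layer should show up as more prior mass both near the origin and in the tails, which is the qualitative behavior one would want for ultra-sparse signals. This is a prior-only simulation and says nothing about the posterior concentration or risk results claimed in the abstract.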

#### Article information

Source
Bayesian Anal. (2017), 27 pages.

Dates
First available in Project Euclid: 22 September 2016

Permanent link to this document
http://projecteuclid.org/euclid.ba/1474572263

Digital Object Identifier
doi:10.1214/16-BA1028

#### Citation

Bhadra, Anindya; Datta, Jyotishka; Polson, Nicholas G.; Willard, Brandon. The Horseshoe+ Estimator of Ultra-Sparse Signals. Bayesian Anal., advance publication, 22 September 2016. doi: 10.1214/16-BA1028. http://projecteuclid.org/euclid.ba/1474572263
