Bayesian Analysis

The Horseshoe+ Estimator of Ultra-Sparse Signals

Anindya Bhadra, Jyotishka Datta, Nicholas G. Polson, and Brandon Willard

Full-text: Open access

Abstract

We propose a new prior for ultra-sparse signal detection that we term the “horseshoe+ prior.” The horseshoe+ prior is a natural extension of the horseshoe prior that has achieved success in the estimation and detection of sparse signals and has been shown to possess a number of desirable theoretical properties while enjoying computational feasibility in high dimensions. The horseshoe+ prior builds upon these advantages. Our work proves that the horseshoe+ posterior concentrates at a rate faster than that of the horseshoe in the Kullback–Leibler (K-L) sense. We also establish theoretically that the proposed estimator has lower posterior mean squared error in estimating signals compared to the horseshoe and achieves the optimal Bayes risk in testing up to a constant. For one-group global–local scale mixture priors, we develop a new technique for analyzing the marginal sparse prior densities using the class of Meijer-G functions. In simulations, the horseshoe+ estimator demonstrates superior performance in a standard design setting against competing methods, including the horseshoe and Dirichlet–Laplace estimators. We conclude with an illustration on a prostate cancer data set and by pointing out some directions for future research.
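The horseshoe+ prior described in the abstract is a global–local scale mixture: each mean gets a normal distribution whose scale is itself half-Cauchy with a half-Cauchy-scaled hyperparameter. As a rough illustration (not the authors' code), the following sketch draws samples from this assumed hierarchy, θᵢ | λᵢ ~ N(0, λᵢ²), λᵢ | ηᵢ ~ C⁺(0, τηᵢ), ηᵢ ~ C⁺(0, 1); the function name and defaults are hypothetical.

```python
import numpy as np

def sample_horseshoe_plus(n, tau=1.0, rng=None):
    """Draw n samples from a horseshoe+-type prior via its
    normal scale-mixture representation (hierarchy assumed above)."""
    rng = np.random.default_rng(rng)
    eta = np.abs(rng.standard_cauchy(n))            # eta_i ~ C+(0, 1)
    lam = np.abs(rng.standard_cauchy(n)) * tau * eta  # lambda_i ~ C+(0, tau * eta_i)
    return rng.normal(0.0, lam)                      # theta_i ~ N(0, lambda_i^2)

theta = sample_horseshoe_plus(10_000, tau=1.0, rng=0)
```

The extra half-Cauchy layer on the local scale is what gives the marginal density both a sharper spike at zero and heavier tails than the horseshoe, which is the mechanism behind the sparse-signal behavior the abstract describes.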

Article information

Source
Bayesian Anal. Volume 12, Number 4 (2017), 1105–1131.

Dates
First available in Project Euclid: 22 September 2016

Permanent link to this document
https://projecteuclid.org/euclid.ba/1474572263

Digital Object Identifier
doi:10.1214/16-BA1028

Subjects
Primary: 62F15: Bayesian inference
Secondary: 62F12: Asymptotic properties of estimators; 62C10: Bayesian problems; characterization of Bayes procedures

Keywords
Bayesian; global–local shrinkage; horseshoe; horseshoe+; normal means; sparsity

Rights
Creative Commons Attribution 4.0 International License.

Citation

Bhadra, Anindya; Datta, Jyotishka; Polson, Nicholas G.; Willard, Brandon. The Horseshoe+ Estimator of Ultra-Sparse Signals. Bayesian Anal. 12 (2017), no. 4, 1105--1131. doi:10.1214/16-BA1028. https://projecteuclid.org/euclid.ba/1474572263



References

  • Armagan, A., Clyde, M., and Dunson, D. B. (2011). “Generalized beta mixtures of Gaussians.” In Advances in Neural Information Processing Systems, 523–531.
  • Armagan, A., Dunson, D. B., and Lee, J. (2013). “Generalized double Pareto shrinkage.” Statistica Sinica, 23(1): 119–143.
  • Barndorff-Nielsen, O., Kent, J., and Sørensen, M. (1982). “Normal variance–mean mixtures and $z$ distributions.” International Statistical Review/Revue Internationale de Statistique, 50: 145–159.
  • Bhadra, A., Datta, J., Polson, N. G., and Willard, B. T. (2016a). “Default Bayesian analysis with global–local shrinkage priors.” Biometrika, to appear. arXiv:1510.03516
  • Bhadra, A., Datta, J., Polson, N. G., and Willard, B. (2016b). “Supplementary material to “The horseshoe+ estimator of ultra-sparse signals”.” Bayesian Analysis.
  • Bhattacharya, A., Pati, D., Pillai, N. S., and Dunson, D. B. (2015). “Dirichlet–Laplace priors for optimal shrinkage.” Journal of the American Statistical Association, 110: 1479–1490.
  • Bingham, N. H., Goldie, C. M., and Teugels, J. L. (1989). Regular Variation, volume 27 of Encyclopedia of Mathematics and Its Applications. Cambridge University Press.
  • Bogdan, M., Chakrabarti, A., Frommlet, F., and Ghosh, J. K. (2011). “Asymptotic Bayes-optimality under sparsity of some multiple testing procedures.” The Annals of Statistics, 39(3): 1551–1579.
  • Bogdan, M., Ghosh, J. K., and Tokdar, S. T. (2008). “A comparison of the Benjamini–Hochberg procedure with some Bayesian rules for multiple testing.” In Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen, volume 1 of Inst. Math. Stat. Collect., 211–230. Inst. Math. Statist., Beachwood, Ohio, USA.
  • Bourgade, P., Fujita, T., and Yor, M. (2007). “Euler’s formulae for $\zeta$(2n) and products of Cauchy variables.” Electronic Communications in Probability, 12: 73–80.
  • Carvalho, C. M., Polson, N. G., and Scott, J. G. (2009). “Handling sparsity via the horseshoe.” Journal of Machine Learning Research W&CP, 5: 73–80.
  • Carvalho, C. M., Polson, N. G., and Scott, J. G. (2010). “The horseshoe estimator for sparse signals.” Biometrika, 97: 465–480.
  • Castillo, I. and van der Vaart, A. (2012). “Needles and straw in a haystack: Posterior concentration for possibly sparse sequences.” The Annals of Statistics, 40(4): 2069–2101.
  • Clarke, B. and Barron, A. R. (1990). “Information-theoretic asymptotics of Bayes methods.” IEEE Transactions on Information Theory, 36(3): 453–471.
  • Datta, J. and Ghosh, J. K. (2013). “Asymptotic properties of Bayes risk for the horseshoe prior.” Bayesian Analysis, 8(1): 111–132.
  • Denison, D. G. and George, E. I. (2012). Bayesian Prediction with Adaptive Ridge Estimators, volume 8 of IMS Collections, 215–234. Beachwood, Ohio, USA: Institute of Mathematical Statistics.
  • Donoho, D. L., Johnstone, I. M., Hoch, J. C., and Stern, A. S. (1992). “Maximum entropy and the nearly black object.” Journal of the Royal Statistical Society. Series B (Methodological), 54: 41–81.
  • Efron, B. (2008). “Microarrays, empirical Bayes and the two-groups model.” Statistical Science, 23(1): 1–22.
  • Efron, B. (2010a). “The future of indirect evidence.” Statistical Science, 25(2): 145–157.
  • Efron, B. (2010b). Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction, volume 1. Cambridge University Press.
  • Efron, B. (2011). “Tweedie’s formula and selection bias.” Journal of the American Statistical Association, 106(496): 1602–1614.
  • Foster, D. P. and Stine, R. A. (2005). “Polyshrink: An adaptive variable selection procedure that is competitive with Bayes experts.” Technical report, Univ. of Penn.
  • Gelman, A. (2006). “Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper).” Bayesian Analysis, 1(3): 515–534.
  • Ghosh, P. and Chakrabarti, A. (2014). “Posterior concentration properties of a general class of shrinkage estimators around nearly black vectors.” arXiv:1412.8161.
  • Ghosh, P., Tang, X., Ghosh, M., and Chakrabarti, A. (2016). “Asymptotic properties of Bayes risk of a general class of shrinkage priors in multiple hypothesis testing under sparsity.” Bayesian Analysis, 11: 753–796.
  • Griffin, J. E. and Brown, P. J. (2010). “Inference with normal–gamma prior distributions in regression problems.” Bayesian Analysis, 5(1): 171–188.
  • Guan, Y. and Stephens, M. (2008). “Practical issues in imputation-based association mapping.” PLoS Genetics, 4(12): e1000279.
  • Johnstone, I. M. and Silverman, B. W. (2004). “Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences.” Annals of Statistics, 32: 1594–1649.
  • Marchini, J. and Howie, B. (2010). “Genotype imputation for genome-wide association studies.” Nature Reviews Genetics, 11(7): 499–511.
  • Mathai, A., Saxena, R. K., and Haubold, H. J. (2009). The H-Function. New York, NY: Springer.
  • Mikosch, T. (1999). Regular Variation, Subexponentiality and Their Applications in Probability Theory. Volume 99 of EURANDOM report. Eindhoven, The Netherlands: Eindhoven University of Technology.
  • Mitchell, T. J. and Beauchamp, J. J. (1988). “Bayesian variable selection in linear regression.” Journal of the American Statistical Association, 83(404): 1023–1032.
  • Pericchi, L. and Smith, A. (1992). “Exact and approximate posterior moments for a normal location parameter.” Journal of the Royal Statistical Society. Series B (Methodological), 54: 793–804.
  • Polson, N. G. and Scott, J. G. (2010). “Shrink globally, act locally: Sparse Bayesian regularization and prediction.” Bayesian Statistics, 9: 501–538.
  • Polson, N. G. and Scott, J. G. (2012). “On the half-Cauchy prior for a global scale parameter.” Bayesian Analysis, 7(4): 887–902.
  • Rissanen, J. (1983). “A universal prior for integers and estimation by minimum description length.” The Annals of Statistics, 11: 416–431.
  • Scott, J. G. and Berger, J. O. (2006). “An exploration of aspects of Bayesian multiple testing.” Journal of Statistical Planning and Inference, 136(7): 2144–2162.
  • Scott, J. G. and Berger, J. O. (2010). “Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem.” The Annals of Statistics, 38(5): 2587–2619.
  • Singh, D., Febbo, P. G., Ross, K., Jackson, D. G., Manola, J., Ladd, C., Tamayo, P., Renshaw, A. A., D’Amico, A. V., Richie, J. P., et al. (2002). “Gene expression correlates of clinical prostate cancer behavior.” Cancer Cell, 1(2): 203–209.
  • Stan Development Team (2014). “Stan: A C++ Library for Probability and Sampling, Version 2.2.” http://mc-stan.org/.
  • Stephens, M. and Balding, D. J. (2009). “Bayesian statistical methods for genetic association studies.” Nature Reviews Genetics, 10(10): 681–690.
  • Stranger, B. E., Stahl, E. A., and Raj, T. (2011). “Progress and promise of genome-wide association studies for human complex trait genetics.” Genetics, 187(2): 367–383.
  • Tibshirani, R. (1996). “Regression shrinkage and selection via the lasso.” Journal of the Royal Statistical Society (Series B), 58: 267–288.
  • van der Pas, S., Kleijn, B., and van der Vaart, A. (2014). “The horseshoe estimator: Posterior concentration around nearly black vectors.” Electronic Journal of Statistics, 8: 2585–2618.
  • van der Pas, S., Salomond, J.-B., and Schmidt-Hieber, J. (2016). “Conditions for posterior contraction in the sparse normal means problem.” Electronic Journal of Statistics, 10: 976–1000.

Supplemental materials