Bayesian Analysis

Asymptotic Properties of Bayes Risk of a General Class of Shrinkage Priors in Multiple Hypothesis Testing Under Sparsity

Prasenjit Ghosh, Xueying Tang, Malay Ghosh, and Arijit Chakrabarti

Full-text: Open access

Abstract

Consider the problem of simultaneous testing for the means of independent normal observations. In this paper, we study some asymptotic optimality properties of certain multiple testing rules induced by a general class of one-group shrinkage priors in a Bayesian decision theoretic framework, where the overall loss is taken as the number of misclassified hypotheses. We assume a two-groups normal mixture model for the data and consider the asymptotic framework adopted in Bogdan et al. (2011), who introduced the notion of asymptotic Bayes optimality under sparsity in the context of multiple testing. The general class of one-group priors under study is rich enough to include, among others, the families of three parameter beta and generalized double Pareto priors, and in particular the horseshoe, the normal–exponential–gamma, and the Strawderman–Berger priors. We establish that, within our chosen asymptotic framework, the multiple testing rules under study asymptotically attain the risk of the Bayes Oracle up to a multiplicative factor, with the constant in the risk close to the constant in the Oracle risk. This is similar to a result obtained in Datta and Ghosh (2013) for the multiple testing rule based on the horseshoe estimator introduced in Carvalho et al. (2009, 2010). We further show that, under a very mild assumption on the underlying sparsity parameter, the induced decisions based on an empirical Bayes estimate of the corresponding global shrinkage parameter proposed by van der Pas et al. (2014) asymptotically attain the optimal Bayes risk up to the same multiplicative factor. We provide a unifying argument applicable to the general class of priors under study. In the process, we settle a conjecture made in Datta and Ghosh (2013) regarding the optimality property of the generalized double Pareto priors. Our work also shows that the result in Datta and Ghosh (2013) can be improved further.
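
For readers not already familiar with this literature, the following is a minimal sketch (not part of the published abstract) of the setup the abstract refers to; the unit error variance and the half-thresholding rule of Carvalho et al. (2010) are assumptions made here for illustration, and the paper's own framework, loss constants, and conditions on the priors are more general. The two-groups model takes

\[
  X_i \mid \mu_i \sim N(\mu_i, 1), \qquad
  \mu_i \overset{\mathrm{iid}}{\sim} (1 - p)\,\delta_{\{0\}} + p\,N(0, \psi^2), \qquad i = 1, \dots, n,
\]

with the overall loss equal to the number of misclassified hypotheses among the tests of \(H_{0i} : \mu_i = 0\) against \(H_{1i} : \mu_i \neq 0\). A one-group (global-local) shrinkage prior instead models

\[
  \mu_i \mid \lambda_i, \tau \sim N(0, \lambda_i^2 \tau^2), \qquad
  \lambda_i^2 \sim \pi(\lambda_i^2), \qquad \tau > 0,
\]

so that, writing \(\kappa_i = 1/(1 + \lambda_i^2 \tau^2)\) for the shrinkage factor, the posterior mean is \(\mathbb{E}[\mu_i \mid X_i] = \{1 - \mathbb{E}[\kappa_i \mid X_i]\}\,X_i\), and the induced testing rule rejects \(H_{0i}\) whenever \(1 - \mathbb{E}[\kappa_i \mid X_i] > 1/2\).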

Article information

Source
Bayesian Anal., Volume 11, Number 3 (2016), 753–796.

Dates
First available in Project Euclid: 16 September 2015

Permanent link to this document
https://projecteuclid.org/euclid.ba/1442364340

Digital Object Identifier
doi:10.1214/15-BA973

Mathematical Reviews number (MathSciNet)
MR3498045

Zentralblatt MATH identifier
1359.62309

Subjects
Primary: 62C10: Bayesian problems; characterization of Bayes procedures

Keywords
asymptotic optimality; Bayes Oracle; empirical Bayes; generalized double Pareto; three parameter beta; horseshoe; normal–exponential–gamma; Strawderman–Berger

Citation

Ghosh, Prasenjit; Tang, Xueying; Ghosh, Malay; Chakrabarti, Arijit. Asymptotic Properties of Bayes Risk of a General Class of Shrinkage Priors in Multiple Hypothesis Testing Under Sparsity. Bayesian Anal. 11 (2016), no. 3, 753–796. doi:10.1214/15-BA973. https://projecteuclid.org/euclid.ba/1442364340


References

  • Armagan, A., Dunson, D. B., and Clyde, M. (2011). “Generalized Beta Mixtures of Gaussians”. In: Shawe-Taylor, J., Zemel, R. S., Bartlett, P. L., Pereira, F. C. N., and Weinberger, K. Q. (eds.), Advances in Neural Information Processing Systems, volume 24, 523–531.
  • Armagan, A., Dunson, D. B., and Lee, J. (2013). “Generalized Double Pareto Shrinkage”. Statistica Sinica, 23(1): 119–143.
  • Benjamini, Y. and Hochberg, Y. (1995). “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing”. Journal of the Royal Statistical Society, Series B, 57(1): 289–300.
  • Bhattacharya, A., Pati, D., Pillai, N., and Dunson, D. B. (2012). “Bayesian Shrinkage”. arXiv:1212.6088v1.
  • — (2014). “Dirichlet-Laplace priors for optimal shrinkage”. arXiv:1401.5398v1.
  • Bingham, N. H., Goldie, C. M., and Teugels, J. L. (1987). Regular Variation. Encyclopedia of Mathematics and Its Applications. Cambridge: Cambridge University Press.
  • Bogdan, M., Chakrabarti, A., Frommlet, F., and Ghosh, J. K. (2011). “Asymptotic Bayes-Optimality under Sparsity of Some Multiple Testing Procedures”. The Annals of Statistics, 39(3): 1551–1579.
  • Bogdan, M., Ghosh, J. K., and Tokdar, S. T. (2008). “A comparison of the Benjamini–Hochberg procedure with some Bayesian rules for multiple testing”. In: Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen, IMS Collections, volume 1, 211–230. Beachwood, OH: Institute of Mathematical Statistics.
  • Cai, T. T. and Jin, J. (2010). “Optimal Rates of Convergence for Estimating the Null Density and Proportion of Nonnull Effects in Large-Scale Multiple Comparisons”. The Annals of Statistics, 38(1): 100–145.
  • Cai, T. T., Jin, J., and Low, M. G. (2007). “Estimation and confidence sets for sparse normal mixtures”. The Annals of Statistics, 35(6): 2421–2449.
  • Carvalho, C., Polson, N., and Scott, J. (2009). “Handling sparsity via the horseshoe”. Journal of Machine Learning Research W&CP, 5: 73–80.
  • — (2010). “The horseshoe estimator for sparse signals”. Biometrika, 97(2): 465–480.
  • Datta, J. and Ghosh, J. K. (2013). “Asymptotic Properties of Bayes Risk for the Horseshoe Prior”. Bayesian Analysis, 8(1): 111–132.
  • Donoho, D. and Jin, J. (2004). “Higher criticism for detecting sparse heterogeneous mixtures”. The Annals of Statistics, 32(3): 962–994.
  • Efron, B. (2004). “Large-scale simultaneous hypothesis testing: The choice of a null hypothesis”. Journal of the American Statistical Association, 99(465): 96–104.
  • — (2008). “Microarrays, Empirical Bayes and the two-groups Model”. Statistical Science, 23(1): 1–22.
  • Gelman, A. (2006). “Prior distributions for variance parameters in hierarchical models”. Bayesian Analysis, 1(3): 515–533.
  • Ghosh, P. and Chakrabarti, A. (2015). “Posterior Concentration Properties of a General Class of Shrinkage Priors around Nearly Black Vectors”. arXiv:1412.8161v4.
  • Griffin, J. E. and Brown, P. J. (2005). “Alternative prior distributions for variable selection with very many more variables than observations”. Technical report, University of Warwick.
  • — (2010). “Inference with normal–gamma prior distributions in regression problems”. Bayesian Analysis, 5(1): 171–188.
  • — (2012). “Structuring shrinkage: some correlated priors for regression”. Biometrika, 99(2): 481–487.
  • — (2013). “Some priors for sparse regression modeling”. Bayesian Analysis, 8(3): 691–702.
  • Hans, C. (2009). “Bayesian lasso regression”. Biometrika, 96(4): 835–845.
  • Hoeffding, W. (1963). “Probability inequalities for sums of bounded random variables”. Journal of the American Statistical Association, 58: 13–30.
  • Ingster, Y. I. (1997). “Some problems of hypothesis testing leading to infinitely divisible distributions”. Mathematical Methods of Statistics, 6(1): 47–69.
  • Meinshausen, N. and Rice, J. (2006). “Estimating the proportion of false null hypotheses among a large number of independently tested hypotheses”. The Annals of Statistics, 34(1): 373–393.
  • Mitchell, T. and Beauchamp, J. (1988). “Bayesian variable selection in linear regression (with discussion)”. Journal of the American Statistical Association, 83(404): 1023–1036.
  • Park, T. and Casella, G. (2008). “The Bayesian lasso”. Journal of the American Statistical Association, 103(482): 681–686.
  • Pati, D., Bhattacharya, A., Pillai, N., and Dunson, D. (2014). “Posterior contraction in sparse Bayesian factor models for massive covariance matrices”. The Annals of Statistics, 42(3): 1102–1130.
  • Polson, N. G. and Scott, J. G. (2011). “Shrink Globally, Act Locally: Sparse Bayesian Regularization and Prediction”. In: Bayesian Statistics 9, Proceedings of the 9th Valencia International Meeting, 501–538. Oxford University Press.
  • — (2012). “On the Half-Cauchy Prior for a Global Scale Parameter”. Bayesian Analysis, 7(4): 887–902.
  • Scott, J. and Berger, J. O. (2006). “An exploration of aspects of Bayesian multiple testing”. Journal of Statistical Planning and Inference, 136(7): 2144–2162.
  • — (2010). “Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem”. The Annals of Statistics, 38(5): 2587–2619.
  • Scott, J. G. (2011). “Bayesian estimation of intensity surfaces on the sphere via needlet shrinkage and selection”. Bayesian Analysis, 6(2): 307–327.
  • Storey, J. D. (2007). “The optimal discovery procedure: a new approach to simultaneous significance testing”. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(3): 347–368.
  • Tipping, M. (2001). “Sparse Bayesian learning and the Relevance Vector Machine”. Journal of Machine Learning Research, 1: 211–244.
  • van der Pas, S. L., Kleijn, B. J. K., and van der Vaart, A. W. (2014). “The horseshoe estimator: Posterior concentration around nearly black vectors”. Electronic Journal of Statistics, 8: 2585–2618.