## Electronic Journal of Statistics

### Non-marginal decisions: A novel Bayesian multiple testing procedure

#### Abstract

In this paper, we consider the problem of multiple testing where the hypotheses are dependent. In most of the existing literature, whether Bayesian or non-Bayesian, the decision rules focus mainly on the validity of the test procedure rather than on using the dependence to increase efficiency. Moreover, the decisions regarding different hypotheses are marginal, in the sense that they do not depend upon each other directly. In realistic situations, however, the hypotheses are usually dependent, and hence it is desirable that the decisions regarding dependent hypotheses be made jointly.

In this article, we develop a novel Bayesian multiple testing procedure that coherently takes this requirement into consideration. Our method, which is based on new notions of error and non-error terms, substantially enhances efficiency by judicious exploitation of the dependence structure among the hypotheses. We show that our method minimizes the posterior expected loss associated with an additive “0-1” loss function; we also prove theoretical results on the relevant error probabilities, establishing the coherence and usefulness of our method. The optimal decision configuration is not available in closed form, so we propose an efficient simulated annealing algorithm for the optimization; the algorithm is also generically applicable to binary optimization problems.

Extensive simulation studies indicate that, in dependent situations, our method performs significantly better than several popular existing multiple testing methods in terms of accuracy and power control. Moreover, application of our ideas to a real spatial data set on radionuclide concentration in Rongelap islands yielded insightful results.
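The abstract describes searching for an optimal binary decision configuration by simulated annealing. As a rough illustration only, the following is a generic sketch of simulated annealing over binary vectors with single-bit-flip proposals and a geometric cooling schedule. It is not the authors' algorithm: the function `simulated_annealing_binary`, the vector `post` of made-up posterior probabilities, the threshold `beta`, the `coupling` weight, and the objective `toy_objective` are all hypothetical names and values chosen for this example, and the objective is not the loss function of the paper. The neighbour-agreement term is included only so that the best value of one coordinate depends on the others.

```python
import math
import random


def simulated_annealing_binary(objective, m, n_iter=5000, t0=1.0,
                               cooling=0.999, seed=0):
    """Approximately maximize objective(d) over d in {0, 1}^m.

    Generic sketch: single-bit-flip proposals, Metropolis acceptance,
    geometric cooling. Returns the best configuration visited.
    """
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(m)]
    current_val = objective(current)
    best, best_val = current[:], current_val
    temp = t0
    for _ in range(n_iter):
        # Propose flipping one randomly chosen coordinate.
        i = rng.randrange(m)
        proposal = current[:]
        proposal[i] = 1 - proposal[i]
        prop_val = objective(proposal)
        delta = prop_val - current_val
        # Always accept improvements; accept a worse move
        # with probability exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_val = proposal, prop_val
            if current_val > best_val:
                best, best_val = current[:], current_val
        temp *= cooling  # geometric cooling schedule
    return best, best_val


# Hypothetical inputs: made-up "posterior probabilities" for 8 hypotheses.
post = [0.9, 0.8, 0.45, 0.7, 0.2, 0.1, 0.55, 0.3]


def toy_objective(d, beta=0.5, coupling=0.2):
    """Toy non-separable objective (an assumption, not the paper's loss)."""
    # Reward rejecting hypothesis i when its made-up probability exceeds beta.
    marginal = sum(di * (pi - beta) for di, pi in zip(d, post))
    # Reward neighbouring decisions that agree, which couples the coordinates.
    agreement = sum(1 for a, b in zip(d, d[1:]) if a == b)
    return marginal + coupling * agreement


best, best_val = simulated_annealing_binary(toy_objective, m=len(post))
print(best, best_val)
```

Tracking the best configuration visited, rather than returning the final state, is a common design choice: it makes the result insensitive to a late downhill move. With only 8 coordinates the optimum could equally be found by enumerating all 256 configurations; annealing becomes relevant when the number of hypotheses makes enumeration infeasible.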

#### Article information

Source
Electron. J. Statist., Volume 13, Number 1 (2019), 489-535.

Dates
First available in Project Euclid: 14 February 2019

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1550134833

Digital Object Identifier
doi:10.1214/19-EJS1535

#### Citation

Chandra, Noirrit Kiran; Bhattacharya, Sourabh. Non-marginal decisions: A novel Bayesian multiple testing procedure. Electron. J. Statist. 13 (2019), no. 1, 489–535. doi:10.1214/19-EJS1535. https://projecteuclid.org/euclid.ejs/1550134833
