In the sparse sequence model, we consider a popular Bayesian multiple testing procedure and investigate for the first time its behaviour from the frequentist point of view. Given a spike-and-slab prior on the high-dimensional sparse unknown parameter, one can easily compute, for each coordinate, the posterior probability of coming from the spike; these probabilities correspond to the well-known local-fdr values, also called ℓ-values. The spike-and-slab weight parameter is calibrated in an empirical Bayes fashion, using marginal maximum likelihood. The multiple testing procedure under study, called here the cumulative ℓ-value procedure, ranks coordinates according to their empirical ℓ-values and thresholds so that the cumulative ranked sum does not exceed a user-specified level t. We validate the use of this method from the multiple testing perspective: for alternatives of appropriately large signal strength, the false discovery rate (FDR) of the procedure is shown to converge to the target level t, while its false negative rate (FNR) goes to 0. We complement this study by providing convergence rates for the method. Additionally, we prove that the q-value multiple testing procedure [44, 17] shares similar convergence rates in this model.
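The steps described above (ℓ-values, empirical Bayes weight, ranking and thresholding) can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: observations X_i = θ_i + N(0, 1) noise, a Gaussian slab N(0, `slab_sd`²) chosen only because its marginal has a closed form, a weight search restricted to [1/n, 1], and the "cumulative ranked sum" read as the running average of the sorted ℓ-values. All function names are hypothetical.

```python
import math

def normal_pdf(x, sd=1.0):
    """Density of N(0, sd^2) at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def l_values(xs, w, slab_sd):
    """Posterior probability that each coordinate comes from the spike.

    Assumption: unit-variance Gaussian noise and a Gaussian slab, so the
    marginal density under the slab is N(0, 1 + slab_sd^2).
    Intended for moderate |x|; extreme values can underflow.
    """
    marg_sd = math.sqrt(1.0 + slab_sd ** 2)
    out = []
    for x in xs:
        spike = (1.0 - w) * normal_pdf(x)
        slab = w * normal_pdf(x, marg_sd)
        out.append(spike / (spike + slab))
    return out

def fit_weight(xs, slab_sd, iters=100):
    """Marginal maximum likelihood estimate of the slab weight w.

    The log-likelihood is concave in w, so a ternary search suffices.
    The search interval [1/n, 1] is an assumption of this sketch.
    """
    marg_sd = math.sqrt(1.0 + slab_sd ** 2)
    null = [normal_pdf(x) for x in xs]
    alt = [normal_pdf(x, marg_sd) for x in xs]

    def loglik(w):
        return sum(math.log((1.0 - w) * a + w * b) for a, b in zip(null, alt))

    lo, hi = 1.0 / len(xs), 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if loglik(m1) < loglik(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def cumulative_l_value_rejections(ells, t):
    """Indices rejected when the running average of sorted l-values is <= t.

    The running average of ascending values is nondecreasing, so we can
    stop at the first violation.
    """
    order = sorted(range(len(ells)), key=lambda i: ells[i])
    rejected, running_sum = [], 0.0
    for k, i in enumerate(order, start=1):
        running_sum += ells[i]
        if running_sum / k > t:
            break
        rejected.append(i)
    return sorted(rejected)

# Illustrative data only: mostly noise, with a few large signals.
xs = [0.3, -1.1, 6.2, 0.8, -5.4, 0.1, 7.0, -0.6]
w_hat = fit_weight(xs, slab_sd=3.0)
ells = l_values(xs, w_hat, slab_sd=3.0)
print(cumulative_l_value_rejections(ells, t=0.1))
```

Under this reading, the procedure rejects as many of the smallest ℓ-values as possible while keeping their average, which estimates the posterior expected proportion of false discoveries, at or below t.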
This work has been supported by ANR-16-CE40-0019 (SansSouci), ANR-17-CE40-0001 (BASICS) and by the GDR ISIS through the “projets exploratoires” program (project TASTY). It was mostly completed while K.A. was at Université Paris-Saclay, supported by a public grant as part of the Investissement d’avenir project, reference ANR-11-LABX-0056-LMH, LabEx LMH.
We thank an associate editor and two anonymous referees for their insightful comments, which helped us improve the paper.
"Empirical Bayes cumulative ℓ-value multiple testing procedure for sparse sequences." Electron. J. Statist. 16 (1) 2033-2081, 2022. https://doi.org/10.1214/22-EJS1979