## Electronic Journal of Statistics

### Variability and stability of the false discovery proportion

#### Abstract

Much effort has been devoted to controlling the “false discovery rate” (FDR) when $m$ hypotheses are tested simultaneously. The FDR is the expectation of the “false discovery proportion” $\text{FDP}=V/R$, the ratio of the number of false rejections $V$ to the number of all rejections $R$. In this paper, we take a closer look at the FDP for adaptive linear step-up multiple tests. These tests extend the well-known Benjamini and Hochberg test by estimating the unknown number $m_{0}$ of true null hypotheses. We give exact finite sample formulas for higher moments of the FDP and, in particular, for its variance. These formulas allow a precise discussion of the stability of the FDP, i.e., of when the FDP is asymptotically close to its mean. We present sufficient and necessary conditions for this stability. They include the presence of a stable estimator for the proportion $m_{0}/m$. We apply our results to convex combinations of generalized Storey type estimators with various tuning parameters and (possibly) data-driven weights. The corresponding step-up tests allow a flexible adaptation. Moreover, these tests control the FDR at finite sample size. We compare these tests to the classical Benjamini and Hochberg test and discuss their advantages.
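The quantities named in the abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the procedure analyzed in the paper: it uses a single Storey-type estimator with the common "+1" correction and one tuning parameter `lam` (the paper treats convex combinations of generalized estimators), independent p-values, and an arbitrary alternative distribution `U ** 6`. All function names and parameter values are hypothetical choices made for illustration.

```python
import random
import statistics


def storey_m0(pvals, lam=0.5):
    """Storey-type estimate of m0: (#{p_i > lam} + 1) / (1 - lam)."""
    above = sum(1 for p in pvals if p > lam)
    return (above + 1) / (1.0 - lam)


def step_up(pvals, alpha, m0_hat, cap=1.0):
    """Linear step-up test: reject the R smallest p-values, where
    R = max{i : p_(i) <= min(cap, i * alpha / m0_hat)}.
    With m0_hat = m and cap = 1 this is the Benjamini-Hochberg test."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    r = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= min(cap, rank * alpha / m0_hat):
            r = rank
    return set(order[:r])


def fdp(rejected, true_nulls):
    """False discovery proportion V / R with the convention 0 / 0 = 0."""
    if not rejected:
        return 0.0
    return len(rejected & true_nulls) / len(rejected)


def simulate(m=200, m0=150, alpha=0.1, lam=0.5, reps=300, seed=1):
    """Monte Carlo mean (an FDR estimate) and variance of the FDP."""
    rng = random.Random(seed)
    true_nulls = set(range(m0))
    out = {"BH": [], "adaptive": []}
    for _ in range(reps):
        # True nulls: uniform p-values. Alternatives: U ** 6 (assumed).
        pvals = [rng.random() for _ in range(m0)]
        pvals += [rng.random() ** 6 for _ in range(m - m0)]
        out["BH"].append(fdp(step_up(pvals, alpha, m), true_nulls))
        m0_hat = storey_m0(pvals, lam)
        rejected = step_up(pvals, alpha, m0_hat, cap=lam)
        out["adaptive"].append(fdp(rejected, true_nulls))
    return {k: (statistics.mean(v), statistics.pvariance(v))
            for k, v in out.items()}


if __name__ == "__main__":
    for name, (mean, var) in simulate().items():
        print(f"{name}: mean FDP = {mean:.4f}, variance = {var:.6f}")
```

The empirical mean of the FDP estimates the FDR, while the empirical variance gives a rough impression of how tightly the FDP concentrates around it in this particular configuration.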

#### Article information

Source
Electron. J. Statist., Volume 13, Number 1 (2019), 882-910.

Dates
First available in Project Euclid: 26 March 2019

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1553565707

Digital Object Identifier
doi:10.1214/19-EJS1544

Zentralblatt MATH identifier
07056143

Subjects
Primary: 62G10: Hypothesis testing
Secondary: 62G20: Asymptotic properties

#### Citation

Ditzhaus, Marc; Janssen, Arnold. Variability and stability of the false discovery proportion. Electron. J. Statist. 13 (2019), no. 1, 882–910. doi:10.1214/19-EJS1544. https://projecteuclid.org/euclid.ejs/1553565707

#### References

• [1] Abramovich, F., Benjamini, Y., Donoho, D. L. and Johnstone, I. M. (2006). Adapting to unknown sparsity by controlling the false discovery rate. Ann. Statist. 34, 584–653.
• [2] Benditkis, J., Heesen, P. and Janssen, A. (2018). The false discovery rate (FDR) of multiple tests in a class room lecture. Statist. Probab. Lett. 134, 29–35.
• [3] Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57, 289–300.
• [4] Benjamini, Y. and Hochberg, Y. (2000). On the adaptive control of the false discovery rate in multiple testing with independent statistics. J. Educ. Behav. Statist. 25, 60–83.
• [5] Benjamini, Y., Krieger, A. M. and Yekutieli, D. (2006). Adaptive linear step-up procedures that control the false discovery rate. Biometrika 93, 491–507.
• [6] Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependence. Ann. Statist. 29, 1165–1188.
• [7] Blanchard, G. and Roquain, E. (2008). Two simple sufficient conditions for FDR control. Electron. J. Stat. 2, 963–992.
• [8] Blanchard, G. and Roquain, E. (2009). Adaptive false discovery rate control under independence and dependence. J. Mach. Learn. Res. 10, 2837–2871.
• [9] Blanchard, G., Dickhaus, T., Roquain, E. and Villers, F. (2014). On least favorable configurations for step-up-down-tests. Statist. Sinica 24, 1–23.
• [10] Chi, Z. (2007). On the performance of FDR control: constraints and a partial solution. Ann. Statist. 35, 1409–1431.
• [11] Chi, Z. and Tan, Z. (2008). Positive false discovery proportions: intrinsic bounds and adaptive control. Statist. Sinica 18, 837–860.
• [12] Consul, P.C. and Famoye, F. (2006). Lagrangian probability distributions. Birkhäuser Boston, Inc., Boston, MA.
• [13] Fan, J., Han, X. and Gu, W. (2012). Estimating false discovery proportion under arbitrary covariance dependence. J. Amer. Statist. Assoc. 107, 1019–1035.
• [14] Feller, W. (1968). An introduction to probability theory and its applications Vol. I. Third edition. Wiley & Sons.
• [15] Ferreira, J. A. and Zwinderman, A. H. (2006). On the Benjamini-Hochberg method. Ann. Statist. 34, 1827–1849.
• [16] Finner, H., Dickhaus, T. and Roters, M. (2009). On the false discovery rate and an asymptotically optimal rejection curve. Ann. Statist. 37, 596–618.
• [17] Finner, H. and Gontscharuk, V. (2009). Controlling the familywise error rate with plug-in estimator for the proportion of true null hypotheses. J. R. Stat. Soc. Ser. B Stat. Methodol. 71, 1031–1048.
• [18] Finner, H., Kern, P. and Scheer, M. (2015). On some compound distributions with Borel summands. Insurance Math. Econom. 62, 234–244.
• [19] Finner, H. and Roters, M. (2001). On the false discovery rate and expected type I errors. Biom. J. 43, 985–1005.
• [20] Genovese, C. and Wasserman, L. (2004). A stochastic process approach to false discovery control. Ann. Statist. 32, 1035–1061.
• [21] Gontscharuk, V. (2010). Asymptotic and exact results on FWER and FDR in multiple hypothesis testing. Ph.D. thesis, Heinrich-Heine University Düsseldorf. https://docserv.uni-duesseldorf.de/servlets/DocumentServlet?id=16990
• [22] Heesen, P. (2014). Adaptive step-up tests for the false discovery rate (FDR) under independence and dependence. Ph.D. thesis, Heinrich-Heine University Düsseldorf. https://docserv.uni-duesseldorf.de/servlets/DocumentServlet?id=33047
• [23] Heesen, P. and Janssen, A. (2015). Inequalities for the false discovery rate (FDR) under dependence. Electron. J. Stat. 9, 679–716.
• [24] Heesen, P. and Janssen, A. (2016). Dynamic adaptive multiple tests with finite sample FDR control. J. Statist. Plann. Inference 168, 38–51.
• [25] Jain, G.C. (1975). A linear function Poisson distribution. Biom. Z. 17, 501–506.
• [26] Liang, K. and Nettleton, D. (2012). Adaptive and dynamic adaptive procedures for false discovery rate control and estimation. J. R. Stat. Soc. Ser. B Stat. Methodol. 74, 163–182.
• [27] Meinshausen, N. and Bühlmann, P. (2005). Lower bounds for the number of false null hypotheses for multiple testing of associations under general dependence structures. Biometrika 92, 893–907.
• [28] Meinshausen, N. and Rice, J. (2006). Estimating the proportion of false null hypotheses among a large number of independently tested hypotheses. Ann. Statist. 34, 373–393.
• [29] Neuvial, P. (2008). Asymptotic properties of false discovery rate controlling procedures under independence. Electron. J. Stat. 2, 1065–1110. Corrigendum 3, 1083.
• [30] Owen, A. B. (2005). Variance of the number of false discoveries. J. R. Stat. Soc. Ser. B Stat. Methodol. 67, 411–426.
• [31] Roquain, E. and Villers, F. (2011). Exact calculations for false discovery proportion with application to least favorable configurations. Ann. Statist. 39, 584–612.
• [32] Sarkar, S. K. (2008). On methods controlling the false discovery rate. Sankhyā 70, 135–168.
• [33] Sarkar, S. K., Guo, W. and Finner, H. (2012). On adaptive procedures controlling the familywise error rate. J. Statist. Plann. Inference 142, 65–78.
• [34] Scheer, M. (2012). Controlling the number of false rejections in multiple hypotheses testing. Ph.D. thesis, Heinrich-Heine University Düsseldorf.
• [35] Schwartzman, A. and Lin, X. (2011). The effect of correlation in false discovery rate estimation. Biometrika 98, 199–214.
• [36] Schweder, T. and Spjøtvoll, E. (1982). Plots of p-values to evaluate many tests simultaneously. Biometrika 69, 493–502.
• [37] Shorack, G.R. and Wellner, J.A. (2009). Empirical Processes with Applications to Statistics. Society for Industrial and Applied Mathematics, Philadelphia.
• [38] Storey, J. D. (2002). A direct approach to false discovery rates. J. R. Stat. Soc. Ser. B Stat. Methodol. 64, 479–498.
• [39] Storey, J. D., Taylor, J. E. and Siegmund, D. (2004). Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach. J. R. Stat. Soc. Ser. B Stat. Methodol. 66, 187–205.
• [40] Storey, J. D. and Tibshirani, R. (2003). Statistical significance for genomewide studies. PNAS 100, 9440–9445.
• [41] Yekutieli, D. and Benjamini, Y. (1999). Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics. J. Statist. Plann. Inference 82, 171–196.
• [42] Zeisel, A., Zuk, O. and Domany, E. (2011). FDR control with adaptive procedures and FDR monotonicity. Ann. Appl. Stat. 5, 943–968.