Variability and stability of the false discovery proportion

Marc Ditzhaus; Arnold Janssen

doi:10.1214/19-EJS1544

Electron. J. Statist. 13(1): 882-910 (2019). DOI: 10.1214/19-EJS1544

Abstract

Much effort has been devoted to controlling the “false discovery rate” (FDR) when $m$ hypotheses are tested simultaneously. The FDR is the expectation of the “false discovery proportion” $\text{FDP}=V/R$, the ratio of the number of false rejections $V$ to the number of all rejections $R$. In this paper, we take a closer look at the FDP for adaptive linear step-up multiple tests. These tests extend the well-known Benjamini and Hochberg test by estimating the unknown number $m_{0}$ of true null hypotheses. We give exact finite sample formulas for higher moments of the FDP and, in particular, for its variance. These formulas allow a precise discussion of the stability of the FDP, i.e., of when the FDP is asymptotically close to its mean. We present sufficient and necessary conditions for this stability. They include the presence of a stable estimator for the proportion $m_{0}/m$. We apply our results to convex combinations of generalized Storey type estimators with various tuning parameters and (possibly) data-driven weights. The corresponding step-up tests allow a flexible adaptation. Moreover, these tests control the FDR at finite sample size. We compare these tests to the classical Benjamini and Hochberg test and discuss their advantages.
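To make the objects in the abstract concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes the textbook form of the linear step-up test (reject the $k$ smallest p-values, where $k$ is the largest rank $i$ with $p_{(i)} \le i\alpha/\hat{m}_0$) and a plain Storey-type estimate $\hat{m}_0 = (\#\{p_i > \lambda\} + 1)/(1-\lambda)$. The function names, the Beta(0.2, 1) distribution for p-values under the alternative, and all default parameters are hypothetical choices for illustration. The generalized estimators, the data-driven weights, and the conditions for finite sample FDR control studied in the paper are not reproduced here.

```python
import random
import statistics


def storey_m0_estimate(pvalues, lam=0.5):
    """Plain Storey-type estimate of m0: (#{p > lam} + 1) / (1 - lam).

    Assumed textbook form; it is not capped at m in this sketch.
    """
    above = sum(1 for p in pvalues if p > lam)
    return (above + 1) / (1.0 - lam)


def linear_step_up(pvalues, alpha, m0_hat=None):
    """Return the set of indices rejected by a linear step-up test.

    With m0_hat=None the critical values are i * alpha / m
    (Benjamini and Hochberg). Otherwise m is replaced by m0_hat.
    """
    m = len(pvalues)
    denom = m if m0_hat is None else m0_hat
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose ordered p-value lies below its critical value
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / denom:
            k = rank
    return set(order[:k])


def fdp(rejected, true_nulls):
    """False discovery proportion V / R, set to 0 when nothing is rejected."""
    if not rejected:
        return 0.0
    return len(rejected & true_nulls) / len(rejected)


def simulate_fdp(m=200, m0=150, alpha=0.1, lam=0.5, reps=500, seed=0):
    """Simulate the FDP of the adaptive test under independence.

    True nulls have uniform p-values. Alternatives use Beta(0.2, 1),
    a hypothetical choice that concentrates mass near zero.
    """
    rng = random.Random(seed)
    true_nulls = set(range(m0))
    fdps = []
    for _ in range(reps):
        pvalues = [rng.random() for _ in range(m0)]
        pvalues += [rng.betavariate(0.2, 1.0) for _ in range(m - m0)]
        m0_hat = storey_m0_estimate(pvalues, lam)
        rejected = linear_step_up(pvalues, alpha, m0_hat)
        fdps.append(fdp(rejected, true_nulls))
    return fdps


if __name__ == "__main__":
    fdps = simulate_fdp()
    # The sample mean approximates the FDR; the sample variance
    # indicates how far single realisations of the FDP spread around it.
    print("mean FDP:", statistics.mean(fdps))
    print("variance of FDP:", statistics.pvariance(fdps))
```

Running the simulation for increasing `m` gives a rough empirical impression of whether the FDP concentrates around its mean in a given configuration. It is only a Monte Carlo illustration and does not replace the exact moment formulas of the paper.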

Creative Commons Attribution 4.0 International License.

Marc Ditzhaus and Arnold Janssen "Variability and stability of the false discovery proportion," Electronic Journal of Statistics 13(1), 882-910, (2019). https://doi.org/10.1214/19-EJS1544

Received: 1 August 2018; Published: 2019
