The Annals of Statistics

A unified treatment of multiple testing with prior knowledge using the p-filter

Aaditya K. Ramdas, Rina F. Barber, Martin J. Wainwright, and Michael I. Jordan



There is a significant literature on methods for incorporating prior knowledge into multiple testing procedures so as to improve their power and precision. Some common forms of prior knowledge include (a) beliefs about which hypotheses are null, modeled by nonuniform prior weights; (b) differing importances of hypotheses, modeled by differing penalties for false discoveries; (c) multiple arbitrary partitions of the hypotheses into (possibly overlapping) groups; and (d) knowledge of independence, positive dependence or arbitrary dependence between hypotheses or groups, suggesting the use of more aggressive or conservative procedures. We present a unified algorithmic framework called p-filter for global null testing and false discovery rate (FDR) control that allows the scientist to incorporate all four types of prior knowledge (a)–(d) simultaneously, recovering a variety of known algorithms as special cases.
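Two of the best-known procedures recovered as special cases of this framework are the Simes test of the global null and the Benjamini–Hochberg (BH) step-up procedure for FDR control. The following is a minimal NumPy sketch of these two baseline procedures for illustration; it is not the paper's p-filter algorithm itself, which additionally handles weights, penalties, overlapping groups and dependence adjustments.

```python
import numpy as np

def simes_test(pvals, alpha=0.1):
    """Simes test of the global null: reject if the k-th smallest
    p-value satisfies p_(k) <= alpha * k / n for some k."""
    sorted_p = np.sort(np.asarray(pvals, dtype=float))
    n = len(sorted_p)
    thresholds = alpha * np.arange(1, n + 1) / n
    return bool(np.any(sorted_p <= thresholds))

def benjamini_hochberg(pvals, alpha=0.1):
    """BH step-up procedure: let k* be the largest k with
    p_(k) <= alpha * k / n, and reject the k* smallest p-values."""
    pvals = np.asarray(pvals, dtype=float)
    n = len(pvals)
    order = np.argsort(pvals)
    sorted_p = pvals[order]
    below = sorted_p <= alpha * np.arange(1, n + 1) / n
    k_star = int(np.nonzero(below)[0].max()) + 1 if below.any() else 0
    rejected = np.zeros(n, dtype=bool)
    rejected[order[:k_star]] = True  # reject the k* smallest p-values
    return rejected
```

For example, with p-values (0.001, 0.02, 0.03, 0.5) at level alpha = 0.1, the Simes test rejects the global null and BH rejects the three smallest p-values, since p_(3) = 0.03 <= 0.1 * 3/4 but p_(4) = 0.5 > 0.1.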

Article information

Ann. Statist., Volume 47, Number 5 (2019), 2790-2821.

Received: April 2017
Revised: September 2018
First available in Project Euclid: 3 August 2019


Primary: 62J15: Paired and multiple comparisons 60G10: Stationary processes
Secondary: 62F03: Hypothesis testing

Keywords: multiple testing; false discovery rate; prior knowledge; Simes; Benjamini–Hochberg–Yekutieli; adaptivity; group FDR


Ramdas, Aaditya K.; Barber, Rina F.; Wainwright, Martin J.; Jordan, Michael I. A unified treatment of multiple testing with prior knowledge using the p-filter. Ann. Statist. 47 (2019), no. 5, 2790--2821. doi:10.1214/18-AOS1765.



  • [1] Barber, R. F. and Ramdas, A. (2017). The $p$-filter: Multilayer false discovery rate control for grouped hypotheses. J. R. Stat. Soc. Ser. B. Stat. Methodol. 79 1247–1268.
  • [2] Benjamini, Y. and Bogomolov, M. (2014). Selective inference on multiple families of hypotheses. J. R. Stat. Soc. Ser. B. Stat. Methodol. 76 297–318.
  • [3] Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57 289–300.
  • [4] Benjamini, Y. and Hochberg, Y. (1997). Multiple hypotheses testing with weights. Scand. J. Stat. 24 407–418.
  • [5] Benjamini, Y. and Hochberg, Y. (2000). On the adaptive control of the false discovery rate in multiple testing with independent statistics. J. Educ. Behav. Stat. 25 60–83.
  • [6] Benjamini, Y., Krieger, A. M. and Yekutieli, D. (2006). Adaptive linear step-up procedures that control the false discovery rate. Biometrika 93 491–507.
  • [7] Benjamini, Y. and Liu, W. (1999). A distribution-free multiple test procedure that controls the false discovery rate. Technical report, Tel Aviv Univ.
  • [8] Benjamini, Y. and Liu, W. (1999). A step-down multiple hypotheses testing procedure that controls the false discovery rate under independence. J. Statist. Plann. Inference 82 163–170.
  • [9] Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Ann. Statist. 29 1165–1188.
  • [10] Blanchard, G. and Roquain, E. (2008). Two simple sufficient conditions for FDR control. Electron. J. Stat. 2 963–992.
  • [11] Blanchard, G. and Roquain, É. (2009). Adaptive false discovery rate control under independence and dependence. J. Mach. Learn. Res. 10 2837–2871.
  • [12] Bogomolov, M., Peterson, C. B., Benjamini, Y. and Sabatti, C. (2017). Testing hypotheses on a tree: New error rates and controlling strategies. Preprint. Available at arXiv:1705.07529.
  • [13] Brzyski, D., Peterson, C. B., Sobczyk, P., Candès, E. J., Bogdan, M. and Sabatti, C. (2017). Controlling the rate of GWAS false discoveries. Genetics 205 61–75.
  • [14] Gabriel, K. R. (1969). Simultaneous test procedures—Some theory of multiple comparisons. Ann. Math. Stat. 40 224–250.
  • [15] Gavrilov, Y., Benjamini, Y. and Sarkar, S. K. (2009). An adaptive step-down procedure with proven FDR control under independence. Ann. Statist. 37 619–629.
  • [16] Genovese, C. R., Roeder, K. and Wasserman, L. (2006). False discovery control with $p$-value weighting. Biometrika 93 509–524.
  • [17] Heard, N. A. and Rubin-Delanchy, P. (2018). Choosing between methods of combining $p$-values. Biometrika 105 239–246.
  • [18] Hochberg, Y. and Benjamini, Y. (1990). More powerful procedures for multiple significance testing. Stat. Med. 9 811–818.
  • [19] Hochberg, Y. and Liberman, U. (1994). An extended Simes’ test. Statist. Probab. Lett. 21 101–105.
  • [20] Hu, J. X., Zhao, H. and Zhou, H. H. (2010). False discovery rate control with groups. J. Amer. Statist. Assoc. 105 1215–1227.
  • [21] Javanmard, A. and Montanari, A. (2018). Online rules for control of false discovery rate and false discovery exceedance. Ann. Statist. 46 526–554.
  • [22] Katsevich, E. and Ramdas, A. (2018). Towards “simultaneous selective inference”: Post-hoc bounds on the false discovery proportion. Preprint. Available at arXiv:1803.06790.
  • [23] Katsevich, E. and Sabatti, C. (2019). Multilayer knockoff filter: Controlled variable selection at multiple resolutions. Ann. Appl. Stat. 13 1–33.
  • [24] Katsevich, E., Sabatti, C. and Bogomolov, M. (2018). Controlling FDR while highlighting distinct discoveries. Preprint. Available at arXiv:1809.01792.
  • [25] Lehmann, E. L. (1966). Some concepts of dependence. Ann. Math. Stat. 37 1137–1153.
  • [26] Peterson, C. B., Bogomolov, M., Benjamini, Y. and Sabatti, C. (2016). Many phenotypes without many false discoveries: Error controlling strategies for multitrait association studies. Genet. Epidemiol. 40 45–56.
  • [27] Ramdas, A., Chen, J., Wainwright, M. J. and Jordan, M. I. (2017). QuTE: Decentralized multiple testing on sensor networks with false discovery rate control. In 2017 IEEE 56th Annual Conference on Decision and Control (CDC) 6415–6421. IEEE, New York.
  • [28] Ramdas, A., Chen, J., Wainwright, M. J. and Jordan, M. I. (2019). A sequential algorithm for false discovery rate control on directed acyclic graphs. Biometrika 106 69–86.
  • [29] Ramdas, A., Yang, F., Wainwright, M. J. and Jordan, M. I. (2017). Online control of the false discovery rate with decaying memory. In Advances in Neural Information Processing Systems 5655–5664.
  • [30] Ramdas, A., Zrnic, T., Wainwright, M. J. and Jordan, M. I. (2018). SAFFRON: An adaptive algorithm for online control of the false discovery rate. In Proceedings of the 35th International Conference on Machine Learning 4286–4294.
  • [31] Ramdas, A. K., Barber, R. F., Wainwright, M. J. and Jordan, M. I. (2019). Supplement to “A unified treatment of multiple testing with prior knowledge using the p-filter.” DOI:10.1214/18-AOS1765SUPP.
  • [32] Romano, J. P., Shaikh, A. and Wolf, M. (2011). Consonance and the closure method in multiple testing. Int. J. Biostat. 7 Art. 12, 27.
  • [33] Romano, J. P. and Shaikh, A. M. (2006). On stepdown control of the false discovery proportion. In Optimality. Institute of Mathematical Statistics Lecture Notes—Monograph Series 49 33–50. IMS, Beachwood, OH.
  • [34] Sarkar, S. K. (2008). On methods controlling the false discovery rate. Sankhyā 70 135–168.
  • [35] Sarkar, S. K. (2008). Two-stage stepup procedures controlling FDR. J. Statist. Plann. Inference 138 1072–1084.
  • [36] Sarkar, T. K. (1969). Some lower bounds of reliability. Technical report, Stanford Univ.
  • [37] Seeger, P. (1968). A note on a method for the analysis of significances en masse. Technometrics 10 586–593.
  • [38] Simes, R. J. (1986). An improved Bonferroni procedure for multiple tests of significance. Biometrika 73 751–754.
  • [39] Sonnemann, E. (1982). Allgemeine Lösungen multipler Testprobleme. Institut für Mathematische Statistik und Versicherungslehre, Univ. Bern.
  • [40] Sonnemann, E. (2008). General solutions to multiple testing problems. Biom. J. 50 641–656.
  • [41] Sonnemann, E. and Finner, H. (1988). Vollständigkeitssätze für multiple testprobleme. In Multiple Hypothesenprüfung/Multiple Hypotheses Testing 121–135. Springer, Berlin.
  • [42] Storey, J. D. (2002). A direct approach to false discovery rates. J. R. Stat. Soc. Ser. B. Stat. Methodol. 64 479–498.
  • [43] Storey, J. D., Taylor, J. E. and Siegmund, D. (2004). Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: A unified approach. J. R. Stat. Soc. Ser. B. Stat. Methodol. 66 187–205.
  • [44] Stouffer, S. A., Suchman, E. A., DeVinney, L. C., Star, S. A. and Williams, R. M. Jr. (1949). The American Soldier: Adjustment During Army Life, Vol. 1. Studies in Social Psychology in World War II.
  • [45] Tukey, J. (1953). The Problem of Multiple Comparisons: Introduction and Parts A, B, and C. Princeton Univ., Princeton, NJ.
  • [46] Tukey, J. W. (1994). The Collected Works of John W. Tukey. Vol. VIII: Multiple Comparisons: 1948–1983. CRC Press, New York.
  • [47] Vovk, V. and Wang, R. (2018). Combining p-values via averaging. Preprint. Available at arXiv:1212.4966v4.

Supplemental materials

  • Supplement to “A unified treatment of multiple testing with prior knowledge using the p-filter”. Contains details on dotfractions, generalized Simes tests for the global null and the LOOP property.