The Annals of Applied Statistics

FDR control with adaptive procedures and FDR monotonicity

Amit Zeisel, Or Zuk, and Eytan Domany

Full-text: Open access

Abstract

The steep rise in the availability and use of high-throughput technologies in biology has brought with it a clear need for methods that control the False Discovery Rate (FDR) in multiple testing. In 1995, Benjamini and Hochberg (BH) introduced a simple procedure and proved that it provides a bound on the expected value, FDR ≤ q. Since then, many authors have tried to improve the BH bound. One approach is to design adaptive procedures, which estimate the number of true null hypotheses in order to obtain a better FDR bound. Our two main rigorous results are the following: (i) a theorem that provides a bound on the FDR for adaptive procedures that use any estimator for the number of true null hypotheses (m0), and (ii) a theorem that proves a monotonicity property of general BH-like procedures; both hold for the case where the hypotheses are independent. We also propose two improved procedures for which we prove FDR control in the independent case, and we demonstrate their advantages over several available bounds on simulated data and on a large number of gene expression data sets. Both procedures are simple and involve a similar amount of computation to the original BH procedure. We compare the performance of our proposed procedures with BH and other procedures and find that in most cases we obtain more power at the same level of statistical significance.
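To make the distinction between the BH procedure and an adaptive procedure concrete, here is a minimal Python sketch. It is not the procedures proposed in the paper. The function names `bh_reject` and `storey_m0` are hypothetical, and the m0 estimator is a stand-in of the Storey type, (#{p > λ} + 1) / (1 − λ), capped at m, with λ = 0.5 chosen arbitrarily. No FDR guarantee is claimed for this particular combination.

```python
from typing import List, Optional, Sequence


def bh_reject(pvalues: Sequence[float], q: float = 0.05,
              m0: Optional[float] = None) -> List[bool]:
    """Step-up rule: reject the k smallest p-values, where k is the
    largest rank i with p_(i) <= q * i / m_eff.

    With m0=None, m_eff = m and this is the plain BH procedure.
    With an estimate m0, m_eff = m0 (the adaptive variant).
    """
    m = len(pvalues)
    if m == 0:
        return []
    m_eff = float(m) if m0 is None else float(m0)
    # Indices of the p-values in increasing order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # number of rejections found so far
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= q * rank / m_eff:
            k = rank  # keep the largest rank that passes
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject


def storey_m0(pvalues: Sequence[float], lam: float = 0.5) -> float:
    """Stand-in estimator of the number of true nulls (an assumption,
    not the paper's estimator): (#{p > lam} + 1) / (1 - lam), capped at m."""
    above = sum(1 for p in pvalues if p > lam)
    return min(float(len(pvalues)), (above + 1) / (1.0 - lam))


pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
plain = bh_reject(pvals, q=0.05)
adaptive = bh_reject(pvals, q=0.05, m0=storey_m0(pvals))
print(sum(plain), sum(adaptive))
```

Because the estimate of m0 never exceeds m in this sketch, every threshold q·i/m0 is at least as large as the BH threshold q·i/m, so the adaptive variant rejects every hypothesis that plain BH rejects and possibly more. That is the source of the extra power; whether the FDR remains controlled depends on the estimator, which is the question the paper's first theorem addresses.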

Article information

Source
Ann. Appl. Stat. Volume 5, Number 2A (2011), 943-968.

Dates
First available in Project Euclid: 13 July 2011

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1310562212

Digital Object Identifier
doi:10.1214/10-AOAS399

Mathematical Reviews number (MathSciNet)
MR2840182

Zentralblatt MATH identifier
1232.62106

Keywords
False Discovery Rate; improved BH; monotonicity; gene expression analysis

Citation

Zeisel, Amit; Zuk, Or; Domany, Eytan. FDR control with adaptive procedures and FDR monotonicity. Ann. Appl. Stat. 5 (2011), no. 2A, 943–968. doi:10.1214/10-AOAS399. https://projecteuclid.org/euclid.aoas/1310562212



References

  • Andersson, A., Ritz, C., Lindgren, D., Edén, P., Lassen, C., Heldrup, J., Olofsson, T., Råde, J., Fontes, M., Porwit-Macdonald, A., Behrendtz, M., Höglund, M., Johansson, B. and Fioretos, T. (2007). Microarray-based classification of a consecutive series of 121 childhood acute leukemias: Prediction of leukemic and genetic subtype as well as of minimal residual disease status. Leukemia 21 1198–1203.
  • Aven, T. and Jensen, U. (1999). Stochastic Models in Reliability. Springer, New York.
  • Basso, K., Margolin, A. A., Stolovitzky, G., Klein, U., Dalla-Favera, R. and Califano, A. (2005). Reverse engineering of regulatory networks in human B cells. Nat. Genet. 37 382–390.
  • Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57 289–300.
  • Benjamini, Y., Krieger, A. M. and Yekutieli, D. (2006). Adaptive linear step-up procedures that control the false discovery rate. Biometrika 93 491–507.
  • Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Ann. Statist. 29 1165–1188.
  • Bittner, M. (2005). A window on the dynamics of biological switches. Nat. Biotechnol. 23 183–184.
  • Bullinger, L., Döhner, K., Bair, E., Fröhling, S., Schlenk, R. F., Tibshirani, R., Döhner, H. and Pollack, J. R. (2004). Use of gene-expression profiling to identify prognostic subclasses in adult acute myeloid leukemia. N. Engl. J. Med. 350 1605–1616.
  • Choi, Y. L., Tsukasaki, K., O’Neill, M. C., Yamada, Y., Onimaru, Y., Matsumoto, K., Ohashi, J., Yamashita, Y., Tsutsumi, S., Kaneda, R., Takada, S., Aburatani, H., Kamihira, S., Nakamura, T., Tomonaga, M. and Mano, H. (2007). A genomic analysis of adult T-cell leukemia. Oncogene 26 1245–1255.
  • Chowdary, D., Lathrop, J., Skelton, J., Curtin, K., Briggs, T., Zhang, Y., Yu, J., Wang, Y. and Mazumder, A. (2006). Prognostic gene expression signatures can be measured in tissues collected in RNAlater preservative. J. Mol. Diagn. 8 31–39.
  • Gavrilov, Y., Benjamini, Y. and Sarkar, S. K. (2009). An adaptive step-down procedure with proven FDR control under independence. Ann. Statist. 37 619–629.
  • Genovese, C. and Wasserman, L. (2002). Operating characteristics and extensions of the false discovery rate procedure. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 499–517.
  • Graudens, E., Boulanger, V., Mollard, C., Mariage-Samson, R., Barlet, X., Grémy, G., Couillault, C., Lajémi, M., Piatier-Tonneau, D., Zaborski, P., Eveno, E., Auffray, C. and Imbeaud, S. (2006). Deciphering cellular states of innate tumor drug responses. Genome Biol. 7 R19.
  • Koinuma, K., Yamashita, Y., Liu, W., Hatanaka, H., Kurashina, K., Wada, T., Takada, S., Kaneda, R., Choi, Y. L., Fujiwara, S.-I., Miyakura, Y., Nagai, H. and Mano, H. (2006). Epigenetic silencing of AXIN2 in colorectal carcinoma with microsatellite instability. Oncogene 25 139–146.
  • Laiho, P., Kokko, A., Vanharanta, S., Salovaara, R., Sammalkorpi, H., Järvinen, H., Mecklin, J.-P., Karttunen, T. J., Tuppurainen, K., Davalos, V., Schwartz, S., Arango, D., Mäkinen, M. J. and Aaltonen, L. A. (2007). Serrated carcinomas form a subclass of colorectal cancer with distinct molecular basis. Oncogene 26 312–320.
  • Miller, L. D., Smeds, J., George, J., Vega, V. B., Vergara, L., Ploner, A., Pawitan, Y., Hall, P., Klaar, S., Liu, E. T. and Bergh, J. (2005). An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival. Proc. Natl. Acad. Sci. USA 102 13550–13555.
  • Owen, A. B. (2005). Variance of the number of false discoveries. J. Roy. Statist. Soc. Ser. B 67 411–426.
  • Pawitan, Y., Bjöhle, J., Amler, L., Borg, A.-L., Egyhazi, S., Hall, P., Han, X., Holmberg, L., Huang, F., Klaar, S., Liu, E. T., Miller, L., Nordgren, H., Ploner, A., Sandelin, K., Shaw, P. M., Smeds, J., Skoog, L., Wedrén, S. and Bergh, J. (2005). Gene expression profiling spares early breast cancer patients from adjuvant therapy: Derived and validated in two population-based cohorts. Breast Cancer Res. 7 R953–R964.
  • Pounds, S. and Cheng, C. (2006). Robust estimation of the false discovery rate. Bioinformatics 22 1979–1987.
  • Reiner, A. (2007). FDR control by the BH procedure for two-sided correlated tests with implications to gene expression data analysis. Biom. J. 49 107–126.
  • Rhodes, D. R., Kalyana-Sundaram, S., Mahavisno, V., Varambally, R., Yu, J., Briggs, B. B., Barrette, T. R., Anstet, M. J., Kincead-Beal, C., Kulkarni, P., Varambally, S., Ghosh, D. and Chinnaiyan, A. M. (2007). Oncomine 3.0: Genes, pathways, and networks in a collection of 18,000 cancer gene expression profiles. Neoplasia 9 166–180.
  • Ross, M. E., Zhou, X., Song, G., Shurtleff, S. A., Girtman, K., Williams, W. K., Liu, H.-C., Mahfouz, R., Raimondi, S. C., Lenny, N., Patel, A. and Downing, J. R. (2003). Classification of pediatric acute lymphoblastic leukemia by gene expression profiling. Blood 102 2951–2959.
  • Storey, J. D. (2002). A direct approach to false discovery rate. J. Roy. Statist. Soc. Ser. B 64 479–498.
  • Storey, J. D., Taylor, J. E. and Siegmund, D. (2004). Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: A unified approach. J. Roy. Statist. Soc. Ser. B 66 187–205.
  • Valk, P. J. M., Verhaak, R. G. W., Beijen, M. A., Erpelinck, C. A. J., Barjesteh van Waalwijk van Doorn-Khosrovani, S., Boer, J. M., Beverloo, H. B., Moorhouse, M. J., van der Spek, P. J., Löwenberg, B. and Delwel, R. (2004). Prognostically useful gene-expression profiles in acute myeloid leukemia. N. Engl. J. Med. 350 1617–1628.
  • van de Vijver, M. J., He, Y. D., van’t Veer, L. J., Dai, H., Hart, A. A. M., Voskuil, D. W., Schreiber, G. J., Peterse, J. L., Roberts, C., Marton, M. J., Parrish, M., Atsma, D., Witteveen, A., Glas, A., Delahaye, L., van der Velde, T., Bartelink, H., Rodenhuis, S., Rutgers, E. T., Friend, S. H. and Bernards, R. (2002). A gene-expression signature as a predictor of survival in breast cancer. N. Engl. J. Med. 347 1999–2009.
  • Wang, Y., Klijn, J. G. M., Zhang, Y., Sieuwerts, A. M., Look, M. P., Yang, F., Talantov, D., Timmermans, M., Meijer-van Gelder, M. E., Yu, J., Jatkoe, T., Berns, E. M. J. J., Atkins, D. and Foekens, J. A. (2005). Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 365 671–679.
  • Watanabe, T., Kobunai, T., Toda, E., Yamamoto, Y., Kanazawa, T., Kazama, Y., Tanaka, J., Tanaka, T., Konishi, T., Okayama, Y., Sugimoto, Y., Oka, T., Sasaki, S., Muto, T. and Nagawa, H. (2006). Distal colorectal cancers with microsatellite instability (MSI) display distinct gene expression profiles that are different from proximal MSI cancers. Cancer Res. 66 9804–9808.
  • Yekutieli, D. and Benjamini, Y. (1999). Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics. J. Statist. Plann. Inference 82 171–196.
  • Yeoh, E.-J., Ross, M. E., Shurtleff, S. A., Williams, W. K., Patel, D., Mahfouz, R., Behm, F. G., Raimondi, S. C., Relling, M. V., Patel, A., Cheng, C., Campana, D., Wilkins, D., Zhou, X., Li, J., Liu, H., Pui, C.-H., Evans, W. E., Naeve, C., Wong, L. and Downing, J. R. (2002). Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell 1 133–143.
  • Zeisel, A., Zuk, O. and Domany, E. (2010). Supplement to “FDR control with adaptive procedures and FDR monotonicity.” DOI: 10.1214/10-AOAS399SUPP.
  • Zhao, H., Langerød, A., Ji, Y., Nowels, K. W., Nesland, J. M., Tibshirani, R., Bukholm, I. K., Kåresen, R., Botstein, D., Børresen-Dale, A.-L. and Jeffrey, S. S. (2004). Different gene expression patterns in invasive lobular and ductal carcinomas of the breast. Mol. Biol. Cell 15 2523–2536.
  • Zou, T.-T., Selaru, F. M., Xu, Y., Shustova, V., Yin, J., Mori, Y., Shibata, D., Sato, F., Wang, S., Olaru, A., Deacu, E., Liu, T. C., Abraham, J. M. and Meltzer, S. J. (2002). Application of cDNA microarrays to generate a molecular taxonomy capable of distinguishing between colon cancer and normal colon. Oncogene 21 4855–4862.

Supplemental materials

  • Supplementary material for: FDR control with adaptive procedures and FDR monotonicity. This supplementary file provides proofs of the claims and theorems presented in the paper, together with technical details of the proposed estimator and of the simulations performed. The document includes the following sections: Supplement A: Proof of Theorem 2.3. Supplement B: Designing the IBHsum estimator. Supplement C: Proof of Claim 3.1. Supplement D: Proof of the monotonicity theorem. Supplement E: Details of the simulations.