### The positive false discovery rate: a Bayesian interpretation and the q-value

John D. Storey
Source: Ann. Statist. Volume 31, Number 6 (2003), 2013-2035.

#### Abstract

Multiple hypothesis testing is concerned with controlling the rate of false positives when testing several hypotheses simultaneously. One multiple hypothesis testing error measure is the false discovery rate (FDR), which is loosely defined to be the expected proportion of false positives among all significant hypotheses. The FDR is especially appropriate for exploratory analyses in which one is interested in finding several significant results among many tests. In this work, we introduce a modified version of the FDR called the "positive false discovery rate" (pFDR). We discuss the advantages and disadvantages of the pFDR and investigate its statistical properties. When assuming the test statistics follow a mixture distribution, we show that the pFDR can be written as a Bayesian posterior probability and can be connected to classification theory. These properties remain asymptotically true under fairly general conditions, even under certain forms of dependence. Also, a new quantity called the "$q$-value" is introduced and investigated, which is a natural "Bayesian posterior p-value," or rather the pFDR analogue of the p-value.
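The Bayesian reading of the pFDR described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a two-group mixture in which each hypothesis is null with prior probability `pi0`, the test statistic is N(0, 1) under the null and N(`mu`, 1) under the alternative, and a one-sided p-value is rejected when it falls at or below `alpha`. The function names and parameter values are hypothetical choices made for illustration. Under these assumptions, the proportion of false positives among rejections in a large simulation should sit close to the posterior probability Pr(H = 0 | p-value ≤ alpha).

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for p-values and quantiles


def posterior_null_given_rejection(pi0, alpha, mu):
    """Pr(H = 0 | p-value <= alpha) under the assumed two-group mixture.

    Bayes' rule: pi0 * alpha / (pi0 * alpha + (1 - pi0) * power),
    where power is the rejection probability under the alternative.
    """
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)   # one-sided critical value
    power = 1.0 - STD_NORMAL.cdf(z_alpha - mu)  # Pr(reject | alternative)
    return pi0 * alpha / (pi0 * alpha + (1.0 - pi0) * power)


def simulated_false_positive_proportion(pi0, alpha, mu, m, seed=0):
    """Fraction of rejected hypotheses that are true nulls in one simulation."""
    rng = random.Random(seed)
    false_positives = 0
    rejections = 0
    for _ in range(m):
        is_null = rng.random() < pi0                  # draw the hypothesis state
        z = rng.gauss(0.0 if is_null else mu, 1.0)    # draw the test statistic
        p_value = 1.0 - STD_NORMAL.cdf(z)             # one-sided p-value
        if p_value <= alpha:
            rejections += 1
            false_positives += is_null
    return false_positives / rejections if rejections else float("nan")


if __name__ == "__main__":
    theory = posterior_null_given_rejection(pi0=0.8, alpha=0.05, mu=2.0)
    simulated = simulated_false_positive_proportion(
        pi0=0.8, alpha=0.05, mu=2.0, m=200_000
    )
    print(f"posterior Pr(H=0 | reject): {theory:.4f}")
    print(f"simulated V/R:              {simulated:.4f}")
```

With many tests, at least one rejection is essentially guaranteed, so the single-run ratio V/R serves here as a rough stand-in for the conditional expectation that defines the pFDR. Setting `mu=0.0` makes the alternative indistinguishable from the null, in which case the posterior reduces to `pi0`.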

Primary Subjects: 62F03
Full-text: Open access

Permanent link to this document: http://projecteuclid.org/euclid.aos/1074290335
Digital Object Identifier: doi:10.1214/aos/1074290335
Mathematical Reviews number (MathSciNet): MR2036398
Zentralblatt MATH identifier: 02067675

### References

Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57 289--300.
Mathematical Reviews (MathSciNet): MR1325392
Benjamini, Y. and Hochberg, Y. (2000). On the adaptive control of the false discovery rate in multiple testing with independent statistics. J. Educational and Behavioral Statistics 25 60--83.
Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Ann. Statist. 29 1165--1188.
Mathematical Reviews (MathSciNet): MR1869245
Digital Object Identifier: doi:10.1214/aos/1013699998
Project Euclid: euclid.aos/1013699998
Zentralblatt MATH: 1041.62061
Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.
Mathematical Reviews (MathSciNet): MR233396
Zentralblatt MATH: 0172.21201
Brown, P. O. and Botstein, D. (1999). Exploring the new world of the genome with DNA microarrays. Nature Genetics 21 33--37.
Cherkassky, V. S. and Mulier, F. M. (1998). Learning from Data: Concepts, Theory and Methods. Wiley, New York.
Zentralblatt MATH: 0960.62002
Mathematical Reviews (MathSciNet): MR2334401
Efron, B., Tibshirani, R., Storey, J. D. and Tusher, V. (2001). Empirical Bayes analysis of a microarray experiment. J. Amer. Statist. Assoc. 96 1151--1160.
Mathematical Reviews (MathSciNet): MR1946571
Digital Object Identifier: doi:10.1198/016214501753382129
Zentralblatt MATH: 1073.62511
Genovese, C. and Wasserman, L. (2002a). Operating characteristics and extensions of the false discovery rate procedure. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 499--517.
Mathematical Reviews (MathSciNet): MR1924303
Digital Object Identifier: doi:10.1111/1467-9868.00347
Zentralblatt MATH: 1090.62072
Genovese, C. and Wasserman, L. (2002b). False discovery rates. Technical report, Dept. Statistics, Carnegie Mellon Univ.
Lehmann, E. L. (1986). Testing Statistical Hypotheses, 2nd ed. Wiley, New York.
Mathematical Reviews (MathSciNet): MR852406
Zentralblatt MATH: 0608.62020
Morton, N. E. (1955). Sequential tests for the detection of linkage. Amer. J. Human Genetics 7 277--318.
Sarkar, S. K. (2002). Some results on false discovery rate in stepwise multiple testing procedures. Ann. Statist. 30 239--257.
Mathematical Reviews (MathSciNet): MR1892663
Digital Object Identifier: doi:10.1214/aos/1015362192
Project Euclid: euclid.aos/1015362192
Zentralblatt MATH: 1101.62349
Shaffer, J. (1995). Multiple hypothesis testing: A review. Annual Review of Psychology 46 561--584.
Simes, R. J. (1986). An improved Bonferroni procedure for multiple tests of significance. Biometrika 73 751--754.
Mathematical Reviews (MathSciNet): MR897872
Zentralblatt MATH: 0613.62067
Storey, J. D. (2001). The positive false discovery rate: A Bayesian interpretation and the $q$-value. Technical Report 2001-12, Dept. Statistics, Stanford Univ.
Storey, J. D. (2002a). A direct approach to false discovery rates. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 479--498.
Mathematical Reviews (MathSciNet): MR1924302
Digital Object Identifier: doi:10.1111/1467-9868.00346
Zentralblatt MATH: 1090.62073
Storey, J. D. (2002b). False discovery rates: Theory and applications to DNA microarrays. Ph.D. dissertation, Dept. Statistics, Stanford Univ.
Storey, J. D., Taylor, J. E. and Siegmund, D. (2004). Strong control, conservative point estimation, and simultaneous conservative consistency of false discovery rates: A unified approach. J. R. Stat. Soc. Ser. B Stat. Methodol. 66 187--205.
Mathematical Reviews (MathSciNet): MR2035766
Digital Object Identifier: doi:10.1111/j.1467-9868.2004.00439.x
Zentralblatt MATH: 1061.62110
Weller, J. I., Song, J. Z., Heyen, D. W., Lewin, H. A. and Ron, M. (1998). A new approach to the problem of multiple comparisons in the genetic dissection of complex traits. Genetics 150 1699--1706.
Zaykin, D. V., Young, S. S. and Westfall, P. H. (2000). Using the false discovery rate approach in the genetic dissection of complex traits: A response to Weller et al. Genetics 154 1917--1918.