Electronic Journal of Statistics

Multiple hypothesis testing on composite nulls using constrained p-values

Zhiyi Chi
Source: Electron. J. Statist. Volume 4 (2010), 271-299.

Abstract

Multiple hypothesis testing often encounters composite nulls and intractable alternative distributions. In such cases, p-values defined as maximum significance levels over all null distributions (“pmax”) often lead to very conservative testing. We propose constructing p-values via maximization under linear constraints imposed by the data’s empirical distribution, and show that these p-values allow the false discovery rate (FDR) to be controlled with substantially more power than pmax.
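
The conservativeness of pmax can be illustrated with a toy case that is not taken from the paper: a one-sided composite null H0: mu <= 0 for a single observation X ~ N(mu, 1). This setup, and the function names `p_max` and `rejection_rate`, are assumptions made for illustration only. The sketch does not implement the constrained p-values proposed in the paper. It only shows that when the true null mean lies strictly inside the null set, a test based on pmax rejects far less often than its nominal level.

```python
import math
import random


def p_max(x):
    """pmax for H0: mu <= 0 with X ~ N(mu, 1).

    The supremum of P_mu(X >= x) over mu <= 0 is attained at mu = 0,
    so pmax is the standard normal upper tail probability at x.
    """
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def rejection_rate(mu, level=0.05, n=20000, seed=0):
    """Monte Carlo estimate of P_mu(pmax <= level) for a given true mean mu."""
    rng = random.Random(seed)
    hits = sum(p_max(rng.gauss(mu, 1.0)) <= level for _ in range(n))
    return hits / n


# On the boundary of the null (mu = 0) the test is close to its nominal level.
rate_boundary = rejection_rate(0.0)

# Inside the null (mu = -1, a hypothetical choice) it rejects much less often,
# which is the sense in which pmax is conservative.
rate_interior = rejection_rate(-1.0)

print(rate_boundary, rate_interior)
```

In this toy setting the gap between the two rates is the slack that a less conservative p-value could, in principle, recover as power.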

Primary Subjects: 62G10, 62H15
Secondary Subjects: 62G20
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.ejs/1266847728
Digital Object Identifier: doi:10.1214/08-EJS318
Mathematical Reviews number (MathSciNet): MR2645485


2012 © Institute of Mathematical Statistics
