Source: Ann. Statist. Volume 33, Number 1 (2005), 126-144.
A resurgence of interest in multiple hypothesis testing has occurred in the last decade. Motivated by studies in genomics, microarrays, DNA sequencing, drug screening, clinical trials, bioassays, education and psychology, statisticians have been devoting considerable research energy in an effort to properly analyze multiple endpoint data. In response to new applications, new criteria and new methodology, many ad hoc procedures have emerged. The classical requirement has been to use procedures which control the strong familywise error rate (FWE) at some predetermined level α. That is, the probability of any false rejection of a true null hypothesis should be less than or equal to α. Finding desirable and powerful multiple test procedures is difficult under this requirement.
One of the more recent ideas is concerned with controlling the false discovery rate (FDR), that is, the expected proportion of rejected hypotheses which are, in fact, true. Many multiple test procedures do control the FDR.
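For reference, the two error criteria can be written in a common notation. The notation is an assumption here and is not taken from the abstract: $V$ denotes the number of true null hypotheses that are rejected and $R$ the total number of rejections.

$$
\mathrm{FWE} = P(V \ge 1), \qquad \mathrm{FDR} = E\!\left[\frac{V}{\max(R,1)}\right].
$$

Since $V/\max(R,1) \le \mathbf{1}\{V \ge 1\}$, it follows that $\mathrm{FDR} \le \mathrm{FWE}$, so any procedure that controls the FWE at level α also controls the FDR at level α.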
A much earlier approach to multiple testing was formulated by Lehmann [Ann. Math. Statist. 23 (1952) 541–552 and 28 (1957) 1–25]. Lehmann’s approach is decision theoretic and he treats the multiple endpoints problem as a $2^k$ finite action problem when there are k endpoints. This approach is appealing since unlike the FWE and FDR criteria, the finite action approach pays attention to false acceptances as well as false rejections. In this paper we view the multiple endpoints problem as a $2^k$ finite action problem. We study the popular procedures single-step, step-down and step-up from the point of view of admissibility, Bayes and limit of Bayes properties. For our model, which is a prototypical one, and our loss function, we are able to demonstrate the following results under some fairly general conditions to be specified:
(i) The single-step procedure is admissible.
(ii) A sequence of prior distributions is given for which the step-down procedure is a limit of a sequence of Bayes procedures.
(iii) For a vector risk function, where each component is the risk for an individual testing problem, various admissibility and inadmissibility results are obtained.
In a companion paper [Cohen and Sackrowitz, Ann. Statist. 33 (2005) 145–158], we are able to give a characterization of Bayes procedures and their limits. The characterization yields a complete class and the additional useful result that the step-up procedure is inadmissible. The inadmissibility of step-up is demonstrated there for a more stringent loss function. Additional decision theoretic type results are also obtained in this paper.
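To make the three procedures named above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's formulation: the abstract does not give the model's test statistics or critical constants, so the sketch works with p-values and uses Bonferroni-type constants α/k, α/(k−1), …, α (a Holm-type step-down and a Hochberg-type step-up). The function names are hypothetical. Each function returns a vector of k reject/accept decisions, that is, one of the $2^k$ possible actions.

```python
from typing import List


def single_step(p: List[float], alpha: float) -> List[bool]:
    """Bonferroni-type single-step rule (assumed constants): reject H_i iff p_i <= alpha/k."""
    k = len(p)
    return [pi <= alpha / k for pi in p]


def step_down(p: List[float], alpha: float) -> List[bool]:
    """Holm-type step-down rule (assumed constants).

    Start with the smallest p-value and keep rejecting while
    p_(j) <= alpha / (k - j), j = 0, 1, ...; stop at the first failure.
    """
    k = len(p)
    order = sorted(range(k), key=lambda i: p[i])  # indices, smallest p first
    reject = [False] * k
    for j, i in enumerate(order):
        if p[i] <= alpha / (k - j):
            reject[i] = True
        else:
            break  # this hypothesis and all with larger p-values are accepted
    return reject


def step_up(p: List[float], alpha: float) -> List[bool]:
    """Hochberg-type step-up rule (assumed constants).

    Start with the largest p-value; at the first j with
    p_(j) <= alpha / (k - j), reject that hypothesis and all with smaller p-values.
    """
    k = len(p)
    order = sorted(range(k), key=lambda i: p[i])  # indices, smallest p first
    reject = [False] * k
    for j in range(k - 1, -1, -1):
        if p[order[j]] <= alpha / (k - j):
            for i in order[: j + 1]:
                reject[i] = True
            break
    return reject
```

With the same constants, the three rules can choose different actions. For p-values (0.02, 0.03, 0.04) and α = 0.05, the single-step and step-down sketches reject nothing, while the step-up sketch rejects all three hypotheses.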
References
Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57 289--300.
Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Ann. Statist. 29 1165--1188.
Cohen, A. and Sackrowitz, H. B. (1984). Decision theory results for vector risks with applications. Statist. Decisions Suppl. 1 159--176. MR785207
Cohen, A. and Sackrowitz, H. B. (2004). Monotonicity properties of multiple endpoint testing procedures. J. Statist. Plann. Inference 125 17--30.
Cohen, A. and Sackrowitz, H. B. (2005). Characterization of Bayes procedures for multiple endpoint problems and inadmissibility of the step-up procedure. Ann. Statist. 33 145--158.
Dudoit, S., Shaffer, J. P. and Boldrick, J. C. (2003). Multiple hypothesis testing in microarray experiments. Statist. Sci. 18 71--103.
Efron, B., Tibshirani, R., Storey, J. D. and Tusher, V. (2001). Empirical Bayes analysis and a microarray experiment. Technical Report 216, Dept. Statistics, Stanford Univ.
Finner, H. and Roters, M. (2002). Multiple hypotheses testing and expected number of type I errors. Ann. Statist. 30 220--238.
Finner, H. and Strassburger, K. (2002). The partitioning principle: A powerful tool in multiple decision theory. Ann. Statist. 30 1194--1213.
Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75 800--802. MR995126
Hochberg, Y. and Tamhane, A. C. (1987). Multiple Comparison Procedures. Wiley, New York. MR914493
Krishnaiah, P. R. and Pathak, P. K. (1967). Tests for the equality of covariance matrices under the intraclass correlation model. Ann. Math. Statist. 38 1286--1288. MR214226
Lehmann, E. L. (1952). Testing multiparameter hypotheses. Ann. Math. Statist. 23 541--552. MR52737
Lehmann, E. L. (1957). A theory of some multiple decision problems. I. Ann. Math. Statist. 28 1--25. MR84952
Lehmann, E. L. (1986). Testing Statistical Hypotheses, 2nd ed. Wiley, New York. MR852406
Marcus, R., Peritz, E. and Gabriel, K. R. (1976). On closed testing procedures with special reference to ordered analysis of variance. Biometrika 63 655--660. MR468056
Marden, J. I. (1982). Minimal complete classes of tests of hypotheses with multivariate one-sided alternatives. Ann. Statist. 10 962--970. MR663447
Matthes, T. K. and Truax, D. R. (1967). Tests of composite hypotheses for the multivariate exponential family. Ann. Math. Statist. 38 681--697. MR208745
Mood, A. M., Graybill, F. A. and Boes, D. C. (1974). Introduction to the Theory of Statistics, 3rd ed. McGraw-Hill, New York.
Robertson, T., Wright, F. T. and Dykstra, R. L. (1988). Order Restricted Statistical Inference. Wiley, New York. MR961262
Sarkar, S. (2002). Some results on false discovery rate in stepwise multiple testing procedures. Ann. Statist. 30 239--257.
Shaffer, J. P. (1995). Multiple hypothesis testing. Annual Review of Psychology 46 561--584.
Stefánsson, G., Kim, W. and Hsu, J. C. (1988). On confidence sets in multiple comparisons. In Statistical Decision Theory and Related Topics IV (S. S. Gupta and J. O. Berger, eds.) 2 89--104. Springer, New York. MR927125
Stein, C. M. (1956). The admissibility of Hotelling's $T^2$-test. Ann. Math. Statist. 27 616--623. MR80413