This paper revisits a meta-analysis method proposed by Pearson [Biometrika 26 (1934) 425–442] and first used by David [Biometrika 26 (1934) 1–11]. It was thought to be inadmissible for over fifty years, dating back to a paper of Birnbaum [J. Amer. Statist. Assoc. 49 (1954) 559–574]. It turns out that the method Birnbaum analyzed is not the one that Pearson proposed. We show that Pearson’s proposal is admissible. Because it is admissible, it has better power than the standard test of Fisher [Statistical Methods for Research Workers (1932) Oliver and Boyd] at some alternatives, and worse power at others. Pearson’s method has the advantage when all or most of the nonzero parameters share the same sign. Pearson’s test has proved useful in a genomic setting, screening for age-related genes. This paper also presents an FFT-based method for getting hard upper and lower bounds on the CDF of a sum of nonnegative random variables.
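The abstract does not spell out the FFT-based method, so the following is only a minimal sketch of one standard way to bracket the CDF of a sum of nonnegative random variables, not the paper's own algorithm. It assumes i.i.d. continuous variables with a known CDF. Each variable is rounded down and up onto a grid, and the n-fold convolutions are computed with an FFT. The function name `cdf_bounds_of_sum` and its parameters are hypothetical. The example uses the fact that Fisher's statistic, -2 Σ log p_i, is a sum of independent chi-square variables with 2 degrees of freedom under the null hypothesis.

```python
import math
import numpy as np

def cdf_bounds_of_sum(cdf, n, h, m):
    """Bracket the CDF of S = X_1 + ... + X_n for i.i.d. nonnegative X_i.

    cdf : vectorized CDF of one X_i (assumed continuous, with cdf(0) == 0)
    n   : number of summands
    h   : grid spacing; m : number of grid cells per variable
    Returns (grid, lower, upper) with lower <= P(S <= t) <= upper at each
    grid point t, up to floating-point roundoff in the FFT.
    """
    edges = cdf(h * np.arange(m + 1))      # F(0), F(h), ..., F(m*h)
    cell = np.diff(edges)                  # P(k*h < X <= (k+1)*h), k = 0..m-1

    # Rounding X down gives a stochastically smaller variable,
    # so its sum yields an upper bound on the CDF of S.
    down = np.zeros(m + 1)
    down[:m] = cell
    down[m] = 1.0 - edges[m]               # tail mass beyond m*h moved down to m*h

    # Rounding X up gives a stochastically larger variable,
    # so its sum yields a lower bound on the CDF of S.
    up = np.zeros(m + 1)
    up[1:] = cell                          # tail mass beyond m*h is dropped (sent to +inf)

    size = n * m + 1                       # the sum lives on grid indices 0..n*m
    nfft = 1 << (size - 1).bit_length()    # zero-pad so the convolution does not wrap

    def nfold(pmf):
        spec = np.fft.rfft(pmf, nfft) ** n
        out = np.fft.irfft(spec, nfft)[:size]
        return np.clip(out, 0.0, None)     # remove tiny negative roundoff

    grid = h * np.arange(size)
    upper = np.minimum(np.cumsum(nfold(down)), 1.0)
    lower = np.cumsum(nfold(up))
    return grid, lower, upper

# Illustration: sum of 5 chi-square(2) variables, i.e. Fisher's statistic for 5 p-values.
grid, lower, upper = cdf_bounds_of_sum(lambda x: 1.0 - np.exp(-x / 2.0), n=5, h=0.01, m=4000)
j = 1000                                   # grid point t = 10
# Exact chi-square(10) CDF at t = 10, from the Erlang formula, for comparison.
exact = 1.0 - math.exp(-5.0) * sum(5.0 ** k / math.factorial(k) for k in range(5))
print(grid[j], lower[j], exact, upper[j])
```

In this sketch the gap between the two bounds shrinks as the spacing `h` decreases, at the cost of a longer FFT. The bounds are exact only up to floating-point error in the transform, which would need separate handling for truly hard guarantees.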
References
Agresti, A. (2002). Categorical Data Analysis, 2nd ed. Wiley, New York.
Bahadur, R. R. (1967). Rates of convergence of estimates and test statistics. Ann. Math. Statist. 38 303–324. MR207085
Benjamini, Y. and Heller, R. (2007). Screening for partial conjunction hypotheses. Technical report, Dept. Statistics and OR, Tel Aviv Univ.
Birnbaum, A. (1954). Combining independent tests of significance. J. Amer. Statist. Assoc. 49 559–574. MR65101
Birnbaum, A. (1955). Characterizations of complete classes of tests of some multiparametric hypotheses, with applications to likelihood ratio tests. Ann. Math. Statist. 26 21–36. MR67438
Boyd, S. and Vandenberghe, L. (2004). Convex Optimization. Cambridge Univ. Press, Cambridge.
David, F. N. (1934). On the Pλn test for randomness: Remarks, further illustration, and table of Pλn for given values of −log10(λn). Biometrika 26 1–11.
Dudoit, S. and van der Laan, M. J. (2008). Multiple Testing Procedures with Applications to Genetics. Springer, New York.
Esary, J. D., Proschan, F. and Walkup, D. W. (1967). Association of random variables, with applications. Ann. Math. Statist. 38 1466–1474. MR217826
Fisher, R. A. (1932). Statistical Methods for Research Workers, 4th ed. Oliver and Boyd, Edinburgh.
Frigo, M. and Johnson, S. G. (2005). The design and implementation of FFTW3. Proceedings of the IEEE 93 216–231.
Greenberg, H. J. and Pierskalla, W. P. (1971). A review of quasi-convex functions. Oper. Res. 19 1553–1570.
Hedges, L. and Olkin, I. (1985). Statistical Methods for Meta-Analysis. Academic Press, Orlando, FL. MR798597
Marden, J. I. (1985). Combining independent one-sided noncentral t or normal mean tests. Ann. Statist. 13 1535–1553. MR811509
Matthes, T. K. and Truax, D. R. (1967). Tests of composite hypotheses for the multivariate exponential family. Ann. Math. Statist. 38 681–697. MR208745
Monahan, J. F. (2001). Numerical Methods of Statistics. Cambridge Univ. Press, Cambridge.
Oosterhoff, J. (1969). Combination of One-Sided Statistical Tests. Mathematical Center Tracts, Amsterdam. MR247707
Owen, A. B. (2009). Karl Pearson’s meta-analysis revisited: Supplementary report. Technical Report 2009–06, Dept. Statistics, Stanford Univ.
Owen, A. B. (2007). Pearson’s test in a large-scale multiple meta-analysis. Technical report, Dept. Statistics, Stanford Univ.
Pearson, E. S. (1938). The probability integral transformation for testing goodness of fit and combining independent tests of significance. Biometrika 30 134–148.
Pearson, K. (1933). On a method of determining whether a sample of size n supposed to have been drawn from a parent population having a known probability integral has probably been drawn at random. Biometrika 25 379–410.
Pearson, K. (1934). On a new method of determining “goodness of fit.” Biometrika 26 425–442.
Simes, R. (1986). An improved Bonferroni procedure for multiple tests of significance. Biometrika 73 751–754. MR897872
Stein, C. (1956). The admissibility of Hotelling’s T²-test. Ann. Math. Statist. 27 616–623. MR80413
Stouffer, S. A., Suchman, E. A., DeVinney, L. C., Star, S. A. and Williams, R. M., Jr. (1949). The American soldier. Adjustment during Army life 1. Princeton Univ. Press, Princeton, NJ.
Tippett, L. H. C. (1931). The Methods of Statistics. Williams and Norgate, London.
Whitlock, M. C. (2005). Combining probability from independent tests: The weighted Z-method is superior to Fisher’s approach. J. Evolutionary Biology 18 1368–1373.
Wilkinson, B. (1951). A statistical consideration in psychological research. Psychological Bulletin 48 156–158.
Zahn, J. M., Poosala, S., Owen, A. B., Ingram, D. K., Lustig, A., Carter, A., Weeratna, A. T., Taub, D. D., Gorospe, M., Mazan-Mamczarz, K., Lakatta, E. G., Boheler, K. R., Xu, X., Mattson, M. P., Falco, G., Ko, M. S. H., Schlessinger, D., Firman, J., Kummerfeld, S. K., Wood, W. H., III, Zonderman, A. B., Kim, S. K. and Becker, K. G. (2007). AGEMAP: A gene expression database for aging in mice. PLOS Genetics 3 2326–2337.