The Annals of Statistics

Karl Pearson’s meta-analysis revisited

Art B. Owen


This paper revisits a meta-analysis method proposed by Pearson [Biometrika 26 (1934) 425–442] and first used by David [Biometrika 26 (1934) 1–11]. It was thought to be inadmissible for over fifty years, dating back to a paper of Birnbaum [J. Amer. Statist. Assoc. 49 (1954) 559–574]. It turns out that the method Birnbaum analyzed is not the one that Pearson proposed. We show that Pearson’s proposal is admissible. Because it is admissible, it has better power than the standard test of Fisher [Statistical Methods for Research Workers (1932) Oliver and Boyd] at some alternatives, and worse power at others. Pearson’s method has the advantage when all or most of the nonzero parameters share the same sign. Pearson’s test has proved useful in a genomic setting, screening for age-related genes. This paper also presents an FFT-based method for getting hard upper and lower bounds on the CDF of a sum of nonnegative random variables.
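The idea behind FFT-based bounds on the CDF of a sum of nonnegative random variables can be illustrated with a short sketch. It is not the algorithm from the paper, only a generic discretization argument under stated assumptions: the summands are independent and identically distributed, each is rounded down and rounded up to a grid of spacing h, and the rounded copies are convolved with an FFT. Because the rounded-down sum never exceeds the true sum and the rounded-up sum is never below it, their CDFs bracket the true CDF, up to floating-point round-off in the FFT. The function name cdf_sum_bounds and its parameters (cdf, k, h, n) are hypothetical names chosen for this illustration.

```python
import numpy as np


def cdf_sum_bounds(cdf, k, h, n):
    """Bracket P(S <= x) for S = X_1 + ... + X_k at x = 0, h, ..., (n-1)*h.

    Assumes the X_i are independent, identically distributed and nonnegative,
    and that `cdf` is a vectorized function returning P(X <= x).
    Returns (lower, upper), two arrays of length n.
    """
    F = cdf(h * np.arange(n + 1))   # F[j] = P(X <= j*h)
    cell = np.diff(F)               # cell[j] = P(j*h < X <= (j+1)*h)

    # Rounded-down copy X_lo <= X: cell j is moved to j*h, the atom at 0 stays
    # at 0, and all mass beyond n*h is lumped at the last grid point.
    p_lo = cell.copy()
    p_lo[0] += F[0]
    p_lo[-1] += 1.0 - F[n]

    # Rounded-up copy X_hi >= X: cell j is moved to (j+1)*h. Mass that would
    # land beyond the grid is dropped, which treats it as +infinity.
    p_hi = np.concatenate(([F[0]], cell[:-1]))

    size = k * (n - 1) + 1          # long enough that the FFT does not wrap around

    def kfold(p):
        # k-fold convolution of p with itself via the FFT
        return np.fft.irfft(np.fft.rfft(p, n=size) ** k, n=size)

    upper = np.cumsum(kfold(p_lo))[:n]   # P(S_lo <= x) >= P(S <= x)
    lower = np.cumsum(kfold(p_hi))[:n]   # P(S_hi <= x) <= P(S <= x)
    return lower, upper


# Example: the sum of two independent Exp(1) variables.
lower, upper = cdf_sum_bounds(lambda x: -np.expm1(-x), k=2, h=0.01, n=2000)
```

In this example the true CDF is known in closed form, 1 − e^(−x)(1 + x), so the bracket can be checked directly. Shrinking h tightens the gap between the two bounds at the cost of a longer FFT.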

Article information

Ann. Statist. Volume 37, Number 6B (2009), 3867-3892.

First available in Project Euclid: 23 October 2009

Primary: 62F03: Hypothesis testing; 62C15: Admissibility; 65T99: None of the above, but in this section; 52A01: Axiomatic and generalized convexity

Admissibility; fast Fourier transform; hypothesis testing; microarrays


Owen, Art B. Karl Pearson’s meta-analysis revisited. Ann. Statist. 37 (2009), no. 6B, 3867–3892. doi:10.1214/09-AOS697.


  • Agresti, A. (2002). Categorical Data Analysis, 2nd ed. Wiley, New York.
  • Bahadur, R. R. (1967). Rates of convergence of estimates and test statistics. Ann. Math. Statist. 38 303–324.
  • Benjamini, Y. and Heller, R. (2007). Screening for partial conjunction hypotheses. Technical report, Dept. Statistics and OR, Tel Aviv Univ.
  • Birnbaum, A. (1954). Combining independent tests of significance. J. Amer. Statist. Assoc. 49 559–574.
  • Birnbaum, A. (1955). Characterizations of complete classes of tests of some multiparametric hypotheses, with applications to likelihood ratio tests. Ann. Math. Statist. 26 21–36.
  • Boyd, S. and Vandenberghe, L. (2004). Convex Optimization. Cambridge Univ. Press, Cambridge.
  • David, F. N. (1934). On the Pλn test for randomness: Remarks, further illustration, and table of Pλn for given values of −log10 λn. Biometrika 26 1–11.
  • Dudoit, S. and van der Laan, M. J. (2008). Multiple Testing Procedures with Applications to Genetics. Springer, New York.
  • Esary, J. D., Proschan, F. and Walkup, D. W. (1967). Association of random variables, with applications. Ann. Math. Statist. 38 1466–1474.
  • Fisher, R. A. (1932). Statistical Methods for Research Workers, 4th ed. Oliver and Boyd, Edinburgh.
  • Frigo, M. and Johnson, S. G. (2005). The design and implementation of FFTW3. Proceedings of the IEEE 93 216–231.
  • Greenberg, H. J. and Pierskalla, W. P. (1971). A review of quasi-convex functions. Oper. Res. 19 1553–1570.
  • Hedges, L. and Olkin, I. (1985). Statistical Methods for Meta-Analysis. Academic Press, Orlando, FL.
  • Marden, J. I. (1985). Combining independent one-sided noncentral t or normal mean tests. Ann. Statist. 13 1535–1553.
  • Matthes, T. K. and Truax, D. R. (1967). Tests of composite hypotheses for the multivariate exponential family. Ann. Math. Statist. 38 681–697.
  • Monahan, J. F. (2001). Numerical Methods of Statistics. Cambridge Univ. Press, Cambridge.
  • Oosterhoff, J. (1969). Combination of One-Sided Statistical Tests. Mathematical Center Tracts, Amsterdam.
  • Owen, A. B. (2009). Karl Pearson’s meta-analysis revisited: Supplementary report. Technical Report 2009–06, Dept. Statistics, Stanford Univ.
  • Owen, A. B. (2007). Pearson’s test in a large-scale multiple meta-analysis. Technical report, Dept. Statistics, Stanford Univ.
  • Pearson, E. S. (1938). The probability integral transformation for testing goodness of fit and combining independent tests of significance. Biometrika 30 134–148.
  • Pearson, K. (1933). On a method of determining whether a sample of size n supposed to have been drawn from a parent population having a known probability integral has probably been drawn at random. Biometrika 25 379–410.
  • Pearson, K. (1934). On a new method of determining “goodness of fit.” Biometrika 26 425–442.
  • Simes, R. (1986). An improved Bonferroni procedure for multiple tests of significance. Biometrika 73 751–754.
  • Stein, C. (1956). The admissibility of Hotelling’s T²-test. Ann. Math. Statist. 27 616–623.
  • Stouffer, S. A., Suchman, E. A., DeVinney, L. C., Star, S. A. and Williams, R. M., Jr. (1949). The American soldier. Adjustment during Army life 1. Princeton Univ. Press, Princeton, NJ.
  • Tippett, L. H. C. (1931). The Methods of Statistics. Williams and Norgate, London.
  • Whitlock, M. C. (2005). Combining probability from independent tests: The weighted Z-method is superior to Fisher’s approach. J. Evolutionary Biology 18 1368–1373.
  • Wilkinson, B. (1951). A statistical consideration in psychological research. Psychological Bulletin 48 156–158.
  • Zahn, J. M., Poosala, S., Owen, A. B., Ingram, D. K., Lustig, A., Carter, A., Weeratna, A. T., Taub, D. D., Gorospe, M., Mazan-Mamczarz, K., Lakatta, E. G., Boheler, K. R., Xu, X., Mattson, M. P., Falco, G., Ko, M. S. H., Schlessinger, D., Firman, J., Kummerfeld, S. K., Wood, W. H., III, Zonderman, A. B., Kim, S. K. and Becker, K. G. (2007). AGEMAP: A gene expression database for aging in mice. PLOS Genetics 3 2326–2337.