Institute of Mathematical Statistics Collections

The average likelihood ratio for large-scale multiple testing and detecting sparse mixtures

Guenther Walther


Large-scale multiple testing problems require the simultaneous assessment of many p-values. This paper compares several methods to assess the evidence in multiple binomial counts of p-values: the maximum of the binomial counts after standardization (the “higher criticism statistic”), the maximum of the binomial counts after a log-likelihood ratio transformation (the “Berk–Jones statistic”), and a newly introduced average of the binomial counts after a likelihood ratio transformation. Simulations show that the higher criticism statistic outperforms the Berk–Jones statistic in the case of very sparse alternatives (sparsity coefficient $\beta \gtrapprox 0.75$), while the situation is reversed for $\beta \lessapprox 0.75$. The average likelihood ratio is found to combine the favorable performance of higher criticism in the very sparse case with that of the Berk–Jones statistic in the less sparse case and thus appears to dominate both statistics. Some asymptotic optimality theory is considered but found to set in too slowly to illuminate the above findings, at least for sample sizes up to one million. In contrast, asymptotic approximations to the critical values of the Berk–Jones statistic that have been developed by Wellner and Koltchinskii [In High Dimensional Probability III (2003) 321–332 Birkhäuser] and Jager and Wellner [Ann. Statist. 35 (2007) 2018–2053] are found to give surprisingly accurate approximations even for quite small sample sizes.
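To make the three statistics concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes the common order-statistic forms of higher criticism and of the one-sided Berk–Jones statistic, evaluated at the smallest half of the sorted p-values. The abstract does not specify the grid or weights of the average likelihood ratio, so the unweighted average used here is an assumption, as are the function name `sparse_mixture_statistics` and the clipping constants.

```python
import numpy as np


def sparse_mixture_statistics(pvalues):
    """Illustrative higher criticism, Berk-Jones and average likelihood ratio.

    Assumptions (not taken from the paper): statistics are evaluated at the
    smallest n // 2 order statistics, and the average likelihood ratio uses
    equal weights.
    """
    # Clip to keep logarithms and standard deviations finite.
    p = np.sort(np.clip(np.asarray(pvalues, dtype=float), 1e-300, 1 - 1e-12))
    n = p.size
    if n < 2:
        raise ValueError("need at least two p-values")

    m = n // 2
    t = p[:m]                      # thresholds: the i-th smallest p-value
    x = np.arange(1, m + 1) / n    # observed fraction of p-values <= t

    # Higher criticism: maximum of the standardized binomial counts.
    hc = np.max(np.sqrt(n) * (x - t) / np.sqrt(t * (1 - t)))

    # Binomial log-likelihood ratio n * K(x, t), one-sided (excess of small p).
    kl = x * np.log(x / t) + (1 - x) * np.log((1 - x) / (1 - t))
    llr = n * np.where(t < x, kl, 0.0)

    # Berk-Jones: maximum of the log-likelihood ratios.
    bj = llr.max()

    # Average likelihood ratio, reported on the log scale for stability:
    # log(mean(exp(llr))) computed via the log-sum-exp trick.
    log_alr = bj + np.log(np.mean(np.exp(llr - bj)))

    return {"hc": float(hc), "bj": float(bj), "log_alr": float(log_alr)}
```

Calling the function on uniform p-values and again after replacing a small fraction with very small values should show all three statistics increasing. Calibrating critical values would still require simulation under the null or the asymptotic approximations discussed in the abstract.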

Chapter information

Banerjee, M., Bunea, F., Huang, J., Koltchinskii, V., and Maathuis, M. H., eds., From Probability to Statistics and Back: High-Dimensional Models and Processes -- A Festschrift in Honor of Jon A. Wellner (Beachwood, Ohio, USA: Institute of Mathematical Statistics, 2013), 317–326

First available in Project Euclid: 8 March 2013

Primary: 60G30: Continuity and singularity of induced measures
Secondary: 60G32

Average likelihood ratio; sparse mixture; higher criticism; Berk–Jones statistic; log-likelihood ratio transformation

Copyright © 2010, Institute of Mathematical Statistics


Walther, Guenther. The average likelihood ratio for large-scale multiple testing and detecting sparse mixtures. From Probability to Statistics and Back: High-Dimensional Models and Processes -- A Festschrift in Honor of Jon A. Wellner, 317--326, Institute of Mathematical Statistics, Beachwood, Ohio, USA, 2013. doi:10.1214/12-IMSCOLL923.


  • [1] Arias-Castro, E., Donoho, D. L. and Huo, X. (2005). Near-optimal detection of geometric objects by fast multiscale methods. IEEE Trans. Inform. Th. 51 2402–2425.
  • [2] Berk, R. H. and Jones, D. H. (1979). Goodness-of-fit test statistics that dominate the Kolmogorov statistics. Z. Wahrsch. Verw. Gebiete. 47 47–59.
  • [3] Burnashev, M. V. and Begmatov, I. A. (1990). On a problem of detecting a signal that leads to stable distributions. Theory Probab. Appl. 35 556–560.
  • [4] Chan, H. P. (2009). Detection of spatial clustering with average likelihood ratio test statistics. Ann. Statist. 37 3985–4010.
  • [5] Chan, H. P. and Walther, G. (2011). Detection with the scan and the average likelihood ratio. Manuscript.
  • [6] Donoho, D. and Jin, J. (2004). Higher criticism for detecting sparse heterogeneous mixtures. Ann. Statist. 32 962–994.
  • [7] Dümbgen, L. (1998). New goodness-of-fit tests and their application to nonparametric confidence sets. Ann. Statist. 26 288–314.
  • [8] Eicker, F. (1979). The asymptotic distribution of the suprema of the standardized empirical processes. Ann. Statist. 7 116–138.
  • [9] Gangnon, R. E. and Clayton, M. K. (2001). The weighted average likelihood ratio test for spatial disease clustering. Stat. Med. 20 2977–2987.
  • [10] Hall, P. (1979). On the rate of convergence of normal extremes. J. Appl. Probab. 16 433–439.
  • [11] Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58 13–30.
  • [12] Ingster, Y. I. (1997). Some problems of hypothesis testing leading to infinitely divisible distributions. Math. Methods Statist. 6 47–69.
  • [13] Ingster, Y. I. (1998). Minimax detection of a signal for $l^n$-balls. Math. Methods Statist. 7 401–428.
  • [14] Jaeschke, D. (1979). The asymptotic distribution of the supremum of the standardized empirical distribution function on subintervals. Ann. Statist. 7 108–115.
  • [15] Jager, L. and Wellner, J. A. (2007). Goodness-of-fit tests via phi-divergences. Ann. Statist. 35 2018–2053.
  • [16] Jin, J. (2004). Detecting a target in very noisy data from multiple looks. IMS Monograph. 45 255–286.
  • [17] Lehmann, E. L. and Romano, J. P. (2005). Testing Statistical Hypotheses, Third Edition, Springer, New York.
  • [18] Shiryaev, A. N. (1963). On optimum methods in quickest detection problems. Theory Probab. Appl. 8 22–46.
  • [19] Shorack, G. R. and Wellner, J. A. (1986). Empirical Processes with Applications to Statistics. Wiley, New York.
  • [20] Siegmund, D. (2001). Is peak height sufficient? Genet. Epidemiol. 20 403–408.
  • [21] Walther, G. (2010). Optimal and fast detection of spatial clusters with scan statistics. Ann. Statist. 38 1010–1033.
  • [22] Wellner, J. A. (2006). Goodness of fit via phi-divergences: A new family of test statistics. Talk at Northwest Probability Seminar. University of Washington, Seattle. October 22, 2006.
  • [23] Wellner, J. A. and Koltchinskii, V. (2003). A note on the asymptotic distribution of Berk–Jones type statistics under the null hypothesis. In High Dimensional Probability III (J. Hoffmann-Jorgensen, M. B. Marcus and J. A. Wellner, eds.) 321–332. Birkhäuser, Basel.