Electronic Journal of Statistics

The sparse Poisson means model

Ery Arias-Castro and Meng Wang

Full-text: Open access

Abstract

We consider the problem of detecting a sparse Poisson mixture. Our results parallel those for the detection of a sparse normal mixture, pioneered by Ingster (1997) and Donoho and Jin (2004), when the Poisson means are larger than logarithmic in the sample size. In particular, a form of higher criticism achieves the detection boundary in the whole sparse regime. When the Poisson means are smaller than logarithmic in the sample size, a different regime arises in which simple multiple testing with Bonferroni correction is enough in the sparse regime. We present some numerical experiments that confirm our theoretical findings.
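As a rough illustration of the two procedures named in the abstract, the following Python sketch computes one-sided Poisson p-values and applies a Bonferroni test and a higher criticism statistic. It assumes the null means are known and uses the p-value form of higher criticism from Donoho and Jin (2004), which is not necessarily the exact variant analyzed in the paper. The function names are hypothetical, and the tail computation is only adequate for small to moderate means.

```python
import math


def poisson_upper_pvalue(x, lam):
    """One-sided p-value P(Poisson(lam) >= x), computed as 1 - P(Poisson(lam) < x).

    Assumption: lam is the known null mean. The subtraction loses accuracy
    deep in the upper tail, so this is a sketch rather than a robust routine.
    """
    if x <= 0:
        return 1.0
    term = math.exp(-lam)  # Poisson pmf at k = 0
    cdf = 0.0
    for k in range(x):
        cdf += term
        term *= lam / (k + 1)  # pmf at k + 1 from pmf at k
    return max(0.0, 1.0 - cdf)


def bonferroni_rejects(pvalues, alpha=0.05):
    """Reject the global null if the smallest p-value is at most alpha / n."""
    return min(pvalues) <= alpha / len(pvalues)


def higher_criticism(pvalues):
    """Higher criticism statistic in the p-value form of Donoho and Jin (2004).

    Maximizes sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))) over the
    sorted p-values, skipping values equal to 0 or 1 (which occur with
    discrete data) to avoid dividing by zero.
    """
    n = len(pvalues)
    best = -math.inf
    for i, p in enumerate(sorted(pvalues), start=1):
        if 0.0 < p < 1.0:
            z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
            best = max(best, z)
    return best


if __name__ == "__main__":
    # Hypothetical data: five counts with a common null mean of 3,
    # one of which is unusually large.
    counts = [3, 2, 4, 12, 3]
    null_means = [3.0] * len(counts)
    pvals = [poisson_upper_pvalue(x, lam) for x, lam in zip(counts, null_means)]
    print("Bonferroni rejects:", bonferroni_rejects(pvals))
    print("Higher criticism:", higher_criticism(pvals))
```

Bonferroni looks only at the single most extreme count, whereas higher criticism scans all the sorted p-values. This matches the abstract's contrast between the regime where Bonferroni suffices and the regime where higher criticism attains the detection boundary.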

Article information

Source
Electron. J. Statist., Volume 9, Number 2 (2015), 2170-2201.

Dates
Received: August 2014
First available in Project Euclid: 5 October 2015

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1444053023

Digital Object Identifier
doi:10.1214/15-EJS1066

Mathematical Reviews number (MathSciNet)
MR3406276

Zentralblatt MATH identifier
1337.62088

Keywords
Sparse Poisson means model; goodness-of-fit tests; multiple testing; Bonferroni’s method; Fisher’s method; Pearson’s chi-squared test; Tukey’s higher criticism; sparse normal means model

Citation

Arias-Castro, Ery; Wang, Meng. The sparse Poisson means model. Electron. J. Statist. 9 (2015), no. 2, 2170–2201. doi:10.1214/15-EJS1066. https://projecteuclid.org/euclid.ejs/1444053023


References

  • Arias-Castro, E. and Wang, M. (2013). Distribution-free tests for sparse heterogeneous mixtures. Preprint, arXiv:1308.0346.
  • Butucea, C. and Ingster, Y. I. (2013). Detection of a sparse submatrix of a high-dimensional noisy matrix. Bernoulli 19 2652–2688.
  • Cai, T. T., Jeng, X. J. and Jin, J. (2011). Optimal detection of heterogeneous and heteroscedastic mixtures. J. R. Stat. Soc. Ser. B Stat. Methodol. 73 629–662.
  • DasGupta, A. (2008). Asymptotic theory of statistics and probability. Springer.
  • Dembo, A. and Zeitouni, O. (1998). Large deviations techniques and applications 38, Second ed. Springer-Verlag, New York.
  • Donoho, D. and Jin, J. (2004). Higher criticism for detecting sparse heterogeneous mixtures. Ann. Statist. 32 962–994.
  • Dunne, A., Pawitan, Y. and Doody, L. (1996). Two-sided P-values from discrete asymmetric distributions based on uniformly most powerful unbiased tests. Statistician 45 397–405.
  • Holst, L. (1972). Asymptotic normality and efficiency for certain goodness-of-fit tests. Biometrika 59 137–145.
  • Ingster, Y. I. (1997). Some problems of hypothesis testing leading to infinitely divisible distributions. Math. Methods Statist. 6 47–69.
  • Lehmann, E. L. and Romano, J. P. (2005). Testing statistical hypotheses, Third ed. Springer Texts in Statistics. Springer, New York.
  • Marioni, J. C., Mason, C. E., Mane, S. M., Stephens, M. and Gilad, Y. (2008). RNA-seq: An assessment of technical reproducibility and comparison with gene expression arrays. Genome Research 18 1509–1517.
  • Meinshausen, N. and Rice, J. (2006). Estimating the proportion of false null hypotheses among a large number of independently tested hypotheses. Ann. Statist. 34 373–393.
  • Morris, C. (1975). Central limit theorems for multinomial sums. Ann. Statist. 3 165–188.
  • Mukherjee, R., Pillai, N. S., Lin, X. et al. (2015). Hypothesis testing for high-dimensional sparse binary regression. Ann. Statist. 43 352–381.
  • Shorack, G. R. and Wellner, J. A. (1986). Empirical processes with applications to statistics. John Wiley & Sons Inc., New York.