## Electronic Journal of Statistics

### Detection of sparse mixtures: higher criticism and scan statistic

#### Abstract

We consider the problem of detecting a sparse mixture, as studied by Ingster (1997) and Donoho and Jin (2004), for a wide array of base distributions. In particular, we study the case where the base distribution has polynomial tails, which has received little attention in the literature. Perhaps surprisingly, we find that for such a power-law distribution the higher criticism does not achieve the detection boundary, whereas the scan statistic does.
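For readers unfamiliar with the higher criticism, the following is a minimal sketch of the statistic in the standardized form usually attributed to Donoho and Jin (2004), computed from p-values. It is an illustration only and is not taken from the article: the function names, the truncation fraction `alpha0`, the standard normal base distribution, and the simulation parameters are all assumptions made for the example.

```python
import math
import random


def higher_criticism(pvalues, alpha0=0.5):
    """Higher criticism computed from p-values.

    Uses the standardized form
        max over i <= alpha0 * n of
        sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))),
    where p_(1) <= ... <= p_(n) are the sorted p-values.
    alpha0 is an assumed truncation fraction, not a value from the article.
    """
    n = len(pvalues)
    p_sorted = sorted(pvalues)
    k_max = max(1, int(alpha0 * n))
    best = -math.inf
    for i in range(1, k_max + 1):
        p = p_sorted[i - 1]
        if p <= 0.0 or p >= 1.0:
            continue  # skip degenerate p-values to avoid dividing by zero
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


def upper_tail_pvalue(x):
    """One-sided p-value P(Z > x) for a standard normal Z (assumed base distribution)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def simulate_hc(n, eps, mu, seed):
    """Draw n observations from (1 - eps) * N(0, 1) + eps * N(mu, 1) and return HC."""
    rng = random.Random(seed)
    xs = [rng.gauss(mu if rng.random() < eps else 0.0, 1.0) for _ in range(n)]
    return higher_criticism([upper_tail_pvalue(x) for x in xs])


if __name__ == "__main__":
    # Hypothetical settings chosen only for illustration.
    print("null        :", simulate_hc(n=10000, eps=0.0, mu=0.0, seed=1))
    print("sparse shift:", simulate_hc(n=10000, eps=0.01, mu=3.0, seed=1))
```

The sketch uses a normal base distribution only because its p-values are easy to compute with the standard library; it does not reproduce the polynomial-tail setting or the scan statistic analyzed in the article.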

#### Article information

Source
Electron. J. Statist., Volume 13, Number 1 (2019), 208-230.

Dates
First available in Project Euclid: 16 January 2019

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1547607852

Digital Object Identifier
doi:10.1214/18-EJS1512

Mathematical Reviews number (MathSciNet)
MR3899951

Zentralblatt MATH identifier
1411.62161

#### Citation

Arias-Castro, Ery; Ying, Andrew. Detection of sparse mixtures: higher criticism and scan statistic. Electron. J. Statist. 13 (2019), no. 1, 208–230. doi:10.1214/18-EJS1512. https://projecteuclid.org/euclid.ejs/1547607852

#### References

• Anderson, T. W. and Darling, D. A. (1952). Asymptotic theory of certain “goodness of fit” criteria based on stochastic processes. The Annals of Mathematical Statistics 193–212.
• Arias-Castro, E. and Chen, S. (2017). Distribution-free multiple testing. Electronic Journal of Statistics 11 1983–2001.
• Arias-Castro, E., Donoho, D. L. and Huo, X. (2005). Near-optimal detection of geometric objects by fast multiscale methods. IEEE Transactions on Information Theory 51 2402–2425.
• Arias-Castro, E. and Wang, M. (2017). Distribution-free tests for sparse heterogeneous mixtures. TEST 26 71–94.
• Benjamini, Y. and Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological) 57 289–300.
• Berk, R. H. and Jones, D. H. (1979). Goodness-of-fit test statistics that dominate the Kolmogorov statistics. Probability Theory and Related Fields 47 47–59.
• Cai, T. T., Jeng, X. J. and Jin, J. (2011). Optimal detection of heterogeneous and heteroscedastic mixtures. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 629–662.
• Cai, T. T., Jin, J. and Low, M. G. (2007). Estimation and confidence sets for sparse normal mixtures. The Annals of Statistics 35 2421–2449.
• Cai, T. T. and Wu, Y. (2014). Optimal Detection of Sparse Mixtures Against a Given Null Distribution. IEEE Transactions on Information Theory 60 2217–2232.
• Chen, S. and Arias-Castro, E. (2017). Sequential Multiple Testing. arXiv preprint arXiv:1705.10190.
• Chen, S., Ying, A. and Arias-Castro, E. (2018). A Scan Procedure for Multiple Testing. arXiv preprint arXiv:1808.00631.
• Donoho, D. and Jin, J. (2004). Higher criticism for detecting sparse heterogeneous mixtures. The Annals of Statistics 32 962–994.
• Genovese, C. and Wasserman, L. (2002). Operating characteristics and extensions of the false discovery rate procedure. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64 499–517.
• Genovese, C. and Wasserman, L. (2004). A stochastic process approach to false discovery control. The Annals of Statistics 1035–1061.
• Gibbons, J. D. and Chakraborti, S. (2011). Nonparametric Statistical Inference. Springer.
• Huber, P. J. and Ronchetti, E. M. (2009). Robust Statistics. John Wiley & Sons.
• Ingster, Y. I. (1997). Some problems of hypothesis testing leading to infinitely divisible distributions. Mathematical Methods of Statistics 6 47–69.
• Jaeschke, D. (1979). The Asymptotic Distribution of the Supremum of the Standardized Empirical Distribution Function on Subintervals. The Annals of Statistics 7 108–115.
• Jin, J., Starck, J.-L., Donoho, D. L., Aghanim, N. and Forni, O. (2005). Cosmological non-Gaussian signature detection: Comparing performance of different statistical tests. EURASIP Journal on Advances in Signal Processing 2005 297184.
• Kabluchko, Z. (2011). Extremes of the standardized Gaussian noise. Stochastic Processes and their Applications 121 515–533.
• Kulldorff, M. (1997). A spatial scan statistic. Communications in Statistics: Theory and Methods 26 1481–1496.
• Moscovich, A., Nadler, B. and Spiegelman, C. (2016). On the exact Berk-Jones statistics and their $p$-value calculation. Electronic Journal of Statistics 10 2329–2354.
• Naus, J. I. (1965). The distribution of the size of the maximum cluster of points on a line. Journal of the American Statistical Association 60 532–538.
• Sharpnack, J. and Arias-Castro, E. (2016). Exact asymptotics for the scan statistic and fast alternatives. Electronic Journal of Statistics 10 2641–2684.
• Stouffer, S. A., Suchman, E. A., DeVinney, L. C., Star, S. A. and Williams Jr, R. M. (1949). The American Soldier, Vol 1: Adjustment During Army Life. Princeton University Press.
• Tippett, L. H. C. (1931). Methods of Statistics. Williams & Norgate: London.