The Annals of Statistics

Signal detection in high dimension: The multispiked case

Alexei Onatski, Marcelo J. Moreira, and Marc Hallin

Full-text: Open access

Abstract

This paper applies Le Cam’s asymptotic theory of statistical experiments to the signal detection problem in high dimension. We consider the problem of testing the null hypothesis of sphericity of a high-dimensional covariance matrix against an alternative of (unspecified) multiple symmetry-breaking directions (multispiked alternatives). Simple analytical expressions are derived for the Gaussian asymptotic power envelope and for the asymptotic powers of previously proposed tests. Those asymptotic powers remain valid for non-Gaussian data satisfying mild moment restrictions. They appear to lie very substantially below the Gaussian power envelope, at least for small values of the number of symmetry-breaking directions. In contrast, the asymptotic powers of Gaussian likelihood ratio tests based on the eigenvalues of the sample covariance matrix are shown to be very close to the envelope. Although based on Gaussian likelihoods, those tests remain valid under non-Gaussian densities satisfying mild moment conditions. The results of this paper extend the findings of an earlier study [Ann. Statist. 41 (2013) 1204–1231] of the single-spiked case to the case of multispiked alternatives and possibly non-Gaussian densities. The methods used here, however, are entirely new, as the Laplace approximation methods considered in the single-spiked context do not extend to the multispiked case.
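
To make the setting concrete, the following is a minimal Python sketch of the testing problem described above, not the authors' procedure. It assumes mean-zero Gaussian data, unit noise variance, and a covariance of the form I_p + V diag(h) V' with orthonormal spike directions V; the function names, the sample sizes and the spike values are all hypothetical choices made for illustration. It computes the eigenvalues of the sample covariance matrix, on which the likelihood ratio tests mentioned in the abstract are based, together with the sphericity statistic of John (1971) in the form used by Ledoit and Wolf (2002), as one example of a previously proposed test.

```python
import numpy as np


def spiked_sample(n, p, spikes, rng):
    """Draw n mean-zero Gaussian rows in R^p with covariance I_p + V diag(spikes) V'.

    V holds random orthonormal directions; an empty `spikes` gives the spherical null.
    (Illustrative helper; names and model normalisation are assumptions.)
    """
    Z = rng.standard_normal((n, p))
    r = len(spikes)
    if r == 0:
        return Z
    # Orthonormal spike directions from a QR decomposition of a Gaussian matrix.
    V, _ = np.linalg.qr(rng.standard_normal((p, r)))
    # Square root of the covariance is I + V diag(sqrt(1 + h) - 1) V'.
    scale = np.sqrt(1.0 + np.asarray(spikes, dtype=float)) - 1.0
    return Z + ((Z @ V) * scale) @ V.T


def sample_eigenvalues(X):
    """Eigenvalues of the sample covariance X'X / n, sorted in decreasing order."""
    n = X.shape[0]
    return np.linalg.eigvalsh(X.T @ X / n)[::-1]


def john_statistic(eigs):
    """John's statistic U = (1/p) tr[(S / ((1/p) tr S) - I)^2], written via eigenvalues."""
    normalised = eigs / eigs.mean()
    return float(np.mean((normalised - 1.0) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 200, 50                      # hypothetical sample size and dimension
    null_eigs = sample_eigenvalues(spiked_sample(n, p, [], rng))
    alt_eigs = sample_eigenvalues(spiked_sample(n, p, [5.0, 3.0], rng))
    print("largest eigenvalue (null, alternative):", null_eigs[0], alt_eigs[0])
    print("John statistic (null, alternative):",
          john_statistic(null_eigs), john_statistic(alt_eigs))
```

The spikes in this sketch are deliberately large so that the difference between the two samples is easy to see; the paper itself concerns asymptotic power in the high-dimensional regime, which a single simulated draw does not reproduce.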

Article information

Source
Ann. Statist., Volume 42, Number 1 (2014), 225-254.

Dates
First available in Project Euclid: 19 March 2014

Permanent link to this document
https://projecteuclid.org/euclid.aos/1395234977

Digital Object Identifier
doi:10.1214/13-AOS1181

Mathematical Reviews number (MathSciNet)
MR3189485

Zentralblatt MATH identifier
1296.62123

Subjects
Primary: 62H15: Hypothesis testing; 62B15: Theory of statistical experiments
Secondary: 41A60: Asymptotic approximations, asymptotic expansions (steepest descent, etc.) [See also 30E15]

Keywords
Sphericity tests; large dimensionality; asymptotic power; spiked covariance; contiguity; power envelope

Citation

Onatski, Alexei; Moreira, Marcelo J.; Hallin, Marc. Signal detection in high dimension: The multispiked case. Ann. Statist. 42 (2014), no. 1, 225–254. doi:10.1214/13-AOS1181. https://projecteuclid.org/euclid.aos/1395234977



References

  • Azaïs, J.-M. and Wschebor, M. (2002). The distribution of the maximum of a Gaussian process: Rice method revisited. In In and Out of Equilibrium (Mambucaba, 2000). Progress in Probability 51 321–348. Birkhäuser, Boston, MA.
  • Bai, Z. D. and Silverstein, J. W. (2004). CLT for linear spectral statistics of large-dimensional sample covariance matrices. Ann. Probab. 32 553–605.
  • Bai, Z., Jiang, D., Yao, J.-F. and Zheng, S. (2009). Corrections to LRT on large-dimensional covariance matrix by RMT. Ann. Statist. 37 3822–3840.
  • Baik, J., Ben Arous, G. and Péché, S. (2005). Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. Ann. Probab. 33 1643–1697.
  • Baik, J. and Silverstein, J. W. (2006). Eigenvalues of large sample covariance matrices of spiked population models. J. Multivariate Anal. 97 1382–1408.
  • Cai, T. T. and Ma, Z. (2013). Optimal hypothesis testing for high-dimensional covariance matrices. Bernoulli 19 2359–2388.
  • Cai, T. T., Ma, Z. and Wu, Y. (2013). Optimal estimation and rank detection for sparse spiked covariance matrices. Available at arXiv:1305.3235.
  • Chen, S. X., Zhang, L.-X. and Zhong, P.-S. (2010). Tests for high-dimensional covariance matrices. J. Amer. Statist. Assoc. 105 810–819.
  • Féral, D. and Péché, S. (2009). The largest eigenvalues of sample covariance matrices for a spiked population: Diagonal case. J. Math. Phys. 50 073302.
  • Guionnet, A. and Maïda, M. (2005). A Fourier view on the $R$-transform and related asymptotics of spherical integrals. J. Funct. Anal. 222 435–490.
  • James, A. T. (1964). Distributions of matrix variates and latent roots derived from normal samples. Ann. Math. Statist. 35 475–501.
  • John, S. (1971). Some optimal multivariate tests. Biometrika 58 123–127.
  • Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. 29 295–327.
  • Ledoit, O. and Wolf, M. (2002). Some hypothesis tests for the covariance matrix when the dimension is large compared to the sample size. Ann. Statist. 30 1081–1102.
  • Mo, M. Y. (2012). Rank 1 real Wishart spiked model. Comm. Pure Appl. Math. 65 1528–1638.
  • Onatski, A. (2014). Detection of weak signals in high-dimensional complex-valued data. Random Matrices: Theory and Applications. To appear.
  • Onatski, A., Moreira, M. J. and Hallin, M. (2013). Asymptotic power of sphericity tests for high-dimensional data. Ann. Statist. 41 1204–1231.
  • Onatski, A., Moreira, M. J. and Hallin, M. (2014). Supplement to “Signal detection in high dimension: The multispiked case.” DOI:10.1214/13-AOS1181SUPP.
  • Péché, S. (2009). Universality results for the largest eigenvalues of some sample covariance matrix ensembles. Probab. Theory Related Fields 143 481–516.
  • Schott, J. R. (2006). A high-dimensional test for the equality of the smallest eigenvalues of a covariance matrix. J. Multivariate Anal. 97 827–843.
  • Silverstein, J. W. and Bai, Z. D. (1995). On the empirical distribution of eigenvalues of a class of large-dimensional random matrices. J. Multivariate Anal. 54 175–192.
  • Srivastava, M. S. (2005). Some tests concerning the covariance matrix in high dimensional data. J. Japan Statist. Soc. 35 251–272.
  • Uhlig, H. (1994). On singular Wishart and singular multivariate beta distributions. Ann. Statist. 22 395–405.
  • van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Univ. Press, Cambridge.
  • Wang, D. (2012). The largest eigenvalue of real symmetric, Hermitian and Hermitian self-dual random matrix models with rank one external source, Part I. J. Stat. Phys. 146 719–761.

Supplemental materials

  • Supplementary material: Appendix to “Signal detection in high dimension: The multispiked case”. This supplement [Onatski, Moreira and Hallin (2014)] provides an extended version of the mathematical appendix above, including Sections A.2–A.4, A.6–A.7 and A.10–A.13.