April 2022 Statistical inference for principal components of spiked covariance matrices
Zhigang Bao, Xiucai Ding, Jingming Wang, Ke Wang
Ann. Statist. 50(2): 1144-1169 (April 2022). DOI: 10.1214/21-AOS2143

Abstract

In this paper, we study the asymptotic behavior of the extreme eigenvalues and eigenvectors of high-dimensional spiked sample covariance matrices, in the supercritical case when a reliable detection of spikes is possible. In particular, we derive the joint distribution of the extreme eigenvalues and the generalized components of the associated eigenvectors, that is, the projections of the eigenvectors onto an arbitrary given direction, assuming that the dimension and sample size are comparably large. In general, the joint distribution is given in terms of linear combinations of finitely many Gaussian and Chi-square variables, with parameters depending on the projection direction and the spikes. Our assumption on the spikes is fully general. First, the strengths of spikes are only required to be slightly above the critical threshold and no upper bound on the strengths is needed. Second, multiple spikes, that is, spikes with the same strength, are allowed. Third, no structural assumption is imposed on the spikes. Thanks to the general setting, we can then apply the results to various high-dimensional statistical hypothesis testing problems involving both the eigenvalues and eigenvectors. Specifically, we propose accurate and powerful statistics to conduct hypothesis testing on the principal components. These statistics are data-dependent and adaptive to the underlying true spikes. Numerical simulations also confirm the accuracy and power of our proposed statistics and illustrate significantly better performance compared to the existing methods in the literature. In particular, our methods are accurate and powerful even when either the spikes are small or the dimension is large.
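
To make the supercritical setting concrete, the following is a minimal simulation sketch and not the paper's procedure. It assumes a single spike with covariance I + d·e₁e₁ᵀ, Gaussian data, and the convention that the spike is supercritical when d > √(p/n). It compares the largest sample eigenvalue and the squared projection of its eigenvector onto the spike direction with the classical first-order limits (1 + d)(1 + c/d) and (1 − c/d²)/(1 + c/d), where c = p/n. The function name `simulate_spike` and all parameter values are illustrative assumptions; the joint fluctuation results described in the abstract are not reproduced here.

```python
import numpy as np

def simulate_spike(p=200, n=400, d=2.0, seed=0):
    """Simulate one rank-one spiked sample covariance matrix.

    Assumed model (illustrative, not taken from the paper):
    population covariance = I_p + d * e_1 e_1^T, Gaussian samples.
    Returns the top sample eigenvalue, the squared projection of the
    top sample eigenvector onto e_1, and their classical limits.
    """
    rng = np.random.default_rng(seed)

    # Spike direction: first coordinate axis.
    v = np.zeros(p)
    v[0] = 1.0

    # n samples with covariance I + d * v v^T: scale the first coordinate.
    X = rng.standard_normal((n, p))
    X[:, 0] *= np.sqrt(1.0 + d)

    # Sample covariance matrix and its spectral decomposition
    # (eigh returns eigenvalues in ascending order).
    S = X.T @ X / n
    eigvals, eigvecs = np.linalg.eigh(S)
    top_val = eigvals[-1]
    top_vec = eigvecs[:, -1]

    # Classical first-order limits, valid when d > sqrt(c).
    c = p / n
    lam_limit = (1.0 + d) * (1.0 + c / d)
    cos2_limit = (1.0 - c / d**2) / (1.0 + c / d)

    return top_val, float(v @ top_vec) ** 2, lam_limit, cos2_limit

if __name__ == "__main__":
    top_val, cos2, lam_limit, cos2_limit = simulate_spike()
    print(f"top eigenvalue: {top_val:.3f} (limit {lam_limit:.3f})")
    print(f"squared projection: {cos2:.3f} (limit {cos2_limit:.3f})")
```

With the default values, c = 0.5 and d = 2 lies well above the threshold √0.5, so the empirical quantities should be close to their limits up to finite-sample fluctuations.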

Funding Statement

The first author was partially supported by Hong Kong RGC grant GRF 16301519 and NSFC 11871425. The second author is partially supported by NSF-DMS 2113489 and is grateful for the AMS-Simons Travel Grants (2020–2022). The third author was partially supported by Hong Kong RGC grants ECS 26301517 and GRF 16300618. The fourth author was partially supported by Hong Kong RGC grants GRF 16301618, GRF 16308219 and ECS 26304920.

Acknowledgments

The second author would like to thank Igor Silin for sharing the Python codes of [75] and providing some insights on the statistical applications. We would also like to thank Alexander Aue, Jiang Hu, Zeng Li, Debashis Paul, Dong Xia, Yanrong Yang, Jeff Yao and Lin Zhang for many helpful discussions.

Citation

Zhigang Bao, Xiucai Ding, Jingming Wang, Ke Wang. "Statistical inference for principal components of spiked covariance matrices." Ann. Statist. 50 (2) 1144-1169, April 2022. https://doi.org/10.1214/21-AOS2143

Information

Received: 1 September 2020; Revised: 1 October 2021; Published: April 2022
First available in Project Euclid: 7 April 2022

MathSciNet: MR4404931
zbMATH: 1486.62180
Digital Object Identifier: 10.1214/21-AOS2143

Subjects:
Primary: 60B20, 62G10
Secondary: 15B52, 62H10, 62H25

Keywords: adaptive estimator, eigenvector, principal component, random matrix, sample covariance matrix, spiked model

Rights: Copyright © 2022 Institute of Mathematical Statistics
