Universal rank inference via residual subsampling with application to large networks

Xiao Han; Qing Yang; Yingying Fan

doi:10.1214/23-AOS2282

Abstract

Determining the precise rank is an important problem in many large-scale applications with matrix data exploiting low-rank plus noise models. In this paper, we suggest a universal approach to rank inference via residual subsampling (RIRS) for testing and estimating rank in a wide family of models, including many popular network models, such as the degree-corrected mixed membership model, as special cases. Our procedure constructs a test statistic by subsampling entries of the residual matrix after extracting the spiked components. The test statistic converges in distribution to the standard normal under the null hypothesis and diverges to infinity with asymptotic probability one under the alternative hypothesis. The effectiveness of the RIRS procedure is justified theoretically, utilizing the asymptotic expansions of eigenvectors and eigenvalues for large random matrices recently developed in (J. Amer. Statist. Assoc. 117 (2022) 996–1009) and (J. R. Stat. Soc. Ser. B. Stat. Methodol. 84 (2022) 630–653). The advantages of the newly suggested procedure are demonstrated through several simulation and real data examples.
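To make the idea of residual subsampling concrete, the following Python sketch illustrates one way such a statistic could be computed for a symmetric data matrix. It is an illustration under stated assumptions, not the authors' implementation: the function name `rirs_statistic`, the Bernoulli sampling probability `p`, the restriction to strictly upper-triangular entries, and the self-normalization by the residual sum of squares are all choices made here and are not specified in the abstract. The paper itself should be consulted for the exact statistic and the admissible range of the sampling rate.

```python
import numpy as np


def rirs_statistic(X, k0, p, rng):
    """Illustrative residual-subsampling statistic for H0: rank = k0.

    Assumptions (not from the abstract): X is symmetric, entries are
    subsampled independently with probability p, and the sum is
    normalized by the residual sum of squares.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Extract the k0 spiked components: eigenpairs with largest |eigenvalue|.
    vals, vecs = np.linalg.eigh(X)
    top = np.argsort(np.abs(vals))[::-1][:k0]
    low_rank = (vecs[:, top] * vals[top]) @ vecs[:, top].T

    # Residual matrix after removing the estimated low-rank part.
    resid = X - low_rank

    # Subsample strictly upper-triangular residual entries.
    iu = np.triu_indices(n, k=1)
    w = resid[iu]
    keep = rng.random(w.size) < p

    # Sum of sampled residuals, scaled by an estimate of its standard deviation.
    return w[keep].sum() / np.sqrt(p * np.sum(w ** 2))


# Toy example: an Erdos-Renyi-type adjacency matrix, whose mean has rank one.
rng = np.random.default_rng(0)
n, q = 300, 0.2
upper = np.triu(rng.random((n, n)) < q, k=1).astype(float)
X = upper + upper.T

t_null = rirs_statistic(X, k0=1, p=0.1, rng=rng)  # hypothesized rank is correct
t_alt = rirs_statistic(X, k0=0, p=0.1, rng=rng)   # hypothesized rank is too small
print(t_null, t_alt)
```

In this toy setting the statistic computed with the correct rank should be of moderate size, while the one computed with an underestimated rank should be large, mirroring the null and alternative behavior described in the abstract. Note that the sketch can lose power when the leftover spike has entries that nearly cancel in the sum, for example in a perfectly balanced two-community model.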

Funding Statement

Qing Yang and Xiao Han were partly supported by the National Natural Science Foundation of China (No. 12101585, No. 12001518). Qing Yang was also supported by the Young Elite Scientist Sponsorship Program by CAST (No. YESS20220125). Yingying Fan was supported by NSF Grant DMS-2052964.

Acknowledgments.

Yingying Fan and Qing Yang serve as co-corresponding authors. The authors would like to thank the anonymous referees, an Associate Editor and the Editor for their constructive comments that improved the quality of this paper.

Citation

Xiao Han, Qing Yang, Yingying Fan. "Universal rank inference via residual subsampling with application to large networks." Ann. Statist. 51 (3) 1109–1133, June 2023. https://doi.org/10.1214/23-AOS2282

Information

Received: 1 March 2021; Revised: 1 February 2023; Published: June 2023

First available in Project Euclid: 20 August 2023

MathSciNet: MR4630942

zbMATH: 07732741

Digital Object Identifier: 10.1214/23-AOS2282

Subjects:

Primary: 62F03, 62F12

Secondary: 60B20, 62F35

Keywords: asymptotic expansions, eigenvalues, eigenvectors, high dimensionality, large random matrices, low-rank models, rank inference, robustness
