Abstract
Determining the precise rank is an important problem in many large-scale applications with matrix data exploiting low-rank plus noise models. In this paper, we suggest a universal approach to rank inference via residual subsampling (RIRS) for testing and estimating rank in a wide family of models, which includes as special cases many popular network models such as the degree-corrected mixed membership model. Our procedure constructs a test statistic by subsampling entries of the residual matrix after extracting the spiked components. The test statistic converges in distribution to the standard normal under the null hypothesis, and diverges to infinity with asymptotic probability one under the alternative hypothesis. The effectiveness of the RIRS procedure is justified theoretically, utilizing the asymptotic expansions of eigenvectors and eigenvalues for large random matrices recently developed in (J. Amer. Statist. Assoc. 117 (2022) 996–1009) and (J. R. Stat. Soc. Ser. B. Stat. Methodol. 84 (2022) 630–653). The advantages of the newly suggested procedure are demonstrated through several simulation and real data examples.
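The idea of subsampling residual entries can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `residual_subsample_stat`, the symmetric Bernoulli(1/m) subsampling scheme, the normalization, and the toy rank-one data below are all assumptions made for illustration, and the paper should be consulted for the exact statistic and the conditions on the subsampling rate.

```python
import numpy as np

def residual_subsample_stat(X, k0, m, rng):
    """Illustrative statistic for H0: rank = k0 (hypothetical helper, not from the paper)."""
    n = X.shape[0]
    # Extract the k0 eigen-components with the largest eigenvalues in magnitude.
    vals, vecs = np.linalg.eigh(X)
    top = np.argsort(np.abs(vals))[::-1][:k0]
    low_rank = (vecs[:, top] * vals[top]) @ vecs[:, top].T
    W = X - low_rank  # residual matrix after removing the spiked part

    # Assumed scheme: keep each off-diagonal pair (i, j) with probability 1/m,
    # symmetrically, so entry (j, i) is kept whenever (i, j) is kept.
    upper = np.triu(rng.random((n, n)) < 1.0 / m, k=1)
    mask = upper | upper.T
    off_diag = ~np.eye(n, dtype=bool)

    # Normalize so that the statistic is roughly standard normal when the
    # residual behaves like independent mean-zero noise.
    numerator = np.sqrt(m) * W[mask].sum()
    denominator = np.sqrt(2.0 * (W[off_diag] ** 2).sum())
    return numerator / denominator

# Toy data (assumed): rank-one mean matrix plus symmetric Gaussian noise.
rng = np.random.default_rng(0)
n = 200
G = rng.normal(size=(n, n))
noise = (G + G.T) / np.sqrt(2.0)       # symmetric, unit-variance off-diagonal
X = 0.5 * np.ones((n, n)) + noise      # true rank of the mean matrix is 1

t_null = residual_subsample_stat(X, k0=1, m=10, rng=rng)   # correct rank
t_under = residual_subsample_stat(X, k0=0, m=10, rng=rng)  # rank too small
print(f"statistic at k0=1: {t_null:.2f}")
print(f"statistic at k0=0: {t_under:.2f}")
```

In this toy setting the statistic at the correct rank should be moderate in size, while the statistic at an underestimated rank should be large, since the unremoved low-rank signal inflates the sum of the subsampled residual entries. A rank estimate could then be obtained by testing k0 = 0, 1, 2, ... in turn and stopping at the first value that is not rejected.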
Funding Statement
Qing Yang and Xiao Han were partly supported by the National Natural Science Foundation of China (No. 12101585, No. 12001518). Qing Yang was also supported by the Young Elite Scientist Sponsorship Program by CAST (No. YESS20220125). Yingying Fan was supported by NSF Grant DMS-2052964.
Acknowledgments
Yingying Fan and Qing Yang serve as co-corresponding authors. The authors would like to thank the anonymous referees, an Associate Editor and the Editor for their constructive comments that improved the quality of this paper.
Citation
Xiao Han, Qing Yang, Yingying Fan. "Universal rank inference via residual subsampling with application to large networks." Ann. Statist. 51 (3) 1109–1133, June 2023. https://doi.org/10.1214/23-AOS2282
Information