Abstract
Principal component analysis (PCA) is a well-known tool in multivariate statistics. One significant challenge in using PCA is choosing the number of principal components. To address this challenge, we propose distribution-based methods with exact type I error control for hypothesis testing and for constructing confidence intervals for signals in a noisy matrix with finite samples. Assuming Gaussian noise, we derive exact type I error control from the conditional distribution of the singular values of a Gaussian matrix, using a post-selection inference framework and extending the approach of [Taylor, Loftus and Tibshirani (2013)] to the PCA setting. In simulation studies, we find that our proposed methods compare well to existing approaches.
Citation
Yunjin Choi. Jonathan Taylor. Robert Tibshirani. "Selecting the number of principal components: Estimation of the true rank of a noisy matrix." Ann. Statist. 45 (6) 2590 - 2617, December 2017. https://doi.org/10.1214/16-AOS1536
Information