Open Access
Selecting the number of principal components: Estimation of the true rank of a noisy matrix
Yunjin Choi, Jonathan Taylor, Robert Tibshirani
Ann. Statist. 45(6): 2590-2617 (December 2017). DOI: 10.1214/16-AOS1536

Abstract

Principal component analysis (PCA) is a well-known tool in multivariate statistics. One significant challenge in using PCA is choosing the number of principal components. To address this challenge, we propose distribution-based methods with exact type 1 error control for hypothesis testing and for constructing confidence intervals for signals in a noisy matrix with finite samples. Assuming Gaussian noise, we derive exact type 1 error control based on the conditional distribution of the singular values of a Gaussian matrix, using a post-selection inference framework and extending the approach of [Taylor, Loftus and Tibshirani (2013)] to the PCA setting. In simulation studies, we find that our proposed methods compare well to existing approaches.
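
The setting described in the abstract, a low-rank signal observed through Gaussian noise, can be illustrated with a short simulation. The sketch below is not the authors' test and computes no p-values; it only generates a matrix of known rank plus noise and prints its singular values, which are the quantities the proposed methods condition on. The function name `simulate_noisy_low_rank` and all parameter values (`n`, `p`, `rank`, `signal`, `sigma`) are assumptions chosen for illustration.

```python
import numpy as np


def simulate_noisy_low_rank(n=50, p=10, rank=2, signal=30.0, sigma=1.0, seed=0):
    """Return (X, B): X = B + Gaussian noise, where B has exactly `rank` nonzero singular values.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Orthonormal left and right factors obtained from QR decompositions.
    U, _ = np.linalg.qr(rng.standard_normal((n, rank)))
    V, _ = np.linalg.qr(rng.standard_normal((p, rank)))
    # Signal matrix: every nonzero singular value equals `signal`.
    B = signal * U @ V.T
    # Observed matrix: signal plus i.i.d. N(0, sigma^2) noise.
    X = B + sigma * rng.standard_normal((n, p))
    return X, B


X, B = simulate_noisy_low_rank()
# Singular values of the observed matrix, in descending order.
singular_values = np.linalg.svd(X, compute_uv=False)
print(np.round(singular_values, 2))
```

With a signal this strong relative to the noise, the first two singular values stand well apart from the rest. The harder case, where signal and noise singular values overlap, is the one in which a formal test of the rank becomes necessary.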

Citation

Yunjin Choi, Jonathan Taylor, Robert Tibshirani. "Selecting the number of principal components: Estimation of the true rank of a noisy matrix." Ann. Statist. 45(6): 2590-2617, December 2017. https://doi.org/10.1214/16-AOS1536

Information

Received: 1 May 2015; Revised: 1 November 2016; Published: December 2017
First available in Project Euclid: 15 December 2017

zbMATH: 06838144
MathSciNet: MR3737903
Digital Object Identifier: 10.1214/16-AOS1536

Subjects:
Primary: 62F03, 62J05, 62J07

Keywords: exact $p$-value, hypothesis test, principal components

Rights: Copyright © 2017 Institute of Mathematical Statistics
