Open Access
Convergence and prediction of principal component scores in high-dimensional settings
Seunggeun Lee, Fei Zou, Fred A. Wright
Ann. Statist. 38(6): 3605-3629 (December 2010). DOI: 10.1214/10-AOS821

Abstract

A number of settings arise in which it is of interest to predict Principal Component (PC) scores for new observations using data from an initial sample. In this paper, we demonstrate that naive approaches to PC score prediction can be substantially biased toward 0 in the analysis of large matrices. This phenomenon is largely related to known inconsistency results for sample eigenvalues and eigenvectors as both dimensions of the matrix increase. For the spiked eigenvalue model for random matrices, we expand the generality of these results, and propose bias-adjusted PC score prediction. In addition, we compute the asymptotic correlation coefficient between PC scores from sample and population eigenvectors. Simulation and real data examples from the genetics literature show the improved bias and numerical properties of our estimators.
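The shrinkage of naively predicted PC scores can be illustrated with a small simulation. The sketch below is not the authors' code or their bias-adjusted estimator; it only assumes a single-spike covariance diag(spike, 1, ..., 1) with a known zero mean, and the function name `simulate_shrinkage` and all parameter values are hypothetical choices made for illustration. It compares the root-mean-square size of first-PC scores in the training sample with that of scores obtained by projecting new observations onto the same sample eigenvector.

```python
import numpy as np


def simulate_shrinkage(n=100, p=2000, spike=50.0, n_new=500, seed=0):
    """Return (rms of naive new scores / rms of training scores,
    |alignment| of the sample eigenvector with the population eigenvector).

    Assumed model (illustrative only): independent Gaussian rows with
    covariance diag(spike, 1, ..., 1), so the first coordinate axis is
    the population principal component.
    """
    rng = np.random.default_rng(seed)

    # Per-coordinate standard deviations for the single-spike covariance.
    scale = np.ones(p)
    scale[0] = np.sqrt(spike)

    train = rng.standard_normal((n, p)) * scale
    new = rng.standard_normal((n_new, p)) * scale

    # The mean is known to be zero in this sketch, so no centering is done.
    # The leading right singular vector of the training matrix is the
    # leading eigenvector of the sample covariance train.T @ train / n.
    _, _, vt = np.linalg.svd(train, full_matrices=False)
    e1 = vt[0]

    train_scores = train @ e1  # scores of the observations used to fit e1
    new_scores = new @ e1      # naive prediction for unseen observations

    rms = lambda x: np.sqrt(np.mean(x ** 2))
    return rms(new_scores) / rms(train_scores), abs(e1[0])


if __name__ == "__main__":
    ratio, alignment = simulate_shrinkage()
    print(f"rms(new scores) / rms(training scores): {ratio:.3f}")
    print(f"|<sample eigenvector, population eigenvector>|: {alignment:.3f}")
```

With p much larger than n, the ratio should come out noticeably below 1, reflecting the bias toward 0 described in the abstract. The exact value depends on the assumed spike size, the dimensions, and the seed.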

Citation

Seunggeun Lee, Fei Zou, Fred A. Wright. "Convergence and prediction of principal component scores in high-dimensional settings." Ann. Statist. 38(6): 3605-3629, December 2010. https://doi.org/10.1214/10-AOS821

Information

Published: December 2010
First available in Project Euclid: 30 November 2010

zbMATH: 1204.62097
MathSciNet: MR2766862
Digital Object Identifier: 10.1214/10-AOS821

Subjects:
Primary: 62H25
Secondary: 15A18

Keywords: PC regression, PC scores, PCA, Random matrix

Rights: Copyright © 2010 Institute of Mathematical Statistics
