Open Access
Bi-Cross-Validation for Factor Analysis
Art B. Owen, Jingshu Wang
Statist. Sci. 31(1): 119-139 (February 2016). DOI: 10.1214/15-STS539

Abstract

Factor analysis is over a century old, but choosing the number of factors for a given data set remains problematic. We provide a systematic review of current methods and then introduce a method based on bi-cross-validation, using randomly held-out submatrices of the data to choose the optimal number of factors. We find that it performs better than many existing methods, especially when both the number of variables and the sample size are large and some of the factors are relatively weak. Our performance criterion is based on recovery of an underlying signal, equal to the product of the usual factor and loading matrices. Like previous comparisons, our work is simulation based. Recent advances in random matrix theory provide principled choices for the number of factors when the noise is homoscedastic, but not for the heteroscedastic case. The simulations we chose are designed using guidance from random matrix theory. In particular, we include factors that are asymptotically too small to detect, factors large enough to detect but not large enough to improve the estimate, and two classes of factors (weak and strong) large enough to be useful. We also find that a form of early stopping regularization improves the recovery of the signal matrix.
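The held-out-submatrix idea in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator. As a simplifying assumption it fits a plain truncated SVD to the retained block, with no early stopping regularization and no treatment of heteroscedastic noise. The function names `bcv_errors` and `choose_rank`, the 50% holdout fraction, and the number of repeats are hypothetical choices made for illustration. The data matrix is split into a held-out block A, two border blocks B and C, and a retained block D. A rank-k fit of D predicts A as B D_k^+ C, and the rank with the smallest average held-out error is selected.

```python
import numpy as np


def bcv_errors(X, k_max, n_repeats=20, holdout_frac=0.5, seed=0):
    """Average held-out squared error for each candidate rank 0..k_max.

    Illustrative sketch only: the truncated SVD fit on the retained block
    is an assumption, not the estimator used in the paper.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    n_hold = max(1, int(round(holdout_frac * n)))
    p_hold = max(1, int(round(holdout_frac * p)))
    if k_max > min(n - n_hold, p - p_hold):
        raise ValueError("k_max exceeds the size of the retained block")

    errors = np.zeros(k_max + 1)
    for _ in range(n_repeats):
        # Randomly choose which rows and columns are held out.
        rows = rng.permutation(n)
        cols = rng.permutation(p)
        r_out, r_in = rows[:n_hold], rows[n_hold:]
        c_out, c_in = cols[:p_hold], cols[p_hold:]

        A = X[np.ix_(r_out, c_out)]  # held-out block to be predicted
        B = X[np.ix_(r_out, c_in)]   # held-out rows, retained columns
        C = X[np.ix_(r_in, c_out)]   # retained rows, held-out columns
        D = X[np.ix_(r_in, c_in)]    # retained block used for fitting

        U, s, Vt = np.linalg.svd(D, full_matrices=False)
        errors[0] += np.sum(A ** 2)  # rank 0 predicts all zeros
        for k in range(1, k_max + 1):
            # Pseudo-inverse of the rank-k truncation of D.
            D_k_pinv = (Vt[:k].T / s[:k]) @ U[:, :k].T
            A_hat = B @ D_k_pinv @ C
            errors[k] += np.sum((A - A_hat) ** 2)

    return errors / n_repeats


def choose_rank(X, k_max, **kwargs):
    """Return the candidate rank with the smallest average held-out error."""
    return int(np.argmin(bcv_errors(X, k_max, **kwargs)))
```

Calling `choose_rank(X, k_max=6)` on an n-by-p data matrix `X` returns an integer between 0 and 6. Averaging over several random holdouts reduces the variability of the error curve, but in this simplified form the selected rank can still differ from the true number of factors, particularly when some factors are weak.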

Citation

Art B. Owen, Jingshu Wang. "Bi-Cross-Validation for Factor Analysis." Statist. Sci. 31 (1) 119-139, February 2016. https://doi.org/10.1214/15-STS539

Information

Published: February 2016
First available in Project Euclid: 10 February 2016

zbMATH: 06946215
MathSciNet: MR3458596
Digital Object Identifier: 10.1214/15-STS539

Keywords: parallel analysis, random matrix theory, scree plot, unwanted variation

Rights: Copyright © 2016 Institute of Mathematical Statistics

Vol. 31 • No. 1 • February 2016