The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 3, Number 2 (2009), 564-594.
Bi-cross-validation of the SVD and the nonnegative matrix factorization
This article presents a form of bi-cross-validation (BCV) for choosing the rank in outer product models, especially the singular value decomposition (SVD) and the nonnegative matrix factorization (NMF). Instead of leaving out a set of rows of the data matrix, we leave out a set of rows and a set of columns, and then predict the left-out entries by low-rank operations on the retained data. We prove a self-consistency result expressing the prediction error as a residual from a low-rank approximation. Random matrix theory and some empirical results suggest that smaller hold-out sets lead to more over-fitting, while larger ones are more prone to under-fitting. In simulated examples we find that a method leaving out half the rows and half the columns performs well.
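The hold-out scheme described in the abstract can be illustrated with a minimal sketch for the SVD case. The abstract does not spell out the prediction rule, so the sketch assumes one specific form: with the data matrix partitioned into a held-out block A, the blocks B and C that share its rows or its columns, and the fully retained block D, the held-out entries are predicted as B D_k^+ C, where D_k^+ is the pseudo-inverse of the rank-k truncated SVD of D. The function name `bcv_error` and its arguments are hypothetical, chosen for illustration only.

```python
import numpy as np

def bcv_error(X, k, row_idx, col_idx):
    """Squared error when predicting the held-out block X[row_idx, col_idx]
    from the retained data using a rank-k truncation (illustrative sketch)."""
    rows = np.zeros(X.shape[0], dtype=bool)
    cols = np.zeros(X.shape[1], dtype=bool)
    rows[row_idx] = True
    cols[col_idx] = True

    A = X[np.ix_(rows, cols)]     # held-out rows, held-out columns
    B = X[np.ix_(rows, ~cols)]    # held-out rows, retained columns
    C = X[np.ix_(~rows, cols)]    # retained rows, held-out columns
    D = X[np.ix_(~rows, ~cols)]   # retained rows, retained columns

    if k == 0:
        # Rank 0 predicts every held-out entry as zero.
        A_hat = np.zeros(A.shape)
    else:
        U, s, Vt = np.linalg.svd(D, full_matrices=False)
        # Pseudo-inverse of the rank-k truncation of D (assumed prediction rule).
        Dk_pinv = (Vt[:k].T / s[:k]) @ U[:, :k].T
        A_hat = B @ Dk_pinv @ C

    return float(np.sum((A - A_hat) ** 2))

# Usage: a noise-free rank-2 matrix, holding out half the rows and half the columns.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 16))
held_rows = np.arange(10)
held_cols = np.arange(8)
errors = [bcv_error(X, k, held_rows, held_cols) for k in range(3)]
best_rank = int(np.argmin(errors))
```

In this noise-free example the error at the true rank is zero up to rounding, so the candidate rank with the smallest error is 2. A practical version would presumably add up the errors over several hold-out blocks and would need care at ranks above the true one, where near-zero singular values of D are inverted.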
First available in Project Euclid: 22 June 2009
Owen, Art B.; Perry, Patrick O. Bi-cross-validation of the SVD and the nonnegative matrix factorization. Ann. Appl. Stat. 3 (2009), no. 2, 564-594. doi:10.1214/08-AOAS227. https://projecteuclid.org/euclid.aoas/1245676186