Open Access
June 2009
Bi-cross-validation of the SVD and the nonnegative matrix factorization
Art B. Owen, Patrick O. Perry
Ann. Appl. Stat. 3(2): 564-594 (June 2009). DOI: 10.1214/08-AOAS227

Abstract

This article presents a form of bi-cross-validation (BCV) for choosing the rank in outer product models, especially the singular value decomposition (SVD) and the nonnegative matrix factorization (NMF). Instead of leaving out a set of rows of the data matrix, we leave out a set of rows and a set of columns, and then predict the left-out entries by low-rank operations on the retained data. We prove a self-consistency result expressing the prediction error as a residual from a low-rank approximation. Random matrix theory and some empirical results suggest that smaller hold-out sets lead to more over-fitting, while larger ones are more prone to under-fitting. In simulated examples we find that a method leaving out half the rows and half the columns performs well.
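The hold-out scheme described in the abstract can be illustrated with a short sketch for the SVD case. It is not taken from the article's text as shown here: the block names A, B, C, D, the predictor B D_k^+ C (one possible "low-rank operation on the retained data"), and the function names `bcv_error` and `bcv_choose_rank` are all assumptions made for illustration. The default of two row folds and two column folds mirrors the "half the rows and half the columns" setting mentioned in the abstract.

```python
import numpy as np


def bcv_error(X, row_mask, col_mask, k):
    """Squared error when predicting the held-out block from the retained data.

    Assumed layout (illustrative, not quoted from the abstract):
        A = held-out rows x held-out columns (to be predicted)
        B = held-out rows x retained columns
        C = retained rows x held-out columns
        D = retained rows x retained columns
    The prediction is B @ pinv(D_k) @ C, with D_k the rank-k truncated SVD of D.
    """
    A = X[np.ix_(row_mask, col_mask)]
    B = X[np.ix_(row_mask, ~col_mask)]
    C = X[np.ix_(~row_mask, col_mask)]
    D = X[np.ix_(~row_mask, ~col_mask)]
    if k == 0:
        # A rank-0 model predicts zeros everywhere.
        return float(np.sum(A ** 2))
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    # Pseudo-inverse of the rank-k truncation: V_k diag(1/s_k) U_k^T.
    D_k_pinv = (Vt[:k].T / s[:k]) @ U[:, :k].T
    A_hat = B @ D_k_pinv @ C
    return float(np.sum((A - A_hat) ** 2))


def bcv_choose_rank(X, max_rank, n_folds=2, seed=0):
    """Sum held-out errors over all (row fold, column fold) pairs.

    max_rank must not exceed the smaller dimension of any retained block D.
    Returns the rank with the smallest total error and the error per rank.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    row_fold = rng.permutation(m) % n_folds
    col_fold = rng.permutation(n) % n_folds
    errors = np.zeros(max_rank + 1)
    for i in range(n_folds):
        for j in range(n_folds):
            for k in range(max_rank + 1):
                errors[k] += bcv_error(X, row_fold == i, col_fold == j, k)
    return int(np.argmin(errors)), errors
```

For a noise-free matrix of exact rank k whose retained block D also has rank k, the held-out residual at rank k is zero up to rounding, which gives a quick sanity check for a sketch like this one. Increasing `n_folds` shrinks each hold-out set, which the abstract associates with more over-fitting.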

Citation

Art B. Owen, Patrick O. Perry. "Bi-cross-validation of the SVD and the nonnegative matrix factorization." Ann. Appl. Stat. 3(2): 564-594, June 2009. https://doi.org/10.1214/08-AOAS227

Information

Published: June 2009
First available in Project Euclid: 22 June 2009

zbMATH: 1166.62047
MathSciNet: MR2578836
Digital Object Identifier: 10.1214/08-AOAS227

Keywords: cross-validation, principal components, random matrix theory, sample reuse, weak factor model

Rights: Copyright © 2009 Institute of Mathematical Statistics

Vol. 3 • No. 2 • June 2009