Source: Ann. Statist. Volume 40, Number 2
(2012), 1171-1197.
We analyze a class of estimators based on convex relaxation for solving high-dimensional matrix decomposition problems. The observations are noisy realizations of a linear transformation $\mathfrak{X}$ of the sum of an (approximately) low rank matrix $\Theta^{\star}$ with a second matrix $\Gamma^{\star}$ endowed with a complementary form of low-dimensional structure; this set-up includes many statistical models of interest, including factor analysis, multi-task regression and robust covariance estimation. We derive a general theorem that bounds the Frobenius norm error for an estimate of the pair $(\Theta^{\star},\Gamma^{\star})$ obtained by solving a convex optimization problem that combines the nuclear norm with a general decomposable regularizer. Our results use a “spikiness” condition that is related to, but milder than, singular vector incoherence. We specialize our general result to two cases that have been studied in past work: low rank plus an entrywise sparse matrix, and low rank plus a columnwise sparse matrix. For both models, our theory yields nonasymptotic Frobenius error bounds for both deterministic and stochastic noise matrices, and applies to matrices $\Theta^{\star}$ that can be exactly or approximately low rank, and matrices $\Gamma^{\star}$ that can be exactly or approximately sparse. Moreover, for the case of stochastic noise matrices and the identity observation operator, we establish matching lower bounds on the minimax error. The sharpness of our nonasymptotic predictions is confirmed by numerical simulations.
Full-text: Access denied (no subscription
detected)
We're sorry, but we are unable to provide
you with the full text of this article because we are not able to identify
you as a subscriber.
If you have a personal subscription to
this journal, then please login. If you are already logged in, then you
may need to update your profile to register your subscription.
Read more about accessing full-text
References
[1] Agarwal, A., Negahban, S. and Wainwright, M. J. (2012). Supplement to “Noisy matrix decomposition via convex relaxation: Optimal rates in high dimensions.”
DOI:10.1214/12-AOS1000SUPP.
[2] Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis, 3rd ed. Wiley, Hoboken, NJ.
[3] Ando, R. K. and Zhang, T. (2005). A framework for learning predictive structures from multiple tasks and unlabeled data. J. Mach. Learn. Res. 6 1817–1853.
[4] Blitzer, J., Foster, D. P. and Kakade, S. M. (2009). Zero-shot domain adaptation: A multi-view approach. Technical report, Toyota Technological Institute at Chicago.
[5] Blitzer, J., Mcdonald, R. and Pereira, F. (2006). Domain adaptation with structural correspondence learning. In EMNLP Conference, Sydney, Australia.
[6] Boyd, S. and Vandenberghe, L. (2004). Convex Optimization. Cambridge Univ. Press, Cambridge.
[7] Candès, E. J., Li, X., Ma, Y. and Wright, J. (2011). Robust principal component analysis? J. ACM 58 Art. 11, 37.
[8] Chandrasekaran, V., Parillo, P. A. and Willsky, A. S. (2010). Latent variable graphical model selection via convex optimization. Technical report, Massachusetts Institute of Technology.
[9] Chandrasekaran, V., Sanghavi, S., Parrilo, P. A. and Willsky, A. S. (2011). Rank-sparsity incoherence for matrix decomposition. SIAM J. Optim. 21 572–596.
[10] Fan, J., Liao, Y. and Mincheva, M. (2012). Large covariance estimation by thresholding principal orthogonal components. Technical report, Princeton Univ. Available at
arXiv:1201.0175v1.
[11] Hsu, D., Kakade, S. M. and Zhang, T. (2011). Robust matrix decomposition with sparse corruptions. IEEE Trans. Inform. Theory 57 7221–7234.
[12] Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist. 29 295–327.
[13] McCoy, M. and Tropp, J. A. (2011). Two proposals for robust PCA using semidefinite programming. Electron. J. Stat. 5 1123–1160.
[14] Negahban, S., Ravikumar, P., Wainwright, M. J. and Yu, B. (2009). A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. In
NIPS Conference, Vancouver, Canada, December 2009. Full length version available at
arXiv:1010.2731v1.
Statist. Sci. To appear.
[15] Negahban, S. and Wainwright, M. J. (2011). Estimation of (near) low-rank matrices with noise and high-dimensional scaling. Ann. Statist. 39 1069–1097.
[16] Negahban, S. and Wainwright, M. J. (2012). Restricted strong convexity and (weighted) matrix completion: Optimal bounds with noise. J. Mach. Learn. Res. 13 1665–1697.
[17] Raskutti, G., Wainwright, M. J. and Yu, B. (2011). Minimax rates of estimation for high-dimensional linear regression over $\ell_q$-balls. IEEE Trans. Inform. Theory 57 6976–6994.
[18] Recht, B., Fazel, M. and Parrilo, P. A. (2010). Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 52 471–501.
[19] Rockafellar, R. T. (1970). Convex Analysis. Princeton Mathematical Series 28. Princeton Univ. Press, Princeton, NJ.
Mathematical Reviews (MathSciNet):
MR274683
[20] Rohde, A. and Tsybakov, A. B. (2011). Estimation of high-dimensional low-rank matrices. Ann. Statist. 39 887–930.
[21] Xu, H., Caramanis, C. and Sanghavi, S. (2010). Robust PCA via outlier pursuit. Technical report, Univ. Texas, Austin. Available at
arXiv:1010.4237.
[22] Yuan, M., Ekici, A., Lu, Z. and Monteiro, R. (2007). Dimension reduction and coefficient estimation in multivariate linear regression. J. R. Stat. Soc. Ser. B Stat. Methodol. 69 329–346.