A Bayesian approach for noisy matrix completion: Optimal rate under general sampling distribution

The Tien Mai; Pierre Alquier

doi:10.1214/15-EJS1020

Electron. J. Statist. 9(1): 823-841 (2015). DOI: 10.1214/15-EJS1020

Abstract

Bayesian methods for low-rank matrix completion with noise have been shown to be very efficient computationally [3, 18, 19, 24, 28]. While the behaviour of penalized minimization methods in this problem is well understood from both the theoretical and computational points of view (see [7, 9, 16, 23] among others), the theoretical optimality of Bayesian estimators has not yet been explored. In this paper, we propose a Bayesian estimator for matrix completion under a general sampling distribution. We also provide an oracle inequality for this estimator. This inequality proves that, whatever the rank of the matrix to be estimated, our estimator reaches the minimax-optimal rate of convergence (up to a logarithmic factor). We end the paper with a short simulation study.

References

[1] Alquier, P., Bayesian methods for low-rank matrix estimation: Short survey and theoretical study. In Algorithmic Learning Theory 2013, pages 309–323. Springer, 2013. MR3133074 10.1007/978-3-642-40935-6_22

[2] Alquier, P. and Biau, G., Sparse single-index model. The Journal of Machine Learning Research, 14(1):243–280, 2013. MR3033331 06276236

[3] Alquier, P., Cottet, V., Chopin, N., and Rousseau, J., Bayesian matrix completion: Prior specification. arXiv:1406.1440, 2014.

[4] Alquier, P. and Lounici, K., PAC-Bayesian bounds for sparse regression estimation with exponential weights. Electronic Journal of Statistics, 5:127–145, 2011. MR2786484 1274.62463 10.1214/11-EJS601 euclid.ejs/1300108317

[5] Bennett, J. and Lanning, S., The Netflix prize. In Proceedings of KDD Cup and Workshop, volume 2007, page 35, 2007.

[6] Boucheron, S., Lugosi, G., and Massart, P., Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, 2013. MR3185193

[7] Candès, E. J. and Plan, Y., Matrix completion with noise. Proceedings of the IEEE, 98(6):925–936, 2010.

[8] Candès, E. J. and Recht, B., Exact matrix completion via convex optimization. Found. Comput. Math., 9(6):717–772, 2009. MR2565240 10.1007/s10208-009-9045-5

[9] Candès, E. J. and Tao, T., The power of convex relaxation: Near-optimal matrix completion. IEEE Trans. Inform. Theory, 56(5):2053–2080, 2010. MR2723472 10.1109/TIT.2010.2044061

[10] Catoni, O., A PAC-Bayesian Approach to Adaptive Classification. Preprint Laboratoire de Probabilités et Modèles Aléatoires PMA-840, 2003.

[11] Catoni, O., Statistical Learning Theory and Stochastic Optimization. Saint-Flour Summer School on Probability Theory 2001 (Jean Picard ed.), Lecture Notes in Mathematics. Springer, 2004. MR2163920 1076.93002

[12] Catoni, O., PAC-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning. Institute of Mathematical Statistics Lecture Notes—Monograph Series, 56. Institute of Mathematical Statistics, Beachwood, OH, 2007. MR2483528 1277.62015

[13] Dalalyan, A. and Tsybakov, A. B., Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity. Machine Learning, 72(1–2):39–61, 2008.

[14] Foygel, R., Shamir, O., Srebro, N., and Salakhutdinov, R., Learning with the weighted trace-norm under arbitrary sampling distributions. In Advances in Neural Information Processing Systems, pages 2133–2141, 2011.

[15] Klopp, O., Noisy low-rank matrix completion with general sampling distribution. Bernoulli, 20(1):282–303, 2014. MR3160583 10.3150/12-BEJ486 euclid.bj/1390407290

[16] Koltchinskii, V., Lounici, K., and Tsybakov, A. B., Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. The Annals of Statistics, 39(5):2302–2329, 2011. MR2906869 1231.62097 10.1214/11-AOS894 euclid.aos/1322663459

[17] Kotecha, J. H. and Djuric, P. M., Gibbs sampling approach for generation of truncated multivariate Gaussian random variables. Proceedings of the IEEE Conference on Acoustics, Speech, and Signal Processing, 3:1757–1760, 1999.

[18] Lawrence, N. D. and Urtasun, R., Non-linear matrix factorization with Gaussian processes. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 601–608. ACM, 2009.

[19] Lim, Y. J. and Teh, Y. W., Variational Bayesian approach to movie rating prediction. In Proceedings of KDD Cup and Workshop, volume 7, pages 15–21, 2007.

[20] Massart, P., Concentration Inequalities and Model Selection, volume 1896 of Lecture Notes in Mathematics. Springer, Berlin, 2007. Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23, 2003, Edited by Jean Picard. MR2319879 06223348

[21] McAllester, D., Some PAC-Bayesian theorems. In Proceedings of the Eleventh Annual Conference on Computational Learning Theory, pages 230–234, New York, 1998. ACM. MR1811587 0945.68157 10.1145/279943.279989

[22] Negahban, S. and Wainwright, M. J., Restricted strong convexity and weighted matrix completion: Optimal bounds with noise. The Journal of Machine Learning Research, 13(1):1665–1697, 2012. MR2930649 06276162

[23] Recht, B. and Ré, C., Parallel stochastic gradient algorithms for large-scale matrix completion. Mathematical Programming Computation, 5(2):201–226, 2013. MR3069879 1275.90039 10.1007/s12532-013-0053-8

[24] Salakhutdinov, R. and Mnih, A., Bayesian probabilistic matrix factorization using Markov Chain Monte Carlo. In Proceedings of the 25th International Conference on Machine Learning, pages 880–887. ACM, 2008.

[25] Shawe-Taylor, J. and Williamson, R., A PAC analysis of a Bayes estimator. In Proceedings of the Tenth Annual Conference on Computational Learning Theory, pages 2–9, New York, 1997. ACM.

[26] Suzuki, T., Convergence rate of Bayesian tensor estimation: optimal rate without restricted strong convexity. arXiv:1408.3092.

[27] Wilhelm, S., Package “tmvtnorm”, http://cran.r-project.org/web/packages/tmvtnorm/.

[28] Zhou, M., Wang, C., Chen, M., Paisley, J., Dunson, D., and Carin, L., Nonparametric Bayesian matrix completion. Proc. IEEE SAM, 2010.

Citation

The Tien Mai and Pierre Alquier "A Bayesian approach for noisy matrix completion: Optimal rate under general sampling distribution," Electronic Journal of Statistics 9(1), 823-841, (2015). https://doi.org/10.1214/15-EJS1020

Published: 2015
