## Electronic Journal of Statistics

### Nonparametric estimation of low rank matrix valued function

Fan Zhou

#### Abstract

Let $A:[0,1]\rightarrow\mathbb{H}_{m}$ (the space of $m\times m$ Hermitian matrices) be a low rank matrix valued function with entries in the Hölder class $\Sigma(\beta,L)$. The goal of this paper is to study statistical estimation of $A$ based on the regression model $\mathbb{E}(Y_{j}|\tau_{j},X_{j})=\langle A(\tau_{j}),X_{j}\rangle$, where the $\tau_{j}$ are i.i.d. uniformly distributed in $[0,1]$, the $X_{j}$ are i.i.d. matrix completion sampling matrices, and the $Y_{j}$ are independent bounded responses. We propose a new nuclear norm penalized local polynomial estimator and establish an upper bound on its pointwise risk measured in Frobenius norm. We then extend this estimator globally and prove an upper bound on its integrated risk measured in $L_{2}$-norm. We also propose a second new estimator, based on bias-reducing kernels, for the case when $A$ is not necessarily low rank, and establish an upper bound on its risk measured in $L_{\infty}$-norm. We show that all of the obtained rates are minimax optimal up to a logarithmic factor. Finally, we propose an adaptive estimation procedure based on Lepskii’s method and model selection with a data splitting technique; the procedure is computationally efficient and can be easily implemented and parallelized on distributed systems.
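To make the regression model concrete, the following is a minimal sketch in Python with NumPy, not the paper's estimator. It makes several simplifying assumptions that do not come from the abstract: real-valued matrices, sampling matrices $X_{j}=e_{i}e_{k}^{\top}$ so that $\langle A(\tau_{j}),X_{j}\rangle$ is a single entry, a local constant (degree zero) fit in place of a general local polynomial, an Epanechnikov kernel, and a plain proximal gradient solver. The test function `true_entries`, all tuning values (`h`, `lam`, `n_iter`), and every function name are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * np.maximum(1.0 - u**2, 0.0)

def svt(M, thresh):
    """Soft-threshold singular values: the prox of thresh * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - thresh, 0.0)) @ Vt

def true_entries(t, i, k, m, r=2):
    """Entry (i, k) of a hypothetical smooth rank-r test function A(t)."""
    q = np.arange(1, r + 1)[:, None]
    return np.mean(np.cos(q * t + q * i / m) * np.cos(q * t + q * k / m), axis=0)

def local_nuclear_estimate(tau, rows, cols, y, m, t0, h, lam, n_iter=300):
    """Kernel-weighted least squares at t0 with a nuclear norm penalty.

    Minimizes sum_j w_j (y_j - S[rows_j, cols_j])^2 + lam * ||S||_*,
    where w_j = K((tau_j - t0) / h) / (n h), by proximal gradient descent.
    """
    n = len(y)
    w = epanechnikov((tau - t0) / h) / (n * h)
    W = np.zeros((m, m))            # total kernel weight per entry
    B = np.zeros((m, m))            # kernel-weighted response sum per entry
    np.add.at(W, (rows, cols), w)
    np.add.at(B, (rows, cols), w * y)
    step = 1.0 / (2.0 * W.max())    # 1 / Lipschitz constant of the gradient
    S = np.zeros((m, m))
    for _ in range(n_iter):
        grad = 2.0 * (W * S - B)    # gradient of the weighted squared loss
        S = svt(S - step * grad, step * lam)
    return S

# Simulate the model: tau uniform on [0, 1], uniformly sampled entries,
# and bounded responses (signal plus uniform noise on [-0.5, 0.5]).
rng = np.random.default_rng(0)
m, n, t0 = 10, 20000, 0.5
tau = rng.uniform(0.0, 1.0, size=n)
rows = rng.integers(0, m, size=n)
cols = rng.integers(0, m, size=n)
y = true_entries(tau, rows, cols, m) + rng.uniform(-0.5, 0.5, size=n)

S_hat = local_nuclear_estimate(tau, rows, cols, y, m, t0, h=0.2, lam=0.005)

# Compare with the true matrix A(t0) in relative Frobenius norm.
ii, kk = np.divmod(np.arange(m * m), m)
A_t0 = true_entries(t0, ii, kk, m).reshape(m, m)
rel_err = np.linalg.norm(S_hat - A_t0) / np.linalg.norm(A_t0)
print(f"relative Frobenius error at t0 = {t0}: {rel_err:.3f}")
```

In this sketch the bandwidth `h` trades bias against variance in the usual kernel-smoothing way, while `lam` controls how strongly small singular values are shrunk toward zero. Both are fixed by hand here; the adaptive procedure described in the abstract is not reproduced.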

#### Article information

Source
Electron. J. Statist., Volume 13, Number 2 (2019), 3851-3892.

Dates
First available in Project Euclid: 3 October 2019

https://projecteuclid.org/euclid.ejs/1570068045

Digital Object Identifier
doi:10.1214/19-EJS1582

Subjects
Primary: 62G05: Estimation; 62G08: Nonparametric regression
Secondary: 62H12: Estimation

#### Citation

Zhou, Fan. Nonparametric estimation of low rank matrix valued function. Electron. J. Statist. 13 (2019), no. 2, 3851–3892. doi:10.1214/19-EJS1582. https://projecteuclid.org/euclid.ejs/1570068045
