The Annals of Statistics

Optimal selection of reduced rank estimators of high-dimensional matrices

Florentina Bunea, Yiyuan She, and Marten H. Wegkamp


Abstract

We introduce a new criterion, the Rank Selection Criterion (RSC), for selecting the optimal reduced rank estimator of the coefficient matrix in multivariate response regression models. The corresponding RSC estimator minimizes the Frobenius norm of the fit plus a regularization term proportional to the number of parameters in the reduced rank model.
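As a rough illustration of the criterion described above, the following sketch selects a rank by direct search over the candidate ranks. It assumes the criterion takes the form of the squared Frobenius norm of the residual plus mu times the rank, with mu a user-supplied tuning parameter; the function name `rsc_select`, the simulated data, and the value `mu=2.0` are hypothetical choices made for this example and are not taken from the paper. The rank-k fits are built with the classical reduced rank regression construction (truncating the SVD of the least squares fitted values).

```python
import numpy as np


def rsc_select(X, Y, mu):
    """Return (k_hat, B_hat) minimizing ||Y - X B_k||_F^2 + mu * k over k.

    X is m x p (observations x predictors), Y is m x n (observations x
    responses). mu is a user-chosen tuning parameter (an assumption here).
    """
    B_ols = np.linalg.pinv(X) @ Y        # least squares coefficients, p x n
    fitted = X @ B_ols                   # projection of Y onto the column space of X
    # Right singular vectors of the fitted values define the rank-k fits.
    _, _, Vt = np.linalg.svd(fitted, full_matrices=False)
    max_rank = min(np.linalg.matrix_rank(X), Y.shape[1])

    best_crit, k_hat, B_hat = np.inf, 0, np.zeros_like(B_ols)
    for k in range(max_rank + 1):        # one pass over the candidate ranks
        V_k = Vt[:k].T                   # n x k
        B_k = B_ols @ V_k @ V_k.T        # rank-k reduced rank estimator
        crit = np.linalg.norm(Y - X @ B_k, "fro") ** 2 + mu * k
        if crit < best_crit:
            best_crit, k_hat, B_hat = crit, k, B_k
    return k_hat, B_hat


# Hypothetical simulated example: rank-2 coefficient matrix, small noise.
rng = np.random.default_rng(0)
m, p, n, true_rank = 100, 10, 8, 2
X = rng.standard_normal((m, p))
A = rng.standard_normal((p, true_rank)) @ rng.standard_normal((true_rank, n))
Y = X @ A + 0.1 * rng.standard_normal((m, n))
k_hat, B_hat = rsc_select(X, Y, mu=2.0)
```

Under this assumed form, adding the k-th component lowers the residual sum of squares by the square of the k-th singular value of the fitted values, so a component is kept only when that squared singular value exceeds mu.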

The rank of the RSC estimator provides a consistent estimator of the rank of the coefficient matrix; in general, the rank of our estimator is a consistent estimate of the effective rank, which we define to be the number of singular values of the target matrix that are appropriately large. The consistency results are valid not only in the classic asymptotic regime, when n, the number of responses, and p, the number of predictors, stay bounded, and m, the number of observations, grows, but also when either, or both, n and p grow, possibly much faster than m.

We establish minimax optimal bounds on the mean squared errors of our estimators. Our finite sample performance bounds for the RSC estimator show that it achieves the optimal balance between the approximation error and the penalty term.

Furthermore, our procedure has very low computational complexity, linear in the number of candidate models, making it particularly appealing for large scale problems. We contrast our estimator with the nuclear norm penalized least squares (NNP) estimator, which has an inherently higher computational complexity than RSC, for multivariate regression models. We show that NNP has estimation properties similar to those of RSC, albeit under stronger conditions. However, it is not as parsimonious as RSC. We offer a simple correction of the NNP estimator which leads to consistent rank estimation.
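For comparison, a nuclear norm penalized least squares fit can be sketched with a standard proximal gradient scheme that soft-thresholds singular values at every step. This is one common way to compute such an estimator and is not necessarily the algorithm used in the paper; the objective scaling (one half of the squared Frobenius norm plus tau times the nuclear norm), the function names, and the iteration count are assumptions of this example. The correction of the NNP estimator mentioned above is not reproduced here.

```python
import numpy as np


def soft_threshold_singular_values(M, t):
    """Shrink each singular value of M toward zero by t (floored at zero)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt


def nnp_fit(X, Y, tau, n_iter=500):
    """Approximately minimize 0.5 * ||Y - X B||_F^2 + tau * ||B||_*."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    B = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(n_iter):                  # each iteration costs one SVD
        grad = X.T @ (X @ B - Y)
        B = soft_threshold_singular_values(B - step * grad, step * tau)
    return B


# Hypothetical simulated example, same dimensions as a small regression problem.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 10))
Y = X @ rng.standard_normal((10, 8)) + 0.1 * rng.standard_normal((100, 8))

tau_max = np.linalg.norm(X.T @ Y, 2)         # largest singular value of X'Y
B_zero = nnp_fit(X, Y, tau=1.01 * tau_max)   # penalty large enough to zero out B
B_ls = nnp_fit(X, Y, tau=0.0)                # no penalty: plain least squares
```

Each iteration of this scheme requires a full SVD, which gives a sense of why an iterative nuclear norm fit can cost more than a single pass over candidate ranks.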

We verify and illustrate our theoretical findings via an extensive simulation study.

Article information

Source
Ann. Statist., Volume 39, Number 2 (2011), 1282-1309.

Dates
First available in Project Euclid: 9 May 2011

Permanent link to this document
https://projecteuclid.org/euclid.aos/1304947051

Digital Object Identifier
doi:10.1214/11-AOS876

Mathematical Reviews number (MathSciNet)
MR2816355

Zentralblatt MATH identifier
1216.62086

Subjects
Primary: 62H15 (Hypothesis testing); 62J07 (Ridge regression; shrinkage estimators)

Keywords
Multivariate response regression; reduced rank estimators; dimension reduction; rank selection; adaptive estimation; oracle inequalities; nuclear norm; low rank matrix approximation

Citation

Bunea, Florentina; She, Yiyuan; Wegkamp, Marten H. Optimal selection of reduced rank estimators of high-dimensional matrices. Ann. Statist. 39 (2011), no. 2, 1282–1309. doi:10.1214/11-AOS876. https://projecteuclid.org/euclid.aos/1304947051


