Annals of Statistics

Optimal rates of convergence for covariance matrix estimation

T. Tony Cai, Cun-Hui Zhang, and Harrison H. Zhou


The covariance matrix plays a central role in multivariate statistical analysis. Significant advances have recently been made in developing both theory and methodology for estimating large covariance matrices. However, a minimax theory has yet to be developed. In this paper we establish the optimal rates of convergence for estimating the covariance matrix under both the operator norm and the Frobenius norm. It is shown that the optimal procedures under the two norms are different, and consequently matrix estimation under the operator norm is fundamentally different from vector estimation. The minimax upper bound is obtained by constructing a special class of tapering estimators and studying their risk properties. A key step in obtaining the optimal rate of convergence is the derivation of the minimax lower bound. The technical analysis requires new ideas that are quite different from those used in more conventional function/sequence estimation problems.
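A tapering estimator down-weights the entries of the sample covariance matrix according to their distance from the diagonal. The following minimal NumPy sketch uses a linear taper of the kind studied in the paper, with full weight inside a half-bandwidth k/2 and linear decay to zero at bandwidth k; the function names and interface are illustrative, not the authors' own:

```python
import numpy as np

def sample_covariance(X):
    """Sample covariance of n observations (rows of X) in p dimensions."""
    return np.cov(X, rowvar=False, bias=True)

def taper_weights(p, k):
    """p-by-p linear taper: weight 1 for |i-j| <= k/2, decaying to 0 at |i-j| >= k."""
    kh = max(k // 2, 1)
    d = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))  # |i - j|
    return (np.clip(k - d, 0, None) - np.clip(kh - d, 0, None)) / kh

def tapering_estimator(X, k):
    """Entrywise product of the taper weights with the sample covariance."""
    return taper_weights(X.shape[1], k) * sample_covariance(X)
```

The bandwidth k trades bias against variance: entries far from the diagonal are shrunk toward zero, which is what yields the faster rate under the operator norm for bandable covariance classes.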

Article information

Ann. Statist., Volume 38, Number 4 (2010), 2118-2144.

First available in Project Euclid: 11 July 2010

Primary: 62H12: Estimation
Secondary: 62F12: Asymptotic properties of estimators; 62G09: Resampling methods

Keywords: covariance matrix; Frobenius norm; minimax lower bound; operator norm; optimal rate of convergence; tapering


Cai, T. Tony; Zhang, Cun-Hui; Zhou, Harrison H. Optimal rates of convergence for covariance matrix estimation. Ann. Statist. 38 (2010), no. 4, 2118--2144. doi:10.1214/09-AOS752.


