The Annals of Statistics

Sparsistency and rates of convergence in large covariance matrix estimation

Clifford Lam and Jianqing Fan

Source: Ann. Statist. Volume 37, Number 6B (2009), 4254-4278.

Abstract

This paper studies the sparsistency and rates of convergence for estimating sparse covariance and precision matrices based on penalized likelihood with nonconvex penalty functions. Here, sparsistency refers to the property that all parameters that are zero are actually estimated as zero with probability tending to one. Depending on the application, sparsity may occur a priori on the covariance matrix, its inverse or its Cholesky decomposition. We study these three sparsity exploration problems under a unified framework with a general penalty function. We show that the rates of convergence for these problems under the Frobenius norm are of order $(s_n \log p_n / n)^{1/2}$, where $s_n$ is the number of nonzero elements, $p_n$ is the size of the covariance matrix and $n$ is the sample size. This explicitly spells out that the contribution of high-dimensionality is merely a logarithmic factor. The conditions on the rate at which the tuning parameter $\lambda_n$ goes to 0 have been made explicit and compared under different penalties. As a result, for the $L_1$-penalty, to guarantee the sparsistency and optimal rate of convergence, the number of nonzero elements should be small: $s_n' = O(p_n)$ at most, among $O(p_n^2)$ parameters, for estimating a sparse covariance or correlation matrix, a sparse precision or inverse correlation matrix, or a sparse Cholesky factor, where $s_n'$ is the number of nonzero off-diagonal entries. On the other hand, using the SCAD or hard-thresholding penalty functions, there is no such restriction.
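
The contrast the abstract draws between the L1 and SCAD penalties can be illustrated with a minimal sketch. It is not code from the paper: the function names are hypothetical, the SCAD derivative is the standard form from Fan and Li (2001), and the default a = 3.7 is the value suggested there. The L1 penalty keeps shrinking large entries at a constant rate, whereas the SCAD derivative vanishes once an entry exceeds a times the tuning parameter. The last helper evaluates the Frobenius-norm rate quoted above for given values of the three quantities.

```python
import math


def l1_derivative(theta, lam):
    """Magnitude of the L1 penalty derivative at theta != 0: always lam."""
    return lam


def scad_derivative(theta, lam, a=3.7):
    """SCAD penalty derivative at |theta| > 0 (standard Fan-Li form).

    Equals lam for |theta| <= lam, decays linearly on (lam, a*lam],
    and is zero beyond a*lam. The default a=3.7 is an assumption
    borrowed from Fan and Li (2001), not from this abstract.
    """
    t = abs(theta)
    if t <= lam:
        return lam
    return max(a * lam - t, 0.0) / (a - 1.0)


def frobenius_rate(s_n, p_n, n):
    """Order of the rate (s_n * log(p_n) / n) ** 0.5 from the abstract."""
    return math.sqrt(s_n * math.log(p_n) / n)


if __name__ == "__main__":
    lam = 0.1  # illustrative tuning parameter
    for theta in (0.05, 0.2, 1.0):
        print(theta, l1_derivative(theta, lam), scad_derivative(theta, lam))
    # Illustrative sizes only: 200 nonzero entries, dimension 100, n = 500.
    print(frobenius_rate(200, 100, 500))
```

For entries well above the tuning parameter the SCAD derivative is zero, so large nonzero entries are left essentially unpenalized, which is consistent with the abstract's remark that SCAD avoids the restriction the L1 penalty requires.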

Primary Subjects: 62F12
Secondary Subjects: 62J07
Keywords: Covariance matrix; high-dimensionality; consistency; nonconcave penalized likelihood; sparsistency; asymptotic normality

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1256303543
Digital Object Identifier: doi:10.1214/09-AOS720
Zentralblatt MATH identifier: 05644272
Mathematical Reviews number (MathSciNet): MR2572459


2010 © Institute of Mathematical Statistics