Annals of Statistics (Ann. Statist.)
Volume 36, Number 6 (2008), 2577-2604.
Covariance regularization by thresholding
Peter J. Bickel and Elizaveta Levina
Full-text: Open access
Abstract
This paper considers regularizing a covariance matrix of p variables estimated from n observations, by hard thresholding. We show that the thresholded estimate is consistent in the operator norm as long as the true covariance matrix is sparse in a suitable sense, the variables are Gaussian or sub-Gaussian, and (log p)/n→0, and obtain explicit rates. The results are uniform over families of covariance matrices which satisfy a fairly natural notion of sparsity. We discuss an intuitive resampling scheme for threshold selection and prove a general cross-validation result that justifies this approach. We also compare thresholding to other covariance estimators in simulations and on an example from climate data.
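As a rough illustration of the estimator described above (a minimal Python/NumPy sketch, not the authors' code), the snippet below hard-thresholds the off-diagonal entries of the sample covariance matrix at a level proportional to sqrt(log p / n) and tunes the constant by repeated random splits of the data, loosely in the spirit of the resampling scheme mentioned in the abstract. The function names, the Frobenius-norm split criterion, the candidate-threshold grid, and the choice to leave the diagonal untouched are illustrative assumptions, not details taken from the paper.

import numpy as np


def sample_cov(X):
    # Sample covariance of an n x p data matrix (rows = observations).
    return np.cov(X, rowvar=False, bias=True)


def hard_threshold(S, t):
    # Zero out entries with |S_ij| <= t; keep the diagonal as-is (illustrative choice).
    T = np.where(np.abs(S) > t, S, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T


def select_threshold(X, n_splits=20, n_grid=25, seed=0):
    # Choose a threshold by repeated random splits: threshold the covariance of one
    # half and score it against the raw sample covariance of the other half
    # (Frobenius loss). An illustrative stand-in for the paper's resampling scheme.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Candidate thresholds on the scale sqrt(log p / n), matching the theoretical rate.
    grid = np.linspace(0.0, 2.0, n_grid) * np.sqrt(np.log(p) / n)
    risk = np.zeros(n_grid)
    for _ in range(n_splits):
        idx = rng.permutation(n)
        a, b = idx[: n // 2], idx[n // 2:]
        S_a, S_b = sample_cov(X[a]), sample_cov(X[b])
        for j, t in enumerate(grid):
            risk[j] += np.linalg.norm(hard_threshold(S_a, t) - S_b, "fro") ** 2
    return grid[np.argmin(risk)]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, p = 100, 200
    # A sparse "true" covariance: identity plus one off-diagonal band.
    Sigma = np.eye(p) + 0.4 * np.eye(p, k=1) + 0.4 * np.eye(p, k=-1)
    X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    t_hat = select_threshold(X)
    Sigma_hat = hard_threshold(sample_cov(X), t_hat)
    err = np.linalg.norm(Sigma_hat - Sigma, 2)  # operator (spectral) norm error
    print(f"selected threshold: {t_hat:.3f}, operator-norm error: {err:.3f}")

The threshold level of order sqrt(log p / n) mirrors the regime (log p)/n → 0 stated in the abstract; the split-based selection above only tunes the constant in front of that rate and is not the paper's exact cross-validation procedure.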
Article information
Source
Ann. Statist., Volume 36, Number 6 (2008), 2577-2604.
Dates
First available in Project Euclid: 5 January 2009
Permanent link to this document
https://projecteuclid.org/euclid.aos/1231165180
Digital Object Identifier
doi:10.1214/08-AOS600
Mathematical Reviews number (MathSciNet)
MR2485008
Zentralblatt MATH identifier
1196.62062
Subjects
Primary: 62H12: Estimation
Secondary: 62F12: Asymptotic properties of estimators; 62G09: Resampling methods
Keywords
Covariance estimation; regularization; sparsity; thresholding; large p, small n; high dimension, low sample size
Citation
Bickel, Peter J.; Levina, Elizaveta. Covariance regularization by thresholding. Ann. Statist. 36 (2008), no. 6, 2577--2604. doi:10.1214/08-AOS600. https://projecteuclid.org/euclid.aos/1231165180
