This paper proposes a method for constructing a sparse estimator of the inverse covariance (concentration) matrix in high-dimensional settings. The estimator is based on a penalized normal likelihood and enforces sparsity through a lasso-type penalty. We establish a rate of convergence in the Frobenius norm as both the data dimension p and the sample size n are allowed to grow, and show that the rate depends explicitly on how sparse the true concentration matrix is. We also show that a correlation-based version of the method exhibits better rates in the operator norm. In addition, we derive a fast iterative algorithm for computing the estimator, which relies on the popular Cholesky decomposition of the inverse but produces a permutation-invariant estimator. The method is compared to other estimators on simulated data and on a real data example of tumor tissue classification using gene expression data.
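To make the phrase "penalized normal likelihood with a lasso-type penalty" concrete, the following is a minimal sketch of how such an objective could be evaluated for a candidate concentration matrix. It is an illustration under stated assumptions, not the algorithm described in the abstract: the function name `penalized_neg_loglik`, the tuning parameter `lam`, and the choice to penalize only off-diagonal entries are assumptions made here, and the Cholesky factorization is used only to check positive definiteness and compute the log-determinant.

```python
import numpy as np


def penalized_neg_loglik(omega, sample_cov, lam):
    """Evaluate tr(omega @ S) - log det(omega) + lam * sum_{i != j} |omega_ij|.

    omega      : candidate concentration (inverse covariance) matrix, p x p
    sample_cov : sample covariance matrix S, p x p
    lam        : non-negative penalty weight (hypothetical name)

    Returns +inf when omega is not positive definite, since the normal
    log-likelihood is undefined there.
    """
    try:
        # Cholesky succeeds only for positive definite matrices.
        chol = np.linalg.cholesky(omega)
    except np.linalg.LinAlgError:
        return float("inf")

    # log det(omega) = 2 * sum(log(diag(L))) when omega = L @ L.T
    log_det = 2.0 * np.log(np.diag(chol)).sum()

    # Assumption: the lasso-type penalty is applied to off-diagonal entries only.
    off_diag = omega - np.diag(np.diag(omega))
    penalty = lam * np.abs(off_diag).sum()

    return float(np.trace(omega @ sample_cov) - log_det + penalty)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal((50, 4))        # n = 50 samples, p = 4 variables
    sample_cov = np.cov(data, rowvar=False)
    candidate = np.eye(4)                      # a fully sparse candidate
    print(penalized_neg_loglik(candidate, sample_cov, lam=0.1))
```

A larger `lam` makes nonzero off-diagonal entries more costly, which is what pushes a minimizer of this kind of objective toward a sparse concentration matrix.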