We present a graph-based technique for estimating sparse covariance matrices and their inverses from high-dimensional data. The method first learns a directed acyclic graph (DAG) and then estimates the parameters of a multivariate Gaussian distribution based on that DAG. To infer the underlying DAG we use the PC-algorithm [27]; to estimate the DAG-based covariance matrix and its inverse we use a Cholesky decomposition approach, which provides a positive (semi-)definite sparse estimate. We present a consistency result in the high-dimensional framework and compare our method with the Glasso [12, 8, 2] on simulated and real data.
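The second step, estimating a covariance matrix and its inverse from a given DAG, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the DAG has already been learned (the PC-algorithm step is skipped), that each node is regressed on its parents by ordinary least squares, and that the resulting coefficient matrix B and residual variances D are combined as Σ⁻¹ = (I − B)ᵀ D⁻¹ (I − B). The function name `dag_covariance` and the `parents` dictionary format are hypothetical choices made for illustration.

```python
import numpy as np

def dag_covariance(X, parents):
    """Sketch: covariance and precision estimates from data X (n x p) and a given DAG.

    parents[j] lists the parent indices of node j (hypothetical input format).
    Assumes n exceeds the number of parents of every node, so residual
    variances are strictly positive.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)          # center each variable
    B = np.zeros((p, p))             # B[j, k]: coefficient of parent k when regressing node j
    d = np.zeros(p)                  # residual variance of each node given its parents
    for j in range(p):
        pa = list(parents[j])
        if pa:
            coef, *_ = np.linalg.lstsq(Xc[:, pa], Xc[:, j], rcond=None)
            B[j, pa] = coef
            resid = Xc[:, j] - Xc[:, pa] @ coef
        else:
            resid = Xc[:, j]
        d[j] = resid @ resid / n
    A = np.eye(p) - B                # structural model: A x = noise, Var(noise) = diag(d)
    precision = A.T @ np.diag(1.0 / d) @ A
    A_inv = np.linalg.inv(A)         # invertible because the graph is acyclic
    covariance = A_inv @ np.diag(d) @ A_inv.T
    return covariance, precision

# Toy example: data simulated from the chain 0 -> 1 -> 2.
rng = np.random.default_rng(0)
n = 500
x0 = rng.normal(size=n)
x1 = 0.8 * x0 + rng.normal(size=n)
x2 = -0.5 * x1 + rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
parents = {0: [], 1: [0], 2: [1]}
cov, prec = dag_covariance(X, parents)
```

In this toy chain, nodes 0 and 2 are not adjacent and share no child, so the corresponding entry of the estimated inverse is zero by construction, which is how the DAG induces sparsity in the inverse covariance matrix.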
References
[1] Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analysis. Wiley, NY.
[2] Banerjee, O., Ghaoui, L. E. and d’Aspremont, A. (2008). Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. Journal of Machine Learning Research 9 485–516.
[3] Bickel, P. J. and Levina, E. (2004). Some theory for Fisher’s linear discriminant function, “naive Bayes”, and some alternatives when there are many more variables than observations. Bernoulli 10 989–1010.
[4] Bickel, P. J. and Levina, E. (2008). Covariance regularization by thresholding. The Annals of Statistics 36 2577–2604.
[5] Bickel, P. J. and Levina, E. (2008). Regularized estimation of large covariance matrices. The Annals of Statistics 36 199–227.
[6] Chaudhuri, S., Drton, M. and Richardson, T. S. (2007). Estimation of a covariance matrix with zeros. Biometrika 94 1–18.
[7] Cox, D. R. and Wermuth, N. (1996). Multivariate Dependencies, First ed. Monographs on Statistics and Applied Probability. Chapman and Hall.
[8] d’Aspremont, A., Banerjee, O. and Ghaoui, L. E. (2008). First-order methods for sparse covariance selection. SIAM Journal on Matrix Analysis and Applications 30 56–66.
[9] Deng, X. and Yuan, M. (2009). Large Gaussian covariance matrix estimation with Markov structures. Journal of Computational and Graphical Statistics 18 640–657.
[10] Drton, M. and Richardson, T. S. (2004). Iterative conditional fitting for Gaussian ancestral graph models. In AUAI ’04: Proceedings of the 20th conference on Uncertainty in artificial intelligence 130–137. AUAI Press, Arlington, Virginia, United States.
[11] Fox, J. (1997). Applied Regression Analysis, Linear Models, and Related Methods. Sage Publications.
[12] Friedman, J., Hastie, T. and Tibshirani, R. (2007). Sparse inverse covariance estimation with the graphical Lasso. Biostatistics 9 432–441.
[13] Furrer, R. and Bengtsson, T. (2007). Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants. Journal of Multivariate Analysis 98 227–255.
[14] Huang, J. Z., Liu, N., Pourahmadi, M. and Liu, L. (2006). Covariance matrix selection and estimation via penalised normal likelihood. Biometrika 93 85–98.
[15] Kalisch, M. and Bühlmann, P. (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm. Journal of Machine Learning Research 8 613–636.
[16] Kalisch, M. and Bühlmann, P. (2008). Robustification of the PC-algorithm for directed acyclic graphs. Journal of Computational and Graphical Statistics 17 773–789.
[17] Kalisch, M. and Mächler, M. Estimating the skeleton and equivalence class of a DAG. Manual to the R-package pcalg.
[18] Lauritzen, S. L. (1996). Graphical Models. Oxford Statistical Science Series, 17. Oxford Clarendon Press.
[19] Levina, E., Rothman, A. and Zhu, J. (2008). Sparse estimation of large covariance matrices via a nested Lasso penalty. The Annals of Applied Statistics 2 245–263.
[20] Maathuis, M. H., Kalisch, M. and Bühlmann, P. (2009). Estimating high-dimensional intervention effects from observational data. The Annals of Statistics 37 3133–3164.
[21] Marchetti, G. M. (2006). Independencies induced from a graphical Markov model after marginalization and conditioning: The R-package ggm. Journal of Statistical Software 15 1–15.
[22] Maronna, R. A. and Zamar, R. H. (2002). Robust estimates of location and dispersion for high-dimensional datasets. Technometrics 44 307–317.
[23] Meek, C. (1995). Causal Inference and Causal Explanation with Background Knowledge. In Proceedings of the 11th Annual Conference on Uncertainty in Artificial Intelligence (UAI-95) 403–441. Morgan Kaufmann, San Francisco, CA.
[24] Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. The Annals of Statistics 34 1436–1462.
[25] Pearl, J. (2008). Causality - Models, Reasoning, and Inference. Cambridge University Press, NY.
[26] Rothman, A. J., Bickel, P. J., Levina, E. and Zhu, J. (2008). Sparse permutation invariant covariance estimation. Electronic Journal of Statistics 2 494–515.
[27] Spirtes, P., Glymour, C. and Scheines, R. (2000). Causation, Prediction, and Search, 2nd ed. The MIT Press, Cambridge, Massachusetts, London, England.
[28] van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes; With Applications to Statistics. Springer Series in Statistics. Springer-Verlag, New York.
[29] West, M., Blanchette, C., Dressman, H., Huang, E., Ishida, S., Spang, R., Zuzan, H., Jr., J. O., Marks, J. and Nevins, J. (2001). Predicting the clinical status of human breast cancer by using gene expression profiles. PNAS 98 11462–11467.
[30] Wille, A., Zimmermann, P., Vranova, E., Fürholz, A., Laule, O., Bleuler, S., Hennig, L., Prelic, A., von Rohr, P., Thiele, L., Zitzler, E., Gruissem, W. and Bühlmann, P. (2004). Sparse graphical Gaussian modeling of the isoprenoid gene network in Arabidopsis thaliana. Genome Biology 5 R92.
[31] Wu, W. B. and Pourahmadi, M. (2003). Nonparametric estimation of large covariance matrices of longitudinal data. Biometrika 90 831–844.