Electronic Journal of Statistics

High dimensional sparse covariance estimation via directed acyclic graphs

Philipp Rütimann and Peter Bühlmann

Abstract

We present a graph-based technique for estimating sparse covariance matrices and their inverses from high-dimensional data. The method is based on learning a directed acyclic graph (DAG) and then estimating the parameters of a multivariate Gaussian distribution that factorizes according to that DAG. For inferring the underlying DAG we use the PC-algorithm [27], and for estimating the DAG-based covariance matrix and its inverse we use a Cholesky decomposition approach, which yields a positive (semi-)definite sparse estimate. We present a consistency result in the high-dimensional framework and compare our method with the graphical Lasso (Glasso) [12, 8, 2] on simulated and real data.
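
The two-step idea in the abstract (PC-algorithm for the graph, then a Cholesky-type parameter fit) can be sketched with the pcalg R package mentioned in reference [17]. The code below is an illustrative sketch, not the authors' exact estimator: the data matrix X and the significance level alpha = 0.01 are placeholders, only the directed edges of the estimated CPDAG are used as parent sets, and the node-wise regressions reassemble the inverse covariance via the modified Cholesky factorization Omega = (I - B)' D^{-1} (I - B), which is positive (semi-)definite by construction.

  ## Illustrative sketch in R, assuming the pcalg package [17] is installed.
  library(pcalg)

  set.seed(1)
  n <- 200; p <- 5
  X <- matrix(rnorm(n * p), n, p)            # placeholder data; substitute your own

  ## Step 1: estimate the equivalence class (CPDAG) with the PC-algorithm
  suffStat <- list(C = cor(X), n = n)
  pc.fit   <- pc(suffStat, indepTest = gaussCItest, alpha = 0.01, p = p)
  amat     <- as(pc.fit@graph, "matrix")     # amat[i, j] != 0 means an edge i -> j

  ## Step 2: regress each node on its directed parents and reassemble
  ## Omega = (I - B)' D^{-1} (I - B), a sparse, positive (semi-)definite estimate
  B <- matrix(0, p, p)
  d <- numeric(p)
  for (j in 1:p) {
    pa <- which(amat[, j] != 0 & amat[j, ] == 0)   # parents: directed edges into j only
    if (length(pa) > 0) {
      fit      <- lm(X[, j] ~ X[, pa])
      B[j, pa] <- coef(fit)[-1]
      d[j]     <- summary(fit)$sigma^2             # residual variance of node j
    } else {
      d[j] <- var(X[, j])                          # no parents: marginal variance
    }
  }
  Omega.hat <- t(diag(p) - B) %*% diag(1 / d) %*% (diag(p) - B)   # inverse covariance
  Sigma.hat <- solve(Omega.hat)                                    # covariance

In this sketch, alpha plays the role of the tuning parameter of the PC-algorithm, and the sparsity of the final estimate is inherited from the estimated graph.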

Article information

Source
Electron. J. Statist., Volume 3 (2009), 1133-1160.

Dates
First available in Project Euclid: 1 December 2009

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1259677088

Digital Object Identifier
doi:10.1214/09-EJS534

Mathematical Reviews number (MathSciNet)
MR2566184

Zentralblatt MATH identifier
1326.62124

Subjects
Primary: 62H12: Estimation
Secondary: 62F12: Asymptotic properties of estimators

Keywords
Concentration matrix; covariance matrix; directed acyclic graphs; graphical Lasso; high-dimensional data; PC-algorithm

Citation

Rütimann, Philipp; Bühlmann, Peter. High dimensional sparse covariance estimation via directed acyclic graphs. Electron. J. Statist. 3 (2009), 1133--1160. doi:10.1214/09-EJS534. https://projecteuclid.org/euclid.ejs/1259677088


References

  • [1] Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analysis. Wiley, NY.
  • [2] Banerjee, O., Ghaoui, L. E. and d’Aspremont, A. (2008). Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. Journal of Machine Learning Research 9 485–516.
  • [3] Bickel, P. J. and Levina, E. (2004). Some theory for Fisher’s linear discriminant function, “naive Bayes”, and some alternatives when there are many more variables than observations. Bernoulli 10 989–1010.
  • [4] Bickel, P. J. and Levina, E. (2008). Covariance regularization by thresholding. The Annals of Statistics 36 2577–2604.
  • [5] Bickel, P. J. and Levina, E. (2008). Regularized estimation of large covariance matrices. The Annals of Statistics 36 199–227.
  • [6] Chaudhuri, S., Drton, M. and Richardson, T. S. (2007). Estimation of a covariance matrix with zeros. Biometrika 94 1–18.
  • [7] Cox, D. R. and Wermuth, N. (1996). Multivariate Dependencies, First ed. Monographs on Statistics and Applied Probability. Chapman and Hall.
  • [8] d’Aspremont, A., Banerjee, O. and Ghaoui, L. E. (2008). First-order methods for sparse covariance selection. SIAM Journal on Matrix Analysis and Applications 30 56–66.
  • [9] Deng, X. and Yuan, M. (2009). Large Gaussian covariance matrix estimation with Markov structures. Journal of Computational and Graphical Statistics 18 640–657.
  • [10] Drton, M. and Richardson, T. S. (2004). Iterative conditional fitting for Gaussian ancestral graph models. In AUAI ’04: Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence 130–137. AUAI Press, Arlington, Virginia, United States.
  • [11] Fox, J. (1997). Applied Regression Analysis, Linear Models, and Related Methods. Sage Publications.
  • [12] Friedman, J., Hastie, T. and Tibshirani, R. (2007). Sparse inverse covariance estimation with the graphical Lasso. Biostatistics 9 432–441.
  • [13] Furrer, R. and Bengtsson, T. (2007). Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants. Journal of Multivariate Analysis 98 227–255.
  • [14] Huang, J. Z., Liu, N., Pourahmadi, M. and Liu, L. (2006). Covariance matrix selection and estimation via penalised normal likelihood. Biometrika 93 85–98.
  • [15] Kalisch, M. and Bühlmann, P. (2007). Estimating high-dimensional directed acyclic graphs with the PC-algorithm. Journal of Machine Learning Research 8 613–636.
  • [16] Kalisch, M. and Bühlmann, P. (2008). Robustification of the PC-algorithm for directed acyclic graphs. Journal of Computational and Graphical Statistics 17 773–789.
  • [17] Kalisch, M. and Mächler, M. Estimating the skeleton and equivalence class of a DAG. Manual to the R package pcalg.
  • [18] Lauritzen, S. L. (1996). Graphical Models. Oxford Statistical Science Series, 17. Oxford Clarendon Press.
  • [19] Levina, E., Rothman, A. and Zhu, J. (2008). Sparse estimation of large covariance matrices via a nested Lasso penalty. The Annals of Applied Statistics 2 245–263.
  • [20] Maathuis, M. H., Kalisch, M. and Bühlmann, P. (2009). Estimating high-dimensional intervention effects from observational data. The Annals of Statistics 37 3133–3164.
  • [21] Marchetti, G. M. (2006). Independencies induced from a graphical Markov model after marginalization and conditioning: The R package ggm. Journal of Statistical Software 15 1–15.
  • [22] Maronna, R. A. and Zamar, R. H. (2002). Robust estimates of location and dispersion for high-dimensional datasets. Technometrics 44 307–317.
  • [23] Meek, C. (1995). Causal inference and causal explanation with background knowledge. In Proceedings of the 11th Annual Conference on Uncertainty in Artificial Intelligence (UAI-95) 403–441. Morgan Kaufmann, San Francisco, CA.
  • [24] Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. The Annals of Statistics 34 1436–1462.
  • [25] Pearl, J. (2008). Causality: Models, Reasoning, and Inference. Cambridge University Press, NY.
  • [26] Rothman, A. J., Bickel, P. J., Levina, E. and Zhu, J. (2008). Sparse permutation invariant covariance estimation. Electronic Journal of Statistics 2 494–515.
  • [27] Spirtes, P., Glymour, C. and Scheines, R. (2000). Causation, Prediction, and Search, 2nd ed. The MIT Press, Cambridge, Massachusetts, London, England.
  • [28] van der Vaart, A. W. and Wellner, J. A. (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. Springer Series in Statistics. Springer-Verlag, New York.
  • [29] West, M., Blanchette, C., Dressman, H., Huang, E., Ishida, S., Spang, R., Zuzan, H., Olson, J. A., Jr., Marks, J. and Nevins, J. (2001). Predicting the clinical status of human breast cancer by using gene expression profiles. PNAS 98 11462–11467.
  • [30] Wille, A., Zimmermann, P., Vranova, E., Fürholz, A., Laule, O., Bleuler, S., Hennig, L., Prelic, A., von Rohr, P., Thiele, L., Zitzler, E., Gruissem, W. and Bühlmann, P. (2004). Sparse graphical Gaussian modeling of the isoprenoid gene network in Arabidopsis thaliana. Genome Biology 5 R92.
  • [31] Wu, W. B. and Pourahmadi, M. (2003). Nonparametric estimation of large covariance matrices of longitudinal data. Biometrika 90 831–844.