The Annals of Statistics

Gemini: Graph estimation with matrix variate normal instances

Shuheng Zhou

Full-text: Open access


Undirected graphs can be used to describe matrix variate distributions. In this paper, we develop new methods for estimating the graphical structures and underlying parameters, namely, the row and column covariance and inverse covariance matrices, from matrix variate data. Under sparsity conditions, we show that one is able to recover the graphs and covariance matrices with a single random matrix from the matrix variate normal distribution. Our method extends, with suitable adaptation, to the general setting where replicates are available. We establish consistency and obtain the rates of convergence in the operator and Frobenius norms. We show that having replicates allows one to estimate more complicated graphical structures and achieve faster rates of convergence. We provide simulation evidence showing that we can recover the graphical structures and estimate the precision matrices, as predicted by theory.
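
To make the model in the abstract concrete, the following is a minimal sketch, not the paper's estimator, of drawing a single mean-zero matrix variate normal instance with separable row and column covariances. The function names, the AR(1) covariances, the dimensions, and the column-stacking convention for vec are all assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np


def ar1_cov(n, rho):
    """AR(1) covariance rho^|i-j| (illustrative choice); its inverse is tridiagonal."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def sample_matrix_normal(row_cov, col_cov, rng):
    """Draw one mean-zero instance X = L_r Z L_c^T with Z i.i.d. standard normal."""
    l_r = np.linalg.cholesky(row_cov)
    l_c = np.linalg.cholesky(col_cov)
    z = rng.standard_normal((row_cov.shape[0], col_cov.shape[0]))
    return l_r @ z @ l_c.T


rng = np.random.default_rng(0)
row_cov = ar1_cov(3, 0.5)  # hypothetical 3 x 3 row covariance
col_cov = ar1_cov(4, 0.3)  # hypothetical 4 x 4 column covariance
X = sample_matrix_normal(row_cov, col_cov, rng)  # a single 3 x 4 instance

# With column-stacking vec, vec(L_r Z L_c^T) = (L_c kron L_r) vec(Z), so
# Cov(vec(X)) = (L_c kron L_r)(L_c kron L_r)^T, which should equal col_cov kron row_cov.
transform = np.kron(np.linalg.cholesky(col_cov), np.linalg.cholesky(row_cov))
implied_cov = transform @ transform.T
kron_cov = np.kron(col_cov, row_cov)
print(np.allclose(implied_cov, kron_cov))

# Sparsity lives in the precision (inverse covariance) matrix: zero entries
# correspond to missing edges in the undirected graph.
row_precision = np.linalg.inv(row_cov)
print(np.round(row_precision, 3))
```

In this toy setup the row precision matrix is tridiagonal up to floating-point error, so the corresponding row graph is a chain. Estimating such graphs from X alone is the problem the paper addresses; the sketch only generates data from the model.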

Article information

Ann. Statist. Volume 42, Number 2 (2014), 532-562.

First available in Project Euclid: 20 May 2014

Digital Object Identifier: doi:10.1214/13-AOS1187

Primary: 62F12: Asymptotic properties of estimators
Secondary: 62F30: Inference under constraints

Keywords: Graphical model selection; covariance estimation; inverse covariance estimation; graphical Lasso; matrix variate normal distribution


Zhou, Shuheng. Gemini: Graph estimation with matrix variate normal instances. Ann. Statist. 42 (2014), no. 2, 532–562. doi:10.1214/13-AOS1187.

References


  • [1] Allen, G. I. and Tibshirani, R. (2010). Transposable regularized covariance models with an application to missing data imputation. Ann. Appl. Stat. 4 764–790.
  • [2] Banerjee, O., El Ghaoui, L. and d’Aspremont, A. (2008). Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. J. Mach. Learn. Res. 9 485–516.
  • [3] Cai, T., Liu, W. and Luo, X. (2011). A constrained $\ell_1$ minimization approach to sparse precision matrix estimation. J. Amer. Statist. Assoc. 106 594–607.
  • [4] Dawid, A. P. (1981). Some matrix-variate distribution theory: Notational considerations and a Bayesian application. Biometrika 68 265–274.
  • [5] Dutilleul, P. (1999). The MLE algorithm for the matrix normal distribution. J. Stat. Comput. Simul. 64 105–123.
  • [6] Efron, B. (2009). Are a set of microarrays independent of each other? Ann. Appl. Stat. 3 922–942.
  • [7] Fan, J., Feng, Y. and Wu, Y. (2009). Network exploration via the adaptive lasso and SCAD penalties. Ann. Appl. Stat. 3 521–541.
  • [8] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348–1360.
  • [9] Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9 432–441.
  • [10] Gupta, A. K. and Varga, T. (1992). Characterization of matrix variate normal distributions. J. Multivariate Anal. 41 80–88.
  • [11] Kalaitzis, A., Lafferty, J., Lawrence, N. and Zhou, S. (2013). The bigraphical lasso. In Proceedings of the 30th International Conference on Machine Learning (ICML-13). JMLR W&CP 28 1229–1237. Atlanta, GA.
  • [12] Lam, C. and Fan, J. (2009). Sparsistency and rates of convergence in large covariance matrix estimation. Ann. Statist. 37 4254–4278.
  • [13] Leng, C. and Tang, C. Y. (2012). Sparse matrix graphical models. J. Amer. Statist. Assoc. 107 1187–1200.
  • [14] Lu, N. and Zimmerman, D. L. (2005). The likelihood ratio test for a separable covariance matrix. Statist. Probab. Lett. 73 449–457.
  • [15] Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. Ann. Statist. 34 1436–1462.
  • [16] Peng, J., Zhou, N. and Zhu, J. (2009). Partial correlation estimation by joint sparse regression models. J. Amer. Statist. Assoc. 104 735–746.
  • [17] Ravikumar, P., Wainwright, M. J., Raskutti, G. and Yu, B. (2011). High-dimensional covariance estimation by minimizing $\ell_1$-penalized log-determinant divergence. Electron. J. Stat. 5 935–980.
  • [18] Rothman, A. J., Bickel, P. J., Levina, E. and Zhu, J. (2008). Sparse permutation invariant covariance estimation. Electron. J. Stat. 2 494–515.
  • [19] Tsiligkaridis, T., Hero, A. O. III and Zhou, S. (2013). On convergence of Kronecker graphical lasso algorithms. IEEE Trans. Signal Process. 61 1743–1755.
  • [20] UCI (1999). UCI machine learning repository. Available at
  • [21] Vershynin, R. (2012). Introduction to the non-asymptotic analysis of random matrices. In Compressed Sensing 210–268. Cambridge Univ. Press, Cambridge.
  • [22] Weichsel, P. M. (1962). The Kronecker product of graphs. Proc. Amer. Math. Soc. 13 47–52.
  • [23] Werner, K., Jansson, M. and Stoica, P. (2008). On estimation of covariance matrices with Kronecker product structure. IEEE Trans. Signal Process. 56 478–491.
  • [24] Yin, J. and Li, H. (2012). Model selection and estimation in the matrix normal graphical model. J. Multivariate Anal. 107 119–140.
  • [25] Yuan, M. (2010). High dimensional inverse covariance matrix estimation via linear programming. J. Mach. Learn. Res. 11 2261–2286.
  • [26] Yuan, M. and Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model. Biometrika 94 19–35.
  • [27] Zhang, X. L., Begleiter, H., Porjesz, B., Wang, W. and Litke, A. (1995). Event related potentials during object recognition tasks. Brain Res. Bull. 38 531–538.
  • [28] Zhang, Y. and Schneider, J. (2010). Learning multiple tasks with a sparse matrix-normal penalty. In Advances in Neural Information Processing Systems 23 (NIPS 2010) (J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. S. Zemel and A. Culotta, eds.).
  • [29] Zhou, S. (2013). Supplement to “Gemini: Graph estimation with matrix variate normal instances.” DOI:10.1214/13-AOS1187SUPP.
  • [30] Zhou, S., Lafferty, J. and Wasserman, L. (2010). Time varying undirected graphs. Machine Learning 80 298–319.
  • [31] Zhou, S., Rütimann, P., Xu, M. and Bühlmann, P. (2011). High-dimensional covariance estimation based on Gaussian graphical models. J. Mach. Learn. Res. 12 2975–3026.
  • [32] Zou, H. and Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models. Ann. Statist. 36 1509–1533.

Supplemental materials

  • Supplementary material for “Gemini: Graph estimation with matrix variate normal instances.” The technical proofs are given in the supplementary material [29].