Electronic Journal of Statistics

Posterior convergence rates for estimating large precision matrices using graphical models

Sayantan Banerjee and Subhashis Ghosal

Abstract

We consider Bayesian estimation of a $p\times p$ precision matrix when $p$ can be much larger than the available sample size $n$. It is well known that consistent estimation in such ultra-high dimensional situations requires regularization such as banding, tapering or thresholding. We consider a banding structure in the model and induce a prior distribution on a banded precision matrix through a Gaussian graphical model, where an edge is present only when two vertices are within a given distance. For a proper choice of the order of the graph, we obtain the convergence rate of the posterior distribution and of Bayes estimators based on the graphical model in the $L_{\infty}$-operator norm, uniformly over a class of precision matrices, even if the true precision matrix does not have a banded structure. In the course of the proof, we also compute the convergence rate of the maximum likelihood estimator (MLE) under the same set of conditions, which is of independent interest. The graphical-model-based MLE and Bayes estimators are automatically positive definite, a desirable property not possessed by some other estimators in the literature. We also conduct a simulation study to compare the finite-sample performance of the Bayes estimators and the MLE based on the graphical model with that of estimators obtained from a Cholesky decomposition of the precision matrix. Finally, we discuss a practical method of choosing the order of the graphical model using the marginal likelihood function.
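
To make the banded graphical model concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a banded graph of order $k$ joins vertices $i$ and $j$ whenever $0<|i-j|\le k$. Such a graph is decomposable, with cliques made of $k+1$ consecutive indices and separators made of $k$ consecutive indices, so the standard clique/separator formula for the MLE in decomposable Gaussian graphical models applies. The function name `banded_graph_mle`, the simulated tridiagonal truth, and the values of `p`, `n` and `k` are illustrative assumptions only.

```python
import numpy as np


def banded_graph_mle(S, k):
    """Graphical-model MLE of a precision matrix under a banded graph.

    S is a p x p sample covariance matrix; vertices i and j are joined
    when 0 < |i - j| <= k (assumed definition of the order-k banded graph).
    Each (k+1) x (k+1) diagonal block of S must be invertible.
    """
    p = S.shape[0]
    k = min(k, p - 1)  # k >= p - 1 corresponds to the complete graph
    omega = np.zeros((p, p))
    # Add the zero-padded inverse of S restricted to each clique {j, ..., j+k}.
    for j in range(p - k):
        idx = np.arange(j, j + k + 1)
        omega[np.ix_(idx, idx)] += np.linalg.inv(S[np.ix_(idx, idx)])
    # Subtract the zero-padded inverse on each separator {j, ..., j+k-1}.
    if k > 0:
        for j in range(1, p - k):
            idx = np.arange(j, j + k)
            omega[np.ix_(idx, idx)] -= np.linalg.inv(S[np.ix_(idx, idx)])
    return omega


# Illustrative data: a tridiagonal (hence banded) true precision matrix.
rng = np.random.default_rng(0)
p, n, k = 8, 200, 2
omega_true = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(omega_true), size=n)
S = X.T @ X / n  # sample covariance with the mean assumed known to be zero

omega_hat = banded_graph_mle(S, k)
# L-infinity operator norm of the error: maximum absolute row sum.
err = np.linalg.norm(omega_hat - omega_true, np.inf)
print("L-infinity operator norm error:", err)
```

In this sketch the estimate is zero outside the band by construction, and it is positive definite whenever every clique block of `S` is, which illustrates the positive-definiteness property mentioned in the abstract. The error is measured with `np.linalg.norm(..., np.inf)`, the maximum absolute row sum, which is the $L_{\infty}$-operator norm.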

Article information

Source
Electron. J. Statist., Volume 8, Number 2 (2014), 2111-2137.

Dates
First available in Project Euclid: 29 October 2014

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1414588188

Digital Object Identifier
doi:10.1214/14-EJS945

Mathematical Reviews number (MathSciNet)
MR3273620

Zentralblatt MATH identifier
1305.62376

Subjects
Primary: 62H12: Estimation
Secondary: 62F12: Asymptotic properties of estimators; 62F15: Bayesian inference

Keywords
Precision matrix; G-Wishart; convergence rate

Citation

Banerjee, Sayantan; Ghosal, Subhashis. Posterior convergence rates for estimating large precision matrices using graphical models. Electron. J. Statist. 8 (2014), no. 2, 2111–2137. doi:10.1214/14-EJS945. https://projecteuclid.org/euclid.ejs/1414588188

