The Annals of Statistics

Wishart distributions for decomposable covariance graph models

Kshitij Khare and Bala Rajaratnam

Full-text: Open access

Abstract

Gaussian covariance graph models encode marginal independence among the components of a multivariate random vector by means of a graph G. These models are distinctly different from the traditional concentration graph models (often also referred to as Gaussian graphical models or covariance selection models) since the zeros in the parameter are now reflected in the covariance matrix Σ, as compared to the concentration matrix Ω = Σ^{-1}. The parameter space of interest for covariance graph models is the cone P_G of positive definite matrices with fixed zeros corresponding to the missing edges of G. As in Letac and Massam [Ann. Statist. 35 (2007) 1278–1323], we consider the case where G is decomposable. In this paper, we construct on the cone P_G a family of Wishart distributions which serve a similar purpose in the covariance graph setting as those constructed by Letac and Massam [Ann. Statist. 35 (2007) 1278–1323] and Dawid and Lauritzen [Ann. Statist. 21 (1993) 1272–1317] do in the concentration graph setting. We proceed to undertake a rigorous study of these “covariance” Wishart distributions and derive several deep and useful properties of this class. First, they form a rich conjugate family of priors with multiple shape parameters for covariance graph models. Second, we show how to sample from these distributions by using a block Gibbs sampling algorithm and prove convergence of this block Gibbs sampler. Development of this class of distributions enables Bayesian inference, which, in turn, allows for the estimation of Σ, even in the case when the sample size is less than the dimension of the data (i.e., when “n < p”), otherwise not generally possible in the maximum likelihood framework. Third, we prove that when G is a homogeneous graph, our covariance priors correspond to standard conjugate priors for appropriate directed acyclic graph (DAG) models. This correspondence enables closed form expressions for normalizing constants and expected values, and also establishes hyper-Markov properties for our class of priors. We also note that when G is homogeneous, the family IW_{Q_G} of Letac and Massam [Ann. Statist. 35 (2007) 1278–1323] is a special case of our covariance Wishart distributions. Fourth, and finally, we illustrate the use of our family of conjugate priors on real and simulated data.
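
As a small illustration of the parameter space described in the abstract, the following sketch checks membership in the cone P_G for a three-vertex path graph and shows that a zero in Σ need not be a zero in Ω = Σ^{-1}. It is not taken from the paper: the graph, the matrix entries, and the helper names (is_positive_definite, in_cone_P_G, inverse) are assumptions chosen only for illustration, and exact rational arithmetic is used so the zero checks are not affected by rounding.

```python
from fractions import Fraction


def is_positive_definite(a):
    """Symmetric matrix test: all Gaussian-elimination pivots must be > 0."""
    n = len(a)
    m = [[Fraction(x) for x in row] for row in a]
    for k in range(n):
        if m[k][k] <= 0:
            return False
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n):
                m[i][j] -= f * m[k][j]
    return True


def in_cone_P_G(sigma, edges):
    """True if sigma is symmetric, zero on the missing edges of G, and positive definite."""
    n = len(sigma)
    edge_set = {frozenset(e) for e in edges}
    for i in range(n):
        for j in range(i + 1, n):
            if sigma[i][j] != sigma[j][i]:
                return False
            if frozenset((i, j)) not in edge_set and sigma[i][j] != 0:
                return False
    return is_positive_definite(sigma)


def inverse(a):
    """Exact Gauss-Jordan inverse; assumes a is positive definite (nonzero pivots)."""
    n = len(a)
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for k in range(n):
        piv = m[k][k]
        m[k] = [x / piv for x in m[k]]
        for i in range(n):
            if i != k:
                f = m[i][k]
                m[i] = [x - f * y for x, y in zip(m[i], m[k])]
    return [row[n:] for row in m]


# Hypothetical example: path graph 0 - 1 - 2 (decomposable); edge (0, 2) is missing.
edges = [(0, 1), (1, 2)]
sigma = [[2, 1, 0],
         [1, 2, 1],
         [0, 1, 2]]

omega = inverse(sigma)                    # concentration matrix
sigma_in_cone = in_cone_P_G(sigma, edges)
omega_in_cone = in_cone_P_G(omega, edges)

print("Sigma in P_G:", sigma_in_cone)
print("Omega[0][2]:", omega[0][2])
print("Omega in P_G:", omega_in_cone)
```

In this example the zero pattern lives in Σ (components 0 and 2 are marginally independent), while the corresponding entry of Ω is nonzero, which is the distinction the abstract draws between covariance graph models and concentration graph models.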

Article information

Source
Ann. Statist., Volume 39, Number 1 (2011), 514–555.

Dates
First available in Project Euclid: 15 February 2011

Permanent link to this document
https://projecteuclid.org/euclid.aos/1297779855

Digital Object Identifier
doi:10.1214/10-AOS841

Mathematical Reviews number (MathSciNet)
MR2797855

Zentralblatt MATH identifier
1274.62369

Subjects
Primary:
  • 62H12: Estimation
  • 62C10: Bayesian problems; characterization of Bayes procedures
  • 62F15: Bayesian inference

Keywords
Graphical model; Gaussian covariance graph model; Wishart distribution; decomposable graph; Gibbs sampler

Citation

Khare, Kshitij; Rajaratnam, Bala. Wishart distributions for decomposable covariance graph models. Ann. Statist. 39 (2011), no. 1, 514–555. doi:10.1214/10-AOS841. https://projecteuclid.org/euclid.aos/1297779855



References

  • [1] Andersson, S. A. and Wojnar, G. G. (2004). Wishart distributions on homogeneous cones. J. Theoret. Probab. 17 781–818.
  • [2] Athreya, K. B., Doss, H. and Sethuraman, J. (1996). On the convergence of the Markov chain simulation method. Ann. Statist. 24 69–100.
  • [3] Butte, A. J., Tamayo, P., Slonim, D., Golub, T. R. and Kohane, I. S. (2000). Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks. Proc. Natl. Acad. Sci. 97 12182–12186.
  • [4] Chaudhuri, S., Drton, M. and Richardson, T. S. (2007). Estimation of a covariance matrix with zeroes. Biometrika 94 199–216.
  • [5] Consonni, G. and Veronese, P. (2003). Enriched conjugate and reference priors for the Wishart family on the symmetric cones. Ann. Statist. 31 1491–1516.
  • [6] Cox, D. R. and Wermuth, N. (1993). Linear dependencies represented by chain graphs (with discussion). Statist. Sci. 8 204–218, 247–277.
  • [7] Cox, D. R. and Wermuth, N. (1996). Multivariate Dependencies: Models, Analysis and Interpretation. Chapman & Hall, London.
  • [8] Daniels, M. J. and Pourahmadi, M. (2002). Bayesian analysis of covariance matrices and dynamic models for longitudinal data. Biometrika 89 553–566.
  • [9] Dawid, A. P. and Lauritzen, S. L. (1993). Hyper-Markov laws in the statistical analysis of decomposable graphical models. Ann. Statist. 21 1272–1317.
  • [10] Diaconis, P. and Ylvisaker, D. (1979). Conjugate priors for exponential families. Ann. Statist. 7 269–281.
  • [11] Drton, M. and Richardson, T. S. (2008). Graphical methods for efficient likelihood inference in Gaussian covariance models. J. Mach. Learn. Res. 9 893–914.
  • [12] Edwards, D. M. (2000). Introduction to Graphical Modelling, 2nd ed. Springer, New York.
  • [13] Gasch, A. P., Spellman, P. T., Kao, C. M., Carmel-Harel, O., Eisen, M. B., Storz, G., Botstein, D. and Brown, P. O. (2000). Genomic expression programs in the response of yeast cells to environmental changes. Mol. Biol. Cell 11 4241–4257.
  • [14] Grone, R., Johnson, C. R., Sá, E. M. and Wolkowicz, H. (1984). Positive definite completions of partial Hermitian matrices. Linear Algebra Appl. 58 109–124.
  • [15] Grzebyk, M., Wild, P. and Chouaniere, D. (2004). On identification of multifactor models with correlated residuals. Biometrika 91 141–151.
  • [16] Huang, J., Liu, N., Pourahmadi, M. and Liu, L. (2006). Covariance matrix selection and estimation via penalised normal likelihood. Biometrika 93 85–98.
  • [17] Kauermann, G. (1996). On a dualization of graphical Gaussian models. Scand. J. Statist. 23 105–116.
  • [18] Khare, K. and Rajaratnam, B. (2008). Wishart distributions for covariance graph models. Technical report 2008-11, Dept. Statistics, Stanford Univ.
  • [19] Lauritzen, S. L. (1996). Graphical Models. Oxford Univ. Press, New York.
  • [20] Letac, G. and Massam, H. (2007). Wishart distributions for decomposable graphs. Ann. Statist. 35 1278–1323.
  • [21] Mao, Y., Kschischang, F. R. and Frey, B. J. (2004). Convolutional factor graphs as probabilistic models. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (M. Chickering and J. Halpern, eds.) 374–381. AUAI Press, Arlington, MA.
  • [22] Massam, H. (2007). The IWQG as a prior for the variance parameter of covariance graph models. Working paper.
  • [23] Muirhead, R. J. (1982). Aspects of Multivariate Statistical Theory. Wiley, New York.
  • [24] Paulsen, V. I., Power, S. C. and Smith, R. R. (1989). Schur products and matrix completions. J. Funct. Anal. 85 151–178.
  • [25] Pourahmadi, M. (2007). Cholesky decompositions and estimation of a covariance matrix: Orthogonality of variance–correlation parameters. Biometrika 94 1006–1013.
  • [26] Rajaratnam, B., Massam, H. and Carvalho, C. (2008). Flexible covariance estimation in graphical models. Ann. Statist. 36 2818–2849.
  • [27] Roverato, A. (2000). Cholesky decomposition of a hyper inverse Wishart matrix. Biometrika 87 99–112.
  • [28] Roverato, A. (2002). Hyper inverse Wishart distribution for non decomposable graphs and its application to Bayesian inference for Gaussian graphical models. Scand. J. Statist. 29 391–411.
  • [29] Silva, R. and Ghahramani, Z. (2009). The hidden life of latent variables: Bayesian learning with mixed graph models. J. Mach. Learn. Res. 10 1187–1238.
  • [30] Wermuth, N. (1980). Linear recursive equations, covariance selection and path analysis. J. Amer. Statist. Assoc. 75 963–972.
  • [31] Wermuth, N., Cox, D. R. and Marchetti, G. M. (2006). Covariance chains. Bernoulli 12 841–862.