Statistical Science

Experiments in Stochastic Computation for High-Dimensional Graphical Models

Beatrix Jones, Carlos Carvalho, Adrian Dobra, Chris Hans, Chris Carter, and Mike West

Full-text: Open access

Abstract

We discuss the implementation, development and performance of methods of stochastic computation in Gaussian graphical models. We view these methods from the perspective of high-dimensional model search, with a particular interest in the scalability with dimension of Markov chain Monte Carlo (MCMC) and other stochastic search methods. After reviewing the structure and context of undirected Gaussian graphical models and model uncertainty (covariance selection), we discuss prior specifications, including new priors over models, and then explore a number of examples using various methods of stochastic computation. Traditional MCMC methods are the point of departure for this experimentation; we then develop alternative stochastic search ideas and contrast this new approach with MCMC. Our examples range from low (12–20) to moderate (150) dimension, and combine simple synthetic examples with data analysis from gene expression studies. We conclude with comments about the need and potential for new computational methods in far higher dimensions, including constructive approaches to Gaussian graphical modeling and computation.

Article information

Source
Statist. Sci., Volume 20, Number 4 (2005), 388-400.

Dates
First available in Project Euclid: 12 January 2006

Permanent link to this document
https://projecteuclid.org/euclid.ss/1137076659

Digital Object Identifier
doi:10.1214/088342305000000304

Mathematical Reviews number (MathSciNet)
MR2210226

Zentralblatt MATH identifier
1130.62408

Keywords
Decomposable models nondecomposable models Markov chain Monte Carlo shotgun stochastic search parallel implementation

Citation

Jones, Beatrix; Carvalho, Carlos; Dobra, Adrian; Hans, Chris; Carter, Chris; West, Mike. Experiments in Stochastic Computation for High-Dimensional Graphical Models. Statist. Sci. 20 (2005), no. 4, 388--400. doi:10.1214/088342305000000304. https://projecteuclid.org/euclid.ss/1137076659


Export citation

References

  • Andersson, S. A., Madigan, D., Perlman, M. D. and Richardson, T. (1999). Graphical Markov models in multivariate analysis. In Multivariate Analysis, Design of Experiments and Survey Sampling (S. Ghosh, ed.) 187--229. Dekker, New York.
  • Armstrong, H., Carter, C. K., Wong, K. F. and Kohn, R. (2005). Bayesian covariance matrix estimation using a mixture of decomposable graphical models. Unpublished manuscript.
  • Atay-Kayis, A. and Massam, H. (2006). The marginal likelihood for decomposable and non-decomposable graphical Gaussian models. Biometrika. To appear.
  • Cowell, R. G., Dawid, A. P., Lauritzen, S. L. and Spiegelhalter, D. J. (1999). Probabilistic Networks and Expert Systems. Springer, New York.
  • Dawid, A. P. and Lauritzen, S. L. (1993). Hyper-Markov laws in the statistical analysis of decomposable graphical models. Ann. Statist. 21 1272--1317.
  • Dellaportas, P. and Forster, J. J. (1999). Markov chain Monte Carlo model determination for hierarchical and graphical log-linear models. Biometrika 86 615--633.
  • Dellaportas, P., Giudici, P. and Roberts, G. (2003). Bayesian inference for nondecomposable graphical Gaussian models. Sankhyā 65 43--55.
  • Dempster, A. P. (1972). Covariance selection. Biometrics 28 157--175.
  • Dickey, J. M. (1971). The weighted likelihood ratio, linear hypotheses on normal location parameters. Ann. Math. Statist. 42 204--223.
  • Dobra, A. and Fienberg, S. E. (2000). Bounds for cell entries in contingency tables given marginal totals and decomposable graphs. Proc. Natl. Acad. Sci. U.S.A. 97 11,885--11,892.
  • Dobra, A., Hans, C., Jones, B. Nevins, J., Yao, G. and West, M. (2004). Sparse graphical models for exploring gene expression data. J. Multivariate Anal. 90 196--212.
  • Dobra, A. and West, M. (2004). Bayesian covariance selection. Available as Discussion Paper 04-23 at www.isds.duke.edu.
  • Flores, M. J., Gámez, J. A. and Olesen, K. G. (2003). Incremental compilation of Bayesian networks. In Proc. 19th Annual Conference on Uncertainty in Artificial Intelligence 233--240. Morgan Kaufmann, San Francisco.
  • Friedman, N., Linial, M., Nachman, I. and Pe'er, D. (2000). Using Bayesian networks to analyze expression data. J. Computational Biology 7 601--620.
  • Giudici, P. (1996). Learning in graphical Gaussian models. In Bayesian Statistics 5 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.) 621--628. Oxford Univ. Press, London.
  • Giudici, P. and Castelo, R. (2003). Improving Markov chain Monte Carlo model search for data mining. Machine Learning 50 127--158.
  • Giudici, P. and Green, P. J. (1999). Decomposable graphical Gaussian model determination. Biometrika 86 785--801.
  • Grone, R., Johnson, C. R., de Sá, E. M. and Wolkowicz, H. (1984). Positive definite completions of partial Hermitian matrices. Linear Algebra Appl. 58 109--124.
  • Hammersley, J. M. and Clifford, P. E. (1971). Markov fields on finite graphs and lattices. Unpublished manuscript.
  • Heckerman, D., Chickering, D. M., Meek, C., Rounthwaite, R. and Kadie, C. M. (2000). Dependency networks for inference, collaborative filtering, and data visualization. J. Machine Learning Research 1 49--75.
  • Lauritzen, S. L. (1996). Graphical Models. Clarendon Press, Oxford.
  • Lauritzen, S. L. and Sheehan, N. A. (2003). Graphical models for genetic analyses. Statist. Sci. 18 489--514.
  • Madigan, D. and York, J. (1995). Bayesian graphical models for discrete data. Internat. Statist. Rev. 63 215--232.
  • Roverato, A. (2002). Hyper-inverse Wishart distribution for non-decomposable graphs and its application to Bayesian inference for Gaussian graphical models. Scand. J. Statist. 29 391--411.
  • Wermuth, N. (1976). Model search among multiplicative models. Biometrics 32 253--263.
  • West, M., Blanchette, C., Dressman, H., Huang, E., Ishida, S., Spang, R., Zuzan, H., Olson, Jr., J. A., Marks, J.R. and Nevins, J. R. (2001). Predicting the clinical status of human breast cancer by using gene expression profiles. Proc. Natl. Acad. Sci. U.S.A. 98 11,462--11,467.
  • Whittaker, J. (1990). Graphical Models in Applied Multivariate Statistics. Wiley, Chichester.
  • Wong, F., Carter, C. and Kohn, R. (2003). Efficient estimation of covariance selection models. Biometrika 90 809--830.
  • Yu, J., Smith, V., Wang, P., Hartemink, A. and Jarvis, E. (2004). Advances to Bayesian network inference for generating causal networks from observational biological data. Bioinformatics 20 3594--3603.
  • Zhou, X., Kao, M. J. and Wong, W. H. (2002). Transitive functional annotation by shortest-path analysis of gene expression data. Proc. Natl. Acad. Sci. U.S.A. 99 12,783--12,788.