Statistical Science

Graphical Models

Michael I. Jordan


Statistical applications in fields such as bioinformatics, information retrieval, speech processing, image processing and communications often involve large-scale models in which thousands or millions of random variables are linked in complex ways. Graphical models provide a general methodology for approaching these problems, and indeed many of the models developed by researchers in these applied fields are instances of the general graphical model formalism. We review some of the basic ideas underlying graphical models, including the algorithmic ideas that allow graphical models to be deployed in large-scale data analysis problems. We also present examples of graphical models in bioinformatics, error-control coding and language processing.

Article information

Statist. Sci., Volume 19, Number 1 (2004), 140-155.

First available in Project Euclid: 14 July 2004

Keywords: Probabilistic graphical models, junction tree algorithm, sum-product algorithm, Markov chain Monte Carlo, variational inference, bioinformatics, error-control coding


Jordan, Michael I. Graphical Models. Statist. Sci. 19 (2004), no. 1, 140--155. doi:10.1214/088342304000000026.


