The Annals of Statistics

On the toric algebra of graphical models

Dan Geiger, Christopher Meek, and Bernd Sturmfels

Abstract

We formulate necessary and sufficient conditions for an arbitrary discrete probability distribution to factor according to an undirected graphical model, or a log-linear model, or other more general exponential models. For decomposable graphical models these conditions are equivalent to a set of conditional independence statements similar to the Hammersley–Clifford theorem; however, we show that for nondecomposable graphical models they are not. We also show that nondecomposable models can have nonrational maximum likelihood estimates. These results are used to give several novel characterizations of decomposable graphical models.
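
As an illustrative sketch that is not taken from the paper, consider the simplest decomposable model: the three-variable chain with edges X1-X2 and X2-X3 on binary variables. A distribution that factors as a product of one potential per edge satisfies the conditional independence of X1 and X3 given X2, which can be written as the cross-product equalities p(a,b,c) p(a',b,c') = p(a,b,c') p(a',b,c). The function names, potentials, and tolerance below are hypothetical choices made only for this example.

```python
from itertools import product


def chain_distribution(psi12, psi23):
    """Build p(x1, x2, x3) proportional to psi12[x1][x2] * psi23[x2][x3]."""
    weights = {
        (a, b, c): psi12[a][b] * psi23[b][c]
        for a, b, c in product(range(2), repeat=3)
    }
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}


def satisfies_chain_binomials(p, tol=1e-12):
    """Check p[a,b,c] * p[a2,b,c2] == p[a,b,c2] * p[a2,b,c] for all values."""
    for a, a2, b, c, c2 in product(range(2), repeat=5):
        lhs = p[(a, b, c)] * p[(a2, b, c2)]
        rhs = p[(a, b, c2)] * p[(a2, b, c)]
        if abs(lhs - rhs) > tol:
            return False
    return True


# A distribution built from edge potentials (one potential entry is zero,
# so the distribution is not strictly positive).
factored = chain_distribution([[1.0, 2.0], [3.0, 0.5]], [[2.0, 1.0], [0.0, 4.0]])

# A distribution supported on states with even coordinate sum; here X1 and X3
# are dependent given X2, so the cross-product equalities fail.
parity = {
    s: (0.25 if sum(s) % 2 == 0 else 0.0)
    for s in product(range(2), repeat=3)
}

print(satisfies_chain_binomials(factored))
print(satisfies_chain_binomials(parity))
```

The check covers only this small decomposable case. For nondecomposable models such as the four-cycle, the abstract states that conditional independence statements of this kind no longer characterize factorization.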

Article information

Source
Ann. Statist. Volume 34, Number 3 (2006), 1463-1492.

Dates
First available in Project Euclid: 10 July 2006

Permanent link to this document
http://projecteuclid.org/euclid.aos/1152540755

Digital Object Identifier
doi:10.1214/009053606000000263

Mathematical Reviews number (MathSciNet)
MR2278364

Zentralblatt MATH identifier
1104.60007

Subjects
Primary: 60E05: Distributions: general theory; 62H99: None of the above, but in this section
Secondary: 13P10: Gröbner bases; other bases for ideals and modules (e.g., Janet and border bases); 14M25: Toric varieties, Newton polyhedra [See also 52B20]; 68W30: Symbolic computation and algebraic computation [See also 11Yxx, 12Y05, 13Pxx, 14Qxx, 16Z05, 17-08, 33F10]

Keywords
Conditional independence; factorization; graphical models; decomposable models; factorization of discrete distributions; Hammersley–Clifford theorem; Gröbner bases

Citation

Geiger, Dan; Meek, Christopher; Sturmfels, Bernd. On the toric algebra of graphical models. Ann. Statist. 34 (2006), no. 3, 1463–1492. doi:10.1214/009053606000000263. http://projecteuclid.org/euclid.aos/1152540755.

