Stratified exponential families: Graphical models and model selection



The Annals of Statistics

Dan Geiger, David Heckerman, Henry King, and Christopher Meek

Source: Ann. Statist. Volume 29, Number 2 (2001), 505-529.

Abstract

We describe a hierarchy of exponential families which is useful for distinguishing types of graphical models. Undirected graphical models with no hidden variables are linear exponential families (LEFs). Directed acyclic graphical (DAG) models and chain graphs with no hidden variables, including DAG models with several families of local distributions, are curved exponential families (CEFs). Graphical models with hidden variables are what we term stratified exponential families (SEFs). A SEF is a finite union of CEFs of various dimensions satisfying some regularity conditions. We also show that this hierarchy of exponential families is noncollapsing with respect to graphical models by providing a graphical model which is a CEF but not a LEF and a graphical model that is a SEF but not a CEF. Finally, we show how to compute the dimension of a stratified exponential family. These results are discussed in the context of model selection of graphical models.
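As an illustration of what computing the dimension of a hidden-variable model can look like, the sketch below uses one common approach: take the rank of the Jacobian of the map from network parameters to the distribution over the observed variables, evaluated at a generic parameter point. This is an assumption about method chosen for illustration, not a reproduction of the paper's procedure. The model (a binary hidden class with n binary observed features), the function names `observed_joint`, `exact_rank`, and `model_dimension`, and the particular rational test point are all hypothetical choices.

```python
from fractions import Fraction
from itertools import product


def observed_joint(theta, n):
    """Distribution over n binary features with a hidden binary class H.

    theta = [P(H=1), P(X1=1|H=0), P(X1=1|H=1), ..., P(Xn=1|H=0), P(Xn=1|H=1)]
    """
    t = theta[0]
    probs = []
    for x in product((0, 1), repeat=n):
        total = Fraction(0)
        for h in (0, 1):
            term = t if h == 1 else 1 - t
            for i, xi in enumerate(x):
                a = theta[1 + 2 * i + h]
                term = term * (a if xi == 1 else 1 - a)
            total += term  # sum the hidden class out
        probs.append(total)
    return probs


def exact_rank(rows):
    """Rank of a matrix of Fractions by Gaussian elimination (no rounding)."""
    m = [list(r) for r in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            factor = m[i][col] / m[rank][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


def model_dimension(n):
    """Jacobian rank at one rational point (assumed generic for this model)."""
    theta = [Fraction(k + 1, 2 * k + 5) for k in range(2 * n + 1)]
    partials = []
    for k in range(len(theta)):
        # The map is affine in each single parameter, so the partial
        # derivative is exactly f(theta_k = 1) - f(theta_k = 0).
        hi = list(theta)
        hi[k] = Fraction(1)
        lo = list(theta)
        lo[k] = Fraction(0)
        partials.append(
            [p - q for p, q in zip(observed_joint(hi, n), observed_joint(lo, n))]
        )
    return exact_rank(partials)


for n in (2, 3, 4):
    print(n, 2 * n + 1, model_dimension(n))
```

For n = 2 the model has 5 network parameters but the rank is 3, so a parameter count overstates the dimension; for n = 3 and n = 4 the rank matches the parameter counts 7 and 9. A dimension obtained this way is the kind of quantity that enters a dimension-based penalty in model selection.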

Primary Subjects: 60E05, 62H05
Keywords: Bayesian networks; graphical models; hidden variables; curved exponential families; stratified exponential families; semialgebraic sets; model selection

Full-text: Open access

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1009210550
Digital Object Identifier: doi:10.1214/aos/1009210550
Mathematical Reviews number (MathSciNet): MR1863967
Zentralblatt MATH identifier: 1012.62012

Department of Computer Science, Technion-Israel Institute of Technology, Haifa 32000, Israel. E-mail: dang@cs.technion.ac.il

2009 © Institute of Mathematical Statistics