### Parameter priors for directed acyclic graphical models and the characterization of several probability distributions

Dan Geiger and David Heckerman
Source: Ann. Statist. Volume 30, Number 5 (2002), 1412-1440.

#### Abstract

We develop simple methods for constructing parameter priors for model choice among directed acyclic graphical (DAG) models. In particular, we introduce several assumptions that permit the construction of parameter priors for a large number of DAG models from a small set of assessments. We then present a method for directly computing the marginal likelihood of every DAG model given a random sample with no missing observations. We apply this methodology to Gaussian DAG models which consist of a recursive set of linear regression models. We show that the only parameter prior for complete Gaussian DAG models that satisfies our assumptions is the normal-Wishart distribution. Our analysis is based on the following new characterization of the Wishart distribution: let $W$ be an $n \times n$, $n \ge 3$, positive definite symmetric matrix of random variables and $f(W)$ be a pdf of $W$. Then, $f(W)$ is a Wishart distribution if and only if $W_{11} - W_{12} W_{22}^{-1} W'_{12}$ is independent of $\{W_{12},W_{22}\}$ for every block partitioning $W_{11},W_{12}, W'_{12}, W_{22}$ of $W$. Similar characterizations of the normal and normal-Wishart distributions are provided as well.
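The independence property in the Wishart characterization can be checked numerically, at least in its "only if" direction. The following is a minimal sketch, not the authors' method: it draws Wishart matrices as sums of Gaussian outer products, forms $W_{11} - W_{12} W_{22}^{-1} W'_{12}$ for one block partitioning, and compares its sample correlations with the entries of $\{W_{12}, W_{22}\}$. The scale matrix, degrees of freedom, sample size, and the helper names `sample_wishart` and `schur_complement` are all illustrative assumptions. Near-zero correlation is only a necessary consequence of independence, so this is a sanity check rather than a proof.

```python
import numpy as np


def sample_wishart(rng, dof, scale, size):
    """Draw `size` Wishart(dof, scale) matrices as sums of Gaussian outer products."""
    p = scale.shape[0]
    chol = np.linalg.cholesky(scale)
    # x has shape (size, dof, p); every row is a draw from N(0, scale).
    x = rng.standard_normal((size, dof, p)) @ chol.T
    # Sum the outer products over the dof axis: result has shape (size, p, p).
    return np.einsum("sij,sik->sjk", x, x)


def schur_complement(w, k):
    """Return W11 - W12 W22^{-1} W12' with W11 the leading k x k block."""
    w11 = w[:, :k, :k]
    w12 = w[:, :k, k:]
    w22 = w[:, k:, k:]
    return w11 - w12 @ np.linalg.solve(w22, np.swapaxes(w12, 1, 2))


# Illustrative (assumed) settings: a 3 x 3 positive definite scale matrix.
rng = np.random.default_rng(0)
scale = np.array([[2.0, 0.8, 0.3],
                  [0.8, 1.5, 0.4],
                  [0.3, 0.4, 1.0]])
w = sample_wishart(rng, dof=6, scale=scale, size=20000)

# Partition with W11 of size 1 x 1, so the Schur complement is a scalar.
s = schur_complement(w, k=1)[:, 0, 0]

# Entries of W12 (first two columns) and of W22 (last three columns).
others = np.column_stack([w[:, 0, 1], w[:, 0, 2],
                          w[:, 1, 1], w[:, 1, 2], w[:, 2, 2]])

corr_schur = np.array([np.corrcoef(s, others[:, j])[0, 1]
                       for j in range(others.shape[1])])
corr_raw = np.array([np.corrcoef(w[:, 0, 0], others[:, j])[0, 1]
                     for j in range(others.shape[1])])

max_abs_corr_schur = np.abs(corr_schur).max()
print("Schur complement vs. {W12, W22}:", np.round(corr_schur, 3))
print("Raw W11 vs. {W12, W22}:         ", np.round(corr_raw, 3))
```

For comparison, the raw block $W_{11}$ is clearly correlated with $W_{12}$ under this scale matrix, while the correlations of the Schur complement should be within sampling noise of zero.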

Primary Subjects: 62E10, 60E05
Secondary Subjects: 62A15, 62C10, 39B99
Full-text: Open access

Permanent link to this document: http://projecteuclid.org/euclid.aos/1035844981
Digital Object Identifier: doi:10.1214/aos/1035844981
Mathematical Reviews number (MathSciNet): MR1936324
Zentralblatt MATH identifier: 1016.62064

Author address: Redmond, Washington 98052-6399. E-mail: heckerma@microsoft.com