The Annals of Statistics

A conjugate prior for discrete hierarchical log-linear models

Hélène Massam, Jinnan Liu, and Adrian Dobra

Source: Ann. Statist. Volume 37, Number 6A (2009), 3431-3467.

Abstract

In Bayesian analysis of multi-way contingency tables, the selection of a prior distribution for either the log-linear parameters or the cell probability parameters is a major challenge. In this paper, we define a flexible family of conjugate priors for the wide class of discrete hierarchical log-linear models, which includes the class of graphical models. These priors are defined as the Diaconis–Ylvisaker conjugate priors on the log-linear parameters subject to “baseline constraints” under multinomial sampling. We also derive the induced prior on the cell probabilities and show that the induced prior is a generalization of the hyper Dirichlet prior. We show that this prior has several desirable properties and illustrate its usefulness by identifying the most probable decomposable, graphical and hierarchical log-linear models for a six-way contingency table.

Primary Subjects: 62F15, 62H17, 62E15
Keywords: Hierarchical log-linear models; conjugate prior; contingency tables; hyper Markov property; hyper Dirichlet; model selection

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1250515392
Digital Object Identifier: doi:10.1214/08-AOS669
Zentralblatt MATH identifier: 05644285

2010 © Institute of Mathematical Statistics