Bayesian Analysis

Optimal Gaussian Approximations to the Posterior for Log-Linear Models with Diaconis–Ylvisaker Priors

James Johndrow and Anirban Bhattacharya

Full-text: Open access


In contingency table analysis, sparse data are frequently encountered even for modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis–Ylvisaker priors for the parameters of log-linear models do not give rise to closed-form credible regions, complicating posterior inference. Here we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis–Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback–Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even for modest sample sizes. We also propose a method for model selection using the approximation. The proposed approximation provides a computationally scalable approach to regularized estimation and approximate Bayesian inference for log-linear models.
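As a rough illustration of what a Kullback–Leibler-optimal Gaussian approximation looks like, the sketch below treats only the simplest case and is not taken from the paper. It assumes a saturated model in which the posterior on the cell probabilities is Dirichlet with parameters `alpha` (prior pseudo-counts plus observed counts), the log-linear parameters are the baseline log-odds theta_j = log(pi_j / pi_d), and the divergence is taken from the exact posterior to the Gaussian, so that the minimizer matches the posterior mean and covariance. The function names and the pure-Python digamma and trigamma routines are hypothetical helpers introduced for this example.

```python
import math


def digamma(x):
    """Digamma function for x > 0: upward recurrence, then an asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + math.log(x) - 0.5 / x - series


def trigamma(x):
    """Trigamma function for x > 0: upward recurrence, then an asymptotic series."""
    acc = 0.0
    while x < 10.0:
        acc += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv * inv2 * (1.0 / 6 - inv2 * (1.0 / 30 - inv2 / 42))
    return acc + inv + 0.5 * inv2 + series


def moment_matched_gaussian(alpha):
    """Mean and covariance of theta_j = log(pi_j / pi_d), j < d, for pi ~ Dirichlet(alpha).

    Assumption: the last cell is the baseline category. A Gaussian with these
    moments minimizes KL(posterior || Gaussian) over all Gaussians.
    """
    *rest, last = alpha
    # E[theta_j] = digamma(alpha_j) - digamma(alpha_d)
    mean = [digamma(a) - digamma(last) for a in rest]
    # Cov[theta_j, theta_k] = trigamma(alpha_j) * 1{j == k} + trigamma(alpha_d)
    shared = trigamma(last)
    cov = [
        [shared + (trigamma(a) if j == k else 0.0) for k in range(len(rest))]
        for j, a in enumerate(rest)
    ]
    return mean, cov


if __name__ == "__main__":
    # Hypothetical 3-cell table: Dirichlet posterior parameters (pseudo-counts + counts).
    mean, cov = moment_matched_gaussian([3.0, 2.0, 1.0])
    print("mean:", mean)
    print("cov:", cov)
```

Approximate credible regions for the log-odds can then be read off as Gaussian ellipsoids from `mean` and `cov`. Non-saturated models would need the paper's general result rather than this shortcut.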

Article information

Bayesian Anal., Volume 13, Number 1 (2018), 201-223.

First available in Project Euclid: 21 February 2017

Keywords: credible region, conjugate prior, contingency table, Dirichlet–Multinomial, Kullback–Leibler divergence, Laplace approximation

Creative Commons Attribution 4.0 International License.


Johndrow, James; Bhattacharya, Anirban. Optimal Gaussian Approximations to the Posterior for Log-Linear Models with Diaconis–Ylvisaker Priors. Bayesian Anal. 13 (2018), no. 1, 201–223. doi:10.1214/16-BA1046.

