In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis–Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. Here we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis–Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback–Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even for modest sample sizes. We also propose a method for model selection using the approximation. The proposed approximation provides a computationally scalable approach to regularized estimation and approximate Bayesian inference for log-linear models.
"Optimal Gaussian Approximations to the Posterior for Log-Linear Models with Diaconis–Ylvisaker Priors." Bayesian Anal. 13 (1) 201 - 223, March 2018. https://doi.org/10.1214/16-BA1046