Annals of Statistics

Maximum likelihood estimation in log-linear models

Stephen E. Fienberg and Alessandro Rinaldo

Full-text: Open access


We study maximum likelihood estimation in log-linear models under conditional Poisson sampling schemes. We derive necessary and sufficient conditions for the existence of the maximum likelihood estimator (MLE) of the model parameters and investigate estimability of the natural and mean-value parameters when the MLE does not exist. Our conditions focus on the role of sampling zeros in the observed table. We situate our results within the framework of extended exponential families, and we exploit the geometric properties of log-linear models. We propose algorithms for extended maximum likelihood estimation that improve and correct the existing algorithms for log-linear model analysis.
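To make the role of sampling zeros concrete, the following sketch checks MLE existence in one small special case: a 2x2x2 table under the model of no three-factor interaction. It relies on the classical criterion (Haberman, 1974) that the MLE exists exactly when some strictly positive real-valued table has the same sufficient statistics as the observed one. For this model the sufficient statistics are the two-way margins, and the tables sharing them form a line, so the check reduces to a comparison of two minima. The function name and the example tables are hypothetical illustrations, not the algorithms proposed in the paper; a general model would need a linear program over the marginal cone instead.

```python
from itertools import product


def mle_exists_no3way_2x2x2(table):
    """Check MLE existence for the no-three-factor-interaction model.

    `table` maps each cell (i, j, k) in {0, 1}^3 to a nonnegative count.
    Assumption: Poisson or multinomial sampling, 2x2x2 table only.
    """
    cells = list(product((0, 1), repeat=3))
    # Every real table with the same two-way margins has the form
    # table + t * d, where d[i, j, k] = (-1) ** (i + j + k).
    plus = [table[c] for c in cells if sum(c) % 2 == 0]   # cells with d = +1
    minus = [table[c] for c in cells if sum(c) % 2 == 1]  # cells with d = -1
    # Strict positivity needs -min(plus) < t < min(minus); such a t exists
    # exactly when the two minima sum to a positive number.
    return min(plus) + min(minus) > 0


def make_table(zero_cells, fill=5):
    """Hypothetical helper: a table with `fill` everywhere except the zeros."""
    return {c: (0 if c in zero_cells else fill)
            for c in product((0, 1), repeat=3)}


# Zeros in opposite corners: all two-way margins are positive, yet no
# strictly positive table shares them, so the MLE does not exist.
table_a = make_table({(0, 0, 0), (1, 1, 1)})

# Zeros in two cells with the same sign of d: the MLE exists.
table_b = make_table({(0, 0, 0), (1, 1, 0)})

print(mle_exists_no3way_2x2x2(table_a))
print(mle_exists_no3way_2x2x2(table_b))
```

The first table shows why inspecting margins alone is not enough: every margin is positive, but the pattern of zeros still places the observed sufficient statistics on the boundary of the marginal cone.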

Article information

Ann. Statist., Volume 40, Number 2 (2012), 996-1023.

First available in Project Euclid: 18 July 2012

Digital Object Identifier: 10.1214/12-AOS986

Primary: 62H17: Contingency tables
Secondary: 62F99: None of the above, but in this section

Keywords: Extended exponential families; extended maximum likelihood estimators; Newton–Raphson algorithm; log-linear models; sampling zeros


Fienberg, Stephen E.; Rinaldo, Alessandro. Maximum likelihood estimation in log-linear models. Ann. Statist. 40 (2012), no. 2, 996–1023. doi:10.1214/12-AOS986.

