Electronic Journal of Statistics

Multinomial and empirical likelihood under convex constraints: Directions of recession, Fenchel duality, the PP algorithm

Marian Grendár and Vladimír Špitalský

Full-text: Open access


The primal problem of multinomial likelihood maximization restricted to a convex closed subset of the probability simplex is studied. A solution of this problem may assign a positive mass to an outcome with zero count. Such a solution cannot be obtained by the widely used, simplified Lagrange and Fenchel duals. Related flaws in the simplified dual problems, which arise because the recession directions are ignored, are identified and the correct Lagrange and Fenchel duals are developed.
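The claim that a maximizer may put positive mass on an outcome with zero count can be checked on a small toy problem. The sketch below is illustrative only and is not taken from the paper: the support (0, 1, 2), the counts (1, 1, 0) and the constraint "mean equals 1.5" are all assumptions chosen so that the feasible set is a line segment, which is searched directly. The stationarity condition on that segment gives q = (1/8, 1/4, 5/8), so the unobserved outcome 2 receives most of the mass.

```python
import math

# Toy data (hypothetical): outcomes 0, 1, 2 with counts 1, 1, 0.
SUPPORT = (0.0, 1.0, 2.0)
FREQS = (0.5, 0.5, 0.0)  # relative frequencies; outcome 2 was never observed


def feasible_point(t):
    """Distribution q on SUPPORT with sum(q) = 1 and mean 1.5, indexed by q[0] = t.

    Nonnegativity of q requires t in [0, 0.25].
    """
    return (t, 0.5 - 2.0 * t, 0.5 + t)


def log_likelihood(q, freqs):
    """Multinomial log-likelihood sum_i freqs[i] * log(q[i]), with 0 * log(0) = 0."""
    total = 0.0
    for nu, qi in zip(freqs, q):
        if nu > 0.0:
            if qi <= 0.0:
                return -math.inf
            total += nu * math.log(qi)
    return total


def maximize_likelihood(freqs, lo=0.0, hi=0.25, iters=200):
    """Ternary search for the maximizer of the concave log-likelihood over t."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(feasible_point(m1), freqs) < log_likelihood(feasible_point(m2), freqs):
            lo = m1
        else:
            hi = m2
    return feasible_point(0.5 * (lo + hi))


q_hat = maximize_likelihood(FREQS)
print([round(qi, 4) for qi in q_hat])
```

In this toy setting no distribution supported only on the observed outcomes 0 and 1 can have mean 1.5, so any method that restricts the support to the observed outcomes has nothing to return.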

The results permit us to specify linear sets and data such that the empirical likelihood-maximizing distribution exists and is the same as the multinomial likelihood-maximizing distribution. The multinomial likelihood ratio reaches, in general, a different conclusion than the empirical likelihood ratio.

Implications are discussed for minimum discrimination information, Lindsay geometry, compositional data analysis, bootstrap with auxiliary information, and the Lagrange multiplier test, all of which explicitly or implicitly ignore information about the support.

A solution of the primal problem can be obtained by the PP (perturbed primal) algorithm, that is, as the limit of a sequence of solutions of perturbed primal problems. The PP algorithm may be implemented by the simplified Lagrange or Fenchel dual.
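The idea of taking a limit of perturbed primal solutions can be sketched on a toy problem. Everything in the block below is an assumption made for illustration, not the paper's specification: the support (0, 1, 2), the counts (1, 1, 0), the constraint "mean equals 1.5", the particular perturbation rule in `perturb`, and the direct one-dimensional search used in place of a dual solver. Each perturbed problem has strictly positive frequencies, and the solutions approach q = (1/8, 1/4, 5/8) as the perturbation shrinks.

```python
import math

SUPPORT = (0.0, 1.0, 2.0)  # hypothetical outcomes
FREQS = (0.5, 0.5, 0.0)    # hypothetical relative frequencies with one zero count


def feasible_point(t):
    """Distribution on SUPPORT with total mass 1 and mean 1.5; valid for t in [0, 0.25]."""
    return (t, 0.5 - 2.0 * t, 0.5 + t)


def log_likelihood(q, freqs):
    """sum_i freqs[i] * log(q[i]), with the convention 0 * log(0) = 0."""
    total = 0.0
    for nu, qi in zip(freqs, q):
        if nu > 0.0:
            if qi <= 0.0:
                return -math.inf
            total += nu * math.log(qi)
    return total


def solve_primal(freqs, lo=0.0, hi=0.25, iters=200):
    """Ternary search for the likelihood maximizer on the feasible segment."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(feasible_point(m1), freqs) < log_likelihood(feasible_point(m2), freqs):
            lo = m1
        else:
            hi = m2
    return feasible_point(0.5 * (lo + hi))


def perturb(freqs, delta):
    """Assumed perturbation: spread mass delta over zero cells, rescale the rest."""
    zeros = sum(1 for nu in freqs if nu == 0.0)
    if zeros == 0:
        return tuple(freqs)
    return tuple(delta / zeros if nu == 0.0 else nu * (1.0 - delta) for nu in freqs)


def pp_sequence(freqs, deltas):
    """Solutions of the perturbed primal problems for a decreasing sequence of deltas."""
    return [solve_primal(perturb(freqs, d)) for d in deltas]


deltas = (1e-1, 1e-2, 1e-3, 1e-6)
for d, q in zip(deltas, pp_sequence(FREQS, deltas)):
    print(d, [round(qi, 6) for qi in q])
```

In this sketch the unperturbed problem could be solved directly; the perturbation matters when the solver in use, such as a simplified dual, is only reliable for strictly positive frequencies.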

Article information

Electron. J. Statist. Volume 11, Number 1 (2017), 2547-2612.

Received: July 2016
First available in Project Euclid: 20 June 2017

Primary: 62H12: Estimation; 62H17: Contingency tables
Secondary: 90C46: Optimality conditions, duality [See also 49N15]

Keywords: closed multinomial distribution; estimating equation; contingency table; zero cell frequency; El Barmi Dykstra dual; Smith dual; PP algorithm; epi-convergence; Fisher likelihood; minimum discrimination information

Creative Commons Attribution 4.0 International License.


Grendár, Marian; Špitalský, Vladimír. Multinomial and empirical likelihood under convex constraints: Directions of recession, Fenchel duality, the PP algorithm. Electron. J. Statist. 11 (2017), no. 1, 2547–2612. doi:10.1214/17-EJS1294. https://projecteuclid.org/euclid.ejs/1497924056
