Electronic Journal of Statistics

On the role of the overall effect in exponential families

Anna Klimova and Tamás Rudas

Full-text: Open access


This paper compares exponential families of discrete probability distributions with and without the normalizing constant (or overall effect). The setup without the overall effect, in which the exponential family is curved, is particularly relevant when the sample space is an incomplete Cartesian product or when it is so large that the computational burden is significant. The presence or absence of the overall effect has a fundamental impact on the properties of the exponential family. When the overall effect is added, the family becomes the smallest regular exponential family containing the curved one. The procedure is related to the homogenization of an inhomogeneous variety discussed in algebraic geometry, and a statistical interpretation of it is given as an augmentation of the sample space. The changes in the kernel basis representation when the overall effect is included or removed are derived. The geometry of maximum likelihood estimates, also allowing zero observed frequencies, is described with and without the overall effect, and various algorithms are compared. The importance of the results is illustrated by an example from cell biology, showing that routinely including the overall effect leads to estimates that are not in the model intended by the researchers.
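The contrast described in the abstract can be seen in a small numerical sketch. Everything in it is an assumption chosen for illustration and not taken from the abstract: a sample space with three cells (1,0), (0,1), (1,1) (an incomplete Cartesian product, with cell (0,0) absent), the toy model p11 = p10 · p01, hypothetical counts, and the helper names. Without the overall effect, normalization leaves one free parameter and the likelihood equation reduces to a quadratic. With the overall effect, p = c · (a, b, ab) has three free parameters on three cells, so in this toy case the enlarged family is saturated and the estimate is simply the observed proportions.

```python
import math

# Hypothetical counts for the cells (1,0), (0,1), (1,1); cell (0,0) is absent.
counts = {"10": 20, "01": 30, "11": 50}


def fit_without_overall_effect(counts):
    """MLE in the toy model p10 = a, p01 = b, p11 = a*b (no overall effect)."""
    k = counts["10"] + counts["11"]  # exponent of a in the likelihood
    m = counts["01"] + counts["11"]  # exponent of b in the likelihood
    # Normalization a + b + a*b = 1 gives b = (1 - a) / (1 + a); setting the
    # derivative of k*log(a) + m*log(b) to zero yields k*a^2 + 2*m*a - k = 0.
    a = (-m + math.sqrt(m * m + k * k)) / k
    b = (1.0 - a) / (1.0 + a)
    return {"10": a, "01": b, "11": a * b}


def fit_with_overall_effect(counts):
    """MLE in p = c * (a, b, a*b): saturated on three cells, so p = n / N."""
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def model_gap(p):
    """Zero exactly when p satisfies the intended constraint p11 = p10 * p01."""
    return p["11"] - p["10"] * p["01"]


curved = fit_without_overall_effect(counts)
regular = fit_with_overall_effect(counts)
print("without overall effect:", curved, "gap:", model_gap(curved))
print("with overall effect:   ", regular, "gap:", model_gap(regular))
```

In this sketch the fit without the overall effect satisfies p11 = p10 · p01 by construction, while the fit with the overall effect generally does not, which is the kind of discrepancy the abstract attributes to routinely including the overall effect.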

Article information

Electron. J. Statist., Volume 12, Number 2 (2018), 2430-2453.

Received: August 2017
First available in Project Euclid: 25 July 2018

Permanent link to this document: https://projecteuclid.org/euclid.ejs/1532484335

Digital Object Identifier: doi:10.1214/18-EJS1453

Keywords: algebraic variety; contingency table; independence; log-linear model; maximum likelihood estimation; overall effect; relational model

Creative Commons Attribution 4.0 International License.


Klimova, Anna; Rudas, Tamás. On the role of the overall effect in exponential families. Electron. J. Statist. 12 (2018), no. 2, 2430–2453. doi:10.1214/18-EJS1453. https://projecteuclid.org/euclid.ejs/1532484335


