The Annals of Statistics

Multinomial-Poisson homogeneous models for contingency tables

Joseph B. Lang

Abstract

A unified approach to maximum likelihood inference for a broad, new class of contingency table models is presented. The model class comprises multinomial-Poisson homogeneous (MPH) models, which can be characterized by an independent sampling plan and a system of homogeneous constraints, h(m) = 0, where m is the vector of expected table counts. Maximum likelihood (ML) fitting and large-sample inference for MPH models are described. The MPH models are partitioned into well-defined equivalence classes, and explicit comparisons of the large-sample behaviors of ML estimators of equivalent models are given. The equivalence theory not only unifies a large collection of previously known results but also leads to useful generalizations and many new results. The practical, computational implication is that ML fit results for any particular MPH model can be obtained directly from the ML fit results for any conveniently chosen equivalent model. Issues of hypothesis testability and parameter estimability are also addressed. To illustrate, an example based on statistics journal citation patterns is given for which the data can be used to test the hypothesis that a certain model holds but cannot be used to estimate any of that model's parameters.
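
As a rough illustration of what a homogeneous constraint h(m) = 0 can look like, the sketch below uses the log odds ratio of a 2x2 table, which equals zero exactly when rows and columns are independent and is unchanged when all expected counts are multiplied by a common positive constant. The function names, the example tables, and the simple scalar-scaling check are assumptions made for illustration only; the paper's own definition of homogeneity is stated relative to the sampling plan and is more general than what is checked here.

```python
import math

def log_odds_ratio(m):
    """Illustrative constraint h(m) for a 2x2 table, m = (m11, m12, m21, m22).

    h(m) = 0 exactly when the table of expected counts shows independence.
    """
    m11, m12, m21, m22 = m
    return math.log(m11 * m22 / (m12 * m21))

def is_zero_order_homogeneous(h, m, scales=(0.5, 2.0, 10.0), tol=1e-9):
    """Numerically check h(c * m) == h(m) for a few positive constants c.

    This is a spot check at one point m, not a proof of homogeneity.
    """
    base = h(m)
    return all(abs(h([c * x for x in m]) - base) <= tol for c in scales)

# Hypothetical expected counts (not taken from the paper).
independent = [20.0, 30.0, 40.0, 60.0]   # 20 * 60 == 30 * 40
dependent = [20.0, 30.0, 40.0, 90.0]     # odds ratio is 1.5

satisfies_constraint = abs(log_odds_ratio(independent)) < 1e-12
scale_invariant = is_zero_order_homogeneous(log_odds_ratio, dependent)

# A constraint that fixes the table total is not scale invariant.
total_is_100 = lambda m: sum(m) - 100.0
total_scale_invariant = is_zero_order_homogeneous(total_is_100, dependent)

print(satisfies_constraint, scale_invariant, total_scale_invariant)
```

Scale invariance of this kind is one reason a constraint such as independence can be stated without reference to the overall sample size, whereas a constraint on the table total depends on it.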

Article information

Source
Ann. Statist., Volume 32, Number 1 (2004), 340-383.

Dates
First available in Project Euclid: 12 March 2004

Permanent link to this document
https://projecteuclid.org/euclid.aos/1079120140

Digital Object Identifier
doi:10.1214/aos/1079120140

Mathematical Reviews number (MathSciNet)
MR2051011

Zentralblatt MATH identifier
1105.62352

Subjects
Primary: 62H17: Contingency tables
Secondary: 62E20: Asymptotic distribution theory; 62H12: Estimation; 62H15: Hypothesis testing

Keywords
Approximate normality; categorical data; equivalent models; estimability; homogeneous constraint; homogeneous statistic; large-sample inference; restricted maximum likelihood; sampling plan; testability

Citation

Lang, Joseph B. Multinomial-Poisson homogeneous models for contingency tables. Ann. Statist. 32 (2004), no. 1, 340--383. doi:10.1214/aos/1079120140. https://projecteuclid.org/euclid.aos/1079120140

