The Annals of Statistics

Marginal models for categorical data

Wicher P. Bergsma and Tamás Rudas

Abstract

Statistical models defined by imposing restrictions on marginal distributions of contingency tables have received considerable attention recently. This paper introduces a general definition of marginal log-linear parameters and describes conditions for a marginal log-linear parameter to be a smooth parameterization of the distribution and to be variation independent. Statistical models defined by imposing affine restrictions on the marginal log-linear parameters are investigated. These models generalize ordinary log-linear and multivariate logistic models. Sufficient conditions for a log-affine marginal model to be nonempty and to be a curved exponential family are given. Standard large-sample theory is shown to apply to maximum likelihood estimation of log-affine marginal models for a variety of sampling procedures.
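
As an illustration of the kind of parameter the abstract refers to, the following minimal sketch computes the multivariate logistic parameters of a 2x2 table: the two marginal logits, each taken from a one-way marginal, and the log odds ratio, taken from the joint table. The function name `multivariate_logistic_2x2`, the baseline (logit) coding, and the example probabilities are assumptions made for this sketch and are not taken from the paper. Setting the log odds ratio to zero is one example of an affine restriction; here it corresponds to independence of the two variables.

```python
import math


def multivariate_logistic_2x2(p):
    """Return (logit_a, logit_b, log_odds_ratio) for a 2x2 probability table.

    p[i][j] is P(A = i, B = j); all cells must be strictly positive and
    sum to one. Baseline (logit) coding is an illustrative choice.
    """
    cells = [c for row in p for c in row]
    if any(c <= 0 for c in cells) or abs(sum(cells) - 1.0) > 1e-9:
        raise ValueError("expected strictly positive cells summing to 1")

    # One-way marginal distributions of A (rows) and B (columns).
    pa = [p[0][0] + p[0][1], p[1][0] + p[1][1]]
    pb = [p[0][0] + p[1][0], p[0][1] + p[1][1]]

    # Main effects are computed within the one-way marginals.
    logit_a = math.log(pa[1] / pa[0])
    logit_b = math.log(pb[1] / pb[0])

    # The interaction is computed within the joint table.
    log_odds_ratio = math.log((p[0][0] * p[1][1]) / (p[0][1] * p[1][0]))
    return logit_a, logit_b, log_odds_ratio


# Hypothetical table with association between A and B.
table = [[0.4, 0.1], [0.2, 0.3]]
logit_a, logit_b, log_or = multivariate_logistic_2x2(table)

# Hypothetical independent table: outer product of marginals (0.3, 0.7) and (0.6, 0.4).
indep = [[0.3 * 0.6, 0.3 * 0.4], [0.7 * 0.6, 0.7 * 0.4]]
_, _, log_or_indep = multivariate_logistic_2x2(indep)
```

In this sketch the three parameters determine the table: any pair of marginal logits can be combined with any log odds ratio, which is the sense of variation independence in this simple case.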

Article information

Source
Ann. Statist., Volume 30, Number 1 (2002), 140-159.

Dates
First available in Project Euclid: 5 March 2002

Permanent link to this document
https://projecteuclid.org/euclid.aos/1015362188

Digital Object Identifier
doi:10.1214/aos/1015362188

Mathematical Reviews number (MathSciNet)
MR1892659

Zentralblatt MATH identifier
1012.62063

Subjects
Primary: 62H17: Contingency tables
Secondary: 62E99: None of the above, but in this section

Keywords
Marginal log-linear parameters; log-affine and log-linear marginal models; smooth parameterization; variation independence; existence and connectedness of a model; curved exponential family; asymptotic normality of maximum likelihood estimates

Citation

Bergsma, Wicher P.; Rudas, Tamás. Marginal models for categorical data. Ann. Statist. 30 (2002), no. 1, 140-159. doi:10.1214/aos/1015362188. https://projecteuclid.org/euclid.aos/1015362188

