A flexible regression model for count data

Kimberly F. Sellers and Galit Shmueli

Poisson regression is a popular tool for modeling count data and is used in a wide range of fields, from the social to the physical sciences and beyond. Real data, however, are often over- or under-dispersed and thus poorly suited to Poisson regression. We propose a regression model based on the Conway–Maxwell-Poisson (COM-Poisson) distribution to address this problem. The COM-Poisson regression generalizes the well-known Poisson and logistic regression models, and is suitable for fitting count data with a wide range of dispersion levels. Using a GLM approach that takes advantage of exponential family properties, we discuss model estimation, inference, diagnostics, and interpretation, and present a test for determining whether a COM-Poisson regression is needed in place of a standard Poisson regression. We compare the COM-Poisson to several alternatives and illustrate its advantages and usefulness using three data sets with varying dispersion.
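
As a rough illustration of the dispersion range mentioned in the abstract, the sketch below evaluates the standard COM-Poisson probability function, P(Y = y) = lam^y / ((y!)^nu * Z(lam, nu)) with Z(lam, nu) the sum over j >= 0 of lam^j / (j!)^nu, and reports the resulting mean and variance. It is not the authors' code: the function names, the parameter names lam and nu, and the truncation of the infinite sum at max_terms are assumptions made for this example only.

```python
import math


def com_poisson_probs(lam, nu, max_terms=200):
    """Approximate COM-Poisson probabilities P(Y = 0), ..., P(Y = max_terms - 1).

    The normalizing constant is an infinite series; truncating it at
    max_terms is an assumption of this sketch, adequate for moderate lam.
    """
    # Unnormalized log-weights: j * log(lam) - nu * log(j!)
    log_w = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(max_terms)]
    # Subtract the maximum before exponentiating to avoid overflow
    m = max(log_w)
    w = [math.exp(t - m) for t in log_w]
    total = sum(w)
    return [x / total for x in w]


def mean_and_variance(lam, nu, max_terms=200):
    """Mean and variance of the truncated COM-Poisson distribution."""
    p = com_poisson_probs(lam, nu, max_terms)
    mean = sum(j * pj for j, pj in enumerate(p))
    var = sum((j - mean) ** 2 * pj for j, pj in enumerate(p))
    return mean, var


if __name__ == "__main__":
    for nu in (0.5, 1.0, 2.0):
        mean, var = mean_and_variance(3.0, nu)
        print(f"nu={nu}: mean={mean:.3f}, variance={var:.3f}")
```

With nu = 1 the distribution reduces to the Poisson, so the mean and variance coincide; nu < 1 yields over-dispersion and nu > 1 under-dispersion. As nu grows large the mass concentrates on 0 and 1 with P(Y = 1) approaching lam / (1 + lam), which is the Bernoulli limit underlying the link to logistic regression.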

Ann. Appl. Stat. Volume 4, Number 2 (2010), 943-961.

Sellers, Kimberly F.; Shmueli, Galit. A flexible regression model for count data. Ann. Appl. Stat. 4 (2010), no. 2, 943-961. doi:10.1214/09-AOAS306.

Supplemental materials

  • Supplementary materials: These include details of the iteratively reweighted least squares estimation, the Fisher information matrix components associated with the COM-Poisson coefficients, the full airfreight data set, diagnostics under various regression models for the airfreight and crash data, and additional logistic regression examples.