Source: Ann. Appl. Stat. Volume 4, Number 2 (2010), 943–961.
Poisson regression is a popular tool for modeling count data and
is used in a vast array of applications, from the social to
the physical sciences and beyond. Real data, however, are often
over- or under-dispersed and, thus, not conducive to Poisson
regression. We propose a regression model based on the
Conway–Maxwell-Poisson (COM-Poisson) distribution to address
this problem. The COM-Poisson regression generalizes the
well-known Poisson and logistic regression models, and is
suitable for fitting count data with a wide range of dispersion
levels. With a GLM approach that takes advantage of exponential
family properties, we discuss model estimation, inference,
diagnostics, and interpretation, and present a test for
determining the need for a COM-Poisson regression over a
standard Poisson regression. We compare the COM-Poisson to
several alternatives and illustrate its advantages and
usefulness using three data sets with varying dispersion.
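
To make the dispersion claim concrete, here is a minimal sketch of the COM-Poisson probability mass function, P(Y = y) = λ^y / ((y!)^ν Z(λ, ν)) with Z(λ, ν) = Σ_j λ^j / (j!)^ν. It is not the authors' code: the function names and the truncation of Z at 200 terms are assumptions made for illustration. Under those assumptions, ν = 1 should reproduce the Poisson distribution, ν < 1 should give over-dispersion, ν > 1 under-dispersion, and a very large ν should approach a Bernoulli distribution with success probability λ / (1 + λ), which is the link to logistic regression.

```python
import math

def com_poisson_probs(lam, nu, terms=200):
    """Return [P(Y=0), ..., P(Y=terms-1)] for a COM-Poisson(lam, nu) variable.

    The normalizing constant Z is an infinite sum; truncating it at `terms`
    is an illustrative assumption that is adequate for moderate lam.
    """
    # Work on the log scale: log weight_j = j*log(lam) - nu*log(j!)
    log_w = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(terms)]
    m = max(log_w)  # subtract the max before exponentiating for stability
    z = sum(math.exp(v - m) for v in log_w)
    return [math.exp(v - m) / z for v in log_w]

def mean_and_variance(probs):
    """Mean and variance of a distribution on 0, 1, 2, ... given its probabilities."""
    mean = sum(j * p for j, p in enumerate(probs))
    var = sum((j - mean) ** 2 * p for j, p in enumerate(probs))
    return mean, var

lam = 3.0
for nu in (0.5, 1.0, 2.0):
    mean, var = mean_and_variance(com_poisson_probs(lam, nu))
    print(f"nu={nu}: mean={mean:.3f}, variance={var:.3f}")

# With a very large nu, almost all mass sits on 0 and 1 (Bernoulli limit).
bernoulli_like = com_poisson_probs(lam, 50.0)
print(f"nu=50: P(Y=1)={bernoulli_like[1]:.6f}, lam/(1+lam)={lam / (1 + lam):.6f}")
```

The log-scale computation is a design choice to avoid overflow in λ^j and (j!)^ν; a careful implementation would also choose the truncation point adaptively rather than fixing it.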