The Annals of Applied Statistics

A flexible regression model for count data

Kimberly F. Sellers and Galit Shmueli
Source: Ann. Appl. Stat. Volume 4, Number 2 (2010), 943-961.

Abstract

Poisson regression is a popular tool for modeling count data and is applied in a vast array of applications from the social to the physical sciences and beyond. Real data, however, are often over- or under-dispersed and, thus, not conducive to Poisson regression. We propose a regression model based on the Conway–Maxwell-Poisson (COM-Poisson) distribution to address this problem. The COM-Poisson regression generalizes the well-known Poisson and logistic regression models, and is suitable for fitting count data with a wide range of dispersion levels. With a GLM approach that takes advantage of exponential family properties, we discuss model estimation, inference, diagnostics, and interpretation, and present a test for determining the need for a COM-Poisson regression over a standard Poisson regression. We compare the COM-Poisson to several alternatives and illustrate its advantages and usefulness using three data sets with varying dispersion.
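As an illustration of the dispersion behaviour described in the abstract, the following is a minimal sketch of the COM-Poisson probability mass function, P(Y = y) proportional to λ^y / (y!)^ν. The function names, the example value λ = 3, and the truncation of the normalizing sum at 200 terms are assumptions made for this sketch and are not taken from the paper. With ν = 1 the distribution reduces to the Poisson; ν > 1 gives under-dispersion, ν < 1 gives over-dispersion, and very large ν approaches a Bernoulli distribution, which is the link to logistic regression.

```python
import math

def com_poisson_probs(lam, nu, max_terms=200):
    """Return P(Y = 0), ..., P(Y = max_terms - 1) for a COM-Poisson(lam, nu).

    The infinite normalizing sum is truncated at max_terms (an assumption
    of this sketch; adequate when the mass beyond that point is negligible).
    """
    # Unnormalized log-weights: j*log(lam) - nu*log(j!)
    log_w = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(max_terms)]
    m = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(w - m) for w in log_w]
    total = sum(weights)
    return [w / total for w in weights]

def mean_and_variance(probs):
    """Mean and variance of a discrete distribution on 0, 1, 2, ..."""
    mean = sum(j * p for j, p in enumerate(probs))
    var = sum((j - mean) ** 2 * p for j, p in enumerate(probs))
    return mean, var

if __name__ == "__main__":
    for nu in (0.5, 1.0, 2.0):
        mean, var = mean_and_variance(com_poisson_probs(3.0, nu))
        print(f"nu={nu}: mean={mean:.3f}, variance={var:.3f}")
```

Comparing the printed mean and variance for each ν shows which regime (over-, equi-, or under-dispersed) the chosen parameters fall into.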

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aoas/1280842147
Digital Object Identifier: doi:10.1214/09-AOAS306
Zentralblatt MATH identifier: 05782548
Mathematical Reviews number (MathSciNet): MR2758428


2013 © Institute of Mathematical Statistics
