Most of the available methods for longitudinal data analysis are designed and validated for the situation where the number of subjects is large and the number of observations per subject is relatively small. Motivated by the Naturalistic Teenage Driving Study (NTDS), which represents the exact opposite situation, we examine standard and propose new methodology for marginal analysis of longitudinal count data in a small number of very long sequences. We consider standard methods based on generalized estimating equations, under working independence or an appropriate correlation structure, and find them unsatisfactory for dealing with time-dependent covariates when the counts are low. For this situation, we explore a within-cluster resampling (WCR) approach that involves repeated analyses of random subsamples with a final analysis that synthesizes results across subsamples. This leads to a novel WCR method which operates on separated blocks within subjects and which performs better than all of the previously considered methods. The methods are applied to the NTDS data and evaluated in simulation experiments mimicking the NTDS.
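The resampling idea described above (repeated analyses of random subsamples, followed by a synthesis across subsamples) can be illustrated with a minimal sketch. This is not the authors' separated-block method. It is a generic within-cluster resampling scheme under assumptions made only for illustration: each subsample keeps every `gap`-th observation of each subject from a random starting point, the thinned observations are treated as approximately independent, and each subsample is fit with a Poisson log-linear model. The names `fit_poisson`, `wcr_poisson`, `gap`, and `n_resamples` are hypothetical. The combination step uses the usual WCR form: the average of the per-subsample estimates, with variance equal to the average within-subsample covariance minus the between-subsample covariance of the estimates.

```python
import numpy as np


def fit_poisson(X, y, n_iter=25):
    """Poisson log-linear regression fitted by Newton-Raphson.

    Assumes column 0 of X is an intercept. Returns the coefficient
    estimates and the model-based covariance (inverse Fisher information).
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)  # start at the overall log mean count
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        info = X.T @ (X * mu[:, None])            # Fisher information
        beta = beta + np.linalg.solve(info, X.T @ (y - mu))
    mu = np.exp(X @ beta)
    cov = np.linalg.inv(X.T @ (X * mu[:, None]))
    return beta, cov


def wcr_poisson(X_by_subject, y_by_subject, gap=10, n_resamples=200, seed=0):
    """Illustrative within-cluster resampling for long count sequences.

    Assumption (not from the source): observations at least `gap` time
    points apart within a subject are treated as roughly independent.
    """
    rng = np.random.default_rng(seed)
    betas, covs = [], []
    for _ in range(n_resamples):
        Xs, ys = [], []
        for X, y in zip(X_by_subject, y_by_subject):
            start = rng.integers(gap)             # random offset in [0, gap)
            idx = np.arange(start, len(y), gap)   # thinned time points
            Xs.append(X[idx])
            ys.append(y[idx])
        beta, cov = fit_poisson(np.vstack(Xs), np.concatenate(ys))
        betas.append(beta)
        covs.append(cov)
    betas = np.array(betas)
    # Synthesis: mean estimate; variance = mean within-subsample covariance
    # minus the sample covariance of the estimates across subsamples.
    beta_wcr = betas.mean(axis=0)
    cov_wcr = np.mean(covs, axis=0) - np.atleast_2d(np.cov(betas, rowvar=False))
    return beta_wcr, cov_wcr
```

Here `X_by_subject` is a list of per-subject design matrices (one row per time point, intercept in the first column) and `y_by_subject` is the matching list of count vectors. Because the combined variance is a difference of two estimates, it may not be positive definite in finite samples, so it should be checked before it is used for inference.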
"Marginal analysis of longitudinal count data in long sequences: Methods and applications to a driving study." Ann. Appl. Stat. 6 (1) 27 - 54, March 2012. https://doi.org/10.1214/11-AOAS507