The Annals of Applied Statistics

Covariance pattern mixture models for the analysis of multivariate heterogeneous longitudinal data

Laura Anderlucci and Cinzia Viroli

Full-text: Open access

Abstract

We propose a novel approach for modeling multivariate longitudinal data in the presence of unobserved heterogeneity for the analysis of the Health and Retirement Study (HRS) data. Our proposal can be cast within the framework of linear mixed models with discrete individual random intercepts; however, unlike the standard formulation, the proposed Covariance Pattern Mixture Model (CPMM) does not require the usual local independence assumption. The model can therefore capture the heterogeneity, the association among the responses and the temporal dependence structure simultaneously.

We investigate temporal patterns of cognitive functioning in retired American respondents. In particular, we aim to understand whether cognitive functioning is affected by individual socioeconomic characteristics and whether homogeneous groups of respondents sharing a similar cognitive profile can be identified. An accurate description of the detected groups allows government policy interventions to be targeted appropriately.

Results identify three homogeneous clusters of individuals with specific cognitive functioning, consistent with the class conditional distribution of the covariates. The flexibility of CPMM allows each regressor to contribute differently to the responses according to group membership. In this way, the identified groups receive a global and accurate phenomenological characterization.
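
The model class described in the abstract can be pictured with a rough sketch. The notation below is assumed for illustration and is not taken from the abstract: $Y_i$ is the $p \times T$ matrix of $p$ responses observed at $T$ occasions for individual $i$, $X_i$ is a $q \times T$ covariate matrix, and there are $K$ latent groups with weights $\pi_k$. A mixture with group-specific regression coefficients and a separable (matrix-normal) covariance might then be written as:

```latex
% Illustrative sketch only: the symbols and the matrix-normal form are assumptions.
\begin{align*}
  f(Y_i \mid X_i) &= \sum_{k=1}^{K} \pi_k \,
      \phi_{p \times T}\!\left(Y_i ;\, \Theta_k X_i,\, \Omega_k,\, \Phi_k\right),
      \qquad \pi_k > 0,\quad \sum_{k=1}^{K} \pi_k = 1, \\
  \operatorname{Cov}\!\left(\operatorname{vec} Y_i \mid \text{group } k\right)
      &= \Phi_k \otimes \Omega_k .
\end{align*}
% \Theta_k : p x q group-specific regression coefficients
% \Omega_k : p x p covariance among the responses
% \Phi_k   : T x T covariance across occasions (temporal dependence)
```

In this sketch, local independence would correspond to diagonal $\Omega_k$ and $\Phi_k$. Leaving them unrestricted, or giving $\Phi_k$ a temporal structure, is one way a model could represent association among responses and dependence over time within each group.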

Article information

Source
Ann. Appl. Stat., Volume 9, Number 2 (2015), 777-800.

Dates
Received: January 2015
Revised: February 2015
First available in Project Euclid: 20 July 2015

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1437397111

Digital Object Identifier
doi:10.1214/15-AOAS816

Mathematical Reviews number (MathSciNet)
MR3371335

Zentralblatt MATH identifier
06499930

Keywords
Mixture models; temporal dependence; random effects model

Citation

Anderlucci, Laura; Viroli, Cinzia. Covariance pattern mixture models for the analysis of multivariate heterogeneous longitudinal data. Ann. Appl. Stat. 9 (2015), no. 2, 777–800. doi:10.1214/15-AOAS816. https://projecteuclid.org/euclid.aoas/1437397111



