The Annals of Applied Statistics

A generalized linear mixed model for longitudinal binary data with a marginal logit link function

Michael Parzen, Souparno Ghosh, Stuart Lipsitz, Debajyoti Sinha, Garrett M. Fitzmaurice, Bani K. Mallick, and Joseph G. Ibrahim

Full-text: Open access

Abstract

Longitudinal studies of a binary outcome are common in the health, social, and behavioral sciences. In general, a feature of random effects logistic regression models for longitudinal binary data is that the marginal functional form, when integrated over the distribution of the random effects, is no longer of logistic form. Recently, Wang and Louis [Biometrika 90 (2003) 765–775] proposed a random intercept model in the clustered binary data setting where the marginal model has a logistic form. An acknowledged limitation of their model is that it allows only a single random effect that varies from cluster to cluster. In this paper we propose a modification of their model to handle longitudinal data, allowing separate, but correlated, random intercepts at each measurement occasion. The proposed model allows for a flexible correlation structure among the random intercepts, where the correlations can be interpreted in terms of Kendall’s τ. For example, the marginal correlations among the repeated binary outcomes can decline with increasing time separation, while the model retains the property of having matching conditional and marginal logit link functions. Finally, the proposed method is used to analyze data from a longitudinal study designed to monitor cardiac abnormalities in children born to HIV-infected women.
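
The construction described above can be sketched numerically. The abstract does not give the details, so the following is an illustrative sketch under stated assumptions rather than the authors' implementation. It assumes, as the keywords suggest, that a multivariate normal vector is mapped through the probability integral transformation to intercepts that each follow the bridge distribution of Wang and Louis (2003). The function names, the AR(1) correlation matrix, and the values φ = 0.6 and ρ = 0.7 are hypothetical choices made for illustration. The sketch checks by Monte Carlo that a conditional logit with linear predictor η yields a marginal logit with linear predictor φη at every occasion, and it uses the standard Gaussian copula relation τ = (2/π) arcsin(ρ) for Kendall's τ.

```python
import math
import numpy as np


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))


def bridge_inverse_cdf(u, phi):
    """Quantile function of the bridge distribution for the logit link.

    phi in (0, 1) is the attenuation factor: a conditional logit with
    linear predictor eta has marginal linear predictor phi * eta.
    """
    return np.log(np.sin(phi * np.pi * u) / np.sin(phi * np.pi * (1.0 - u))) / phi


# Standard normal CDF built from math.erf, so only NumPy is required.
_std_normal_cdf = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))


def ar1_corr(n_times, rho):
    """Illustrative AR(1) correlation matrix (an assumption, not from the paper)."""
    idx = np.arange(n_times)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def simulate_bridge_intercepts(n, corr, phi, rng):
    """Draw n vectors of correlated, bridge-distributed random intercepts."""
    # Correlated standard normals via the Cholesky factor of corr.
    z = rng.standard_normal((n, corr.shape[0])) @ np.linalg.cholesky(corr).T
    # Probability integral transformation: normal -> uniform -> bridge.
    u = np.clip(_std_normal_cdf(z), 1e-12, 1.0 - 1e-12)
    return bridge_inverse_cdf(u, phi)


def kendall_tau_from_rho(rho):
    """Kendall's tau implied by a Gaussian copula with correlation rho."""
    return 2.0 / np.pi * np.arcsin(rho)


# Small demonstration with hypothetical parameter values.
phi, rho, eta = 0.6, 0.7, 1.0
demo_rng = np.random.default_rng(0)
b = simulate_bridge_intercepts(100_000, ar1_corr(3, rho), phi, demo_rng)

print("Monte Carlo marginal probability per occasion:", expit(eta + b).mean(axis=0))
print("Logistic marginal expit(phi * eta):", expit(phi * eta))
print("Kendall's tau at lags 1 and 2:",
      kendall_tau_from_rho(rho), kendall_tau_from_rho(rho ** 2))
```

Because the transformation from the normal vector to the intercepts is monotone at each occasion, Kendall's τ between two intercepts equals that of the underlying normal pair. This is why the AR(1) choice in the sketch gives dependence that declines with time separation while each occasion keeps a logistic marginal form.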

Article information

Source
Ann. Appl. Stat., Volume 5, Number 1 (2011), 449–467.

Dates
First available in Project Euclid: 21 March 2011

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1300715198

Digital Object Identifier
doi:10.1214/10-AOAS390

Mathematical Reviews number (MathSciNet)
MR2810405

Zentralblatt MATH identifier
1220.62093

Keywords
Correlated binary data; multivariate normal distribution; probability integral transformation

Citation

Parzen, Michael; Ghosh, Souparno; Lipsitz, Stuart; Sinha, Debajyoti; Fitzmaurice, Garrett M.; Mallick, Bani K.; Ibrahim, Joseph G. A generalized linear mixed model for longitudinal binary data with a marginal logit link function. Ann. Appl. Stat. 5 (2011), no. 1, 449–467. doi:10.1214/10-AOAS390. https://projecteuclid.org/euclid.aoas/1300715198



References

  • Albert, P. S., Follmann, D. A., Wang, S. A. and Suh, E. B. (2002). A latent autoregressive model for longitudinal binary data subject to informative missingness. Biometrics 58 631–642.
  • Bahadur, R. R. (1961). A representation of the joint distribution of responses to n dichotomous items. In Studies in Item Analysis and Prediction (H. Solomon, ed.). Stanford Mathematical Studies in the Social Sciences VI 158–168. Stanford Univ. Press.
  • Caffo, B., An, M.-W. and Rohde, C. (2007). Flexible random intercept models for binary outcomes using mixtures of normals. Comput. Statist. Data Anal. 51 5220–5235.
  • Caffo, B. and Griswold, M. (2006). A user-friendly introduction to link-probit-normal models. Amer. Statist. 60 139–145.
  • Diggle, P. J., Heagerty, P., Liang, K. Y. and Zeger, S. L. (2002). Analysis of Longitudinal Data, 2nd ed. Oxford Univ. Press, Oxford.
  • Fahrmeir, L. and Tutz, G. (2001). Multivariate Statistical Modelling Based on Generalized Linear Models. Springer, New York.
  • Fitzmaurice, G. M. (1995). A caveat concerning independence estimating equations with multivariate binary data. Biometrics 51 309–317.
  • Fitzmaurice, G. M., Laird, N. M. and Rotnitzky, A. G. (1993). Regression models for discrete longitudinal responses (with discussion). Statist. Sci. 8 248–309.
  • Heagerty, P. J. (1999). Marginally specified logistic-normal models for longitudinal binary data. Biometrics 55 688–698.
  • Heagerty, P. J. and Zeger, S. L. (2000). Marginalized multilevel models and likelihood inference (with comments and a rejoinder by the authors). Statist. Sci. 15 1–26.
  • Hoel, P. G., Port, S. C. and Stone, C. J. (1971). Introduction to Probability Theory. Houghton Mifflin, Boston, MA.
  • Hougaard, P. (2000). Analysis of Multivariate Survival Data. Springer, New York.
  • Joe, H. (1997). Multivariate Models and Dependence Concepts. Chapman and Hall, London.
  • Kalbfleisch, J. D. and Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data. Wiley, New York.
  • Laird, N. M. (1988). Missing data in longitudinal studies. Stat. Med. 7 305–315.
  • Lee, Y. and Nelder, J. A. (2004). Conditional and marginal models: Another review. Statist. Sci. 19 219–228.
  • Liang, K. Y. and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73 13–22.
  • Lipshultz, S. E., Easley, K. A., Orav, E. J., Kaplan, S., Starc, T. J., Bricker, J. T., Lai, W. W., Moodie, D. S., McIntosh, K., Schluchter, M. D. and Colan, S. D. (1998). Left ventricular structure and function in children infected with human immunodeficiency virus: The prospective P2C2 HIV Multicenter Study. Pediatric Pulmonary and Cardiac Complications of Vertically Transmitted HIV Infection (P2C2 HIV) Study Group. Circulation 97 1246–1256.
  • Lipshultz, S. E., Easley, K. A., Orav, E. J., Kaplan, S., Starc, T. J., Bricker, J. T., Lai, W. W., Moodie, D. S., Sopko, G. and Colan, S. D. (2000). Cardiac dysfunction and mortality in HIV-infected children: The Prospective P2C2 HIV Multicenter Study. Pediatric Pulmonary and Cardiac Complications of Vertically Transmitted HIV Infection (P2C2 HIV) Study Group. Circulation 102 1542–1548.
  • Lipshultz, S. E., Easley, K. A., Orav, E. J., Kaplan, S., Starc, T. J., Bricker, J. T., Lai, W. W., Moodie, D. S., Sopko, G., Schluchter, M. D. and Colan, S. D. (2002). Cardiovascular status of infants and children of women infected with HIV-1 (P2C2 HIV): A cohort study. Lancet 360 368–373.
  • Lipsitz, S. R., Laird, N. M. and Harrington, D. P. (1991). Generalized estimating equations for correlated binary data: Using the odds ratio as a measure of association. Biometrika 78 153–160.
  • McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models, 2nd ed. Chapman and Hall, New York.
  • Molenberghs, G. and Lesaffre, E. (1994). Marginal modelling of correlated ordinal data using a multivariate Plackett distribution. J. Amer. Statist. Assoc. 89 633–644.
  • Nelsen, R. B. (1999). An Introduction to Copulas. Springer, New York.
  • Neuhaus, J. M., Kalbfleisch, J. D. and Hauck, W. W. (1991). A comparison of cluster-specific and population-averaged approaches for analyzing correlated binary data. Int. Statist. Rev. 59 25–35.
  • Pinheiro, J. C. and Bates, D. M. (1995). Approximations to the log-likelihood function in the nonlinear mixed-effects model. J. Comput. Graph. Statist. 4 12–35.
  • Rubin, D. B. (1976). Inference and missing data. Biometrika 63 581–592.
  • Wang, Z. and Louis, T. A. (2003). Matching conditional and marginal shapes in binary mixed-effects models using a bridge distribution function. Biometrika 90 765–775.
  • Wang, Z. and Louis, T. A. (2004). Marginalized binary mixed-effects with covariate-dependent random effects and likelihood inference. Biometrics 60 884–891.
  • Zhao, L. P. and Prentice, R. L. (1990). Correlated binary regression using a quadratic exponential model. Biometrika 77 642–648.