Abstract
In long-term follow-up studies, data are often collected on repeated measures of multivariate response variables as well as on time to the occurrence of a certain event. To jointly analyze such longitudinal data and survival time, we propose a general class of semiparametric latent-class models that accommodates a heterogeneous study population with flexible dependence structures between the longitudinal and survival outcomes. We combine nonparametric maximum likelihood estimation with sieve estimation and devise an efficient EM algorithm to implement the proposed approach. We establish the asymptotic properties of the proposed estimators through novel use of modern empirical process theory, sieve estimation theory and semiparametric efficiency theory. Finally, we demonstrate the advantages of the proposed methods through extensive simulation studies and provide an application to the Atherosclerosis Risk in Communities study.
Funding Statement
This work was supported by a research grant from the Hong Kong Polytechnic University (P0030124), the Hong Kong Research Grants Council grant PolyU 253042/18P and the National Institutes of Health awards R01-HL149683 and R01-HG009974. The Atherosclerosis Risk in Communities study has been funded in whole or in part with Federal funds from the National Heart, Lung and Blood Institute, National Institutes of Health, Department of Health and Human Services, under contract nos. HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700005I and HHSN268201700004I.
Acknowledgments
The authors thank the staff and participants of the ARIC study for their important contributions.
Citation
Kin Yau Wong, Donglin Zeng, D. Y. Lin. "Semiparametric latent-class models for multivariate longitudinal and survival data." Ann. Statist. 50 (1) 487–510, February 2022. https://doi.org/10.1214/21-AOS2117