Abstract
Multivariate longitudinal data are frequently encountered in practice, as in our motivating longitudinal microbiome study. It is of general interest to associate such high-dimensional longitudinal measures with a univariate continuous outcome. However, incomplete observations are common even in a regular study design, because not all samples are measured at every time point, giving rise to so-called blockwise missing values. Such a missingness structure poses significant challenges for association analysis and defies many existing methods that require complete samples. In this paper we propose to represent multivariate longitudinal data as a three-way tensor array (i.e., sample-by-feature-by-time) and exploit a parsimonious scalar-on-tensor regression model for association analysis. We develop a regularized covariance-based estimation procedure that effectively leverages all available observations without imputation. The method achieves variable selection and smooth estimation of time-varying effects. The application to the motivating microbiome study reveals interesting links between preterm infants’ gut microbiome dynamics and their neurodevelopment. Additional numerical studies on synthetic data and a longitudinal aging study further demonstrate the efficacy of the proposed method.
Funding Statement
Dr. Gen Li’s research was supported, in part, by NIH Grants R01HG010731 and R03DE031296.
Citation
Tianchen Xu, Kun Chen, Gen Li. "Tensor regression for incomplete observations with application to longitudinal studies." Ann. Appl. Stat. 18 (2) 1195-1212, June 2024. https://doi.org/10.1214/23-AOAS1830