The Annals of Statistics
- Ann. Statist.
- Volume 33, Number 6 (2005), 2873-2903.
Functional linear regression analysis for longitudinal data
We propose nonparametric methods for functional linear regression which are designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time. Predictor and response processes have smooth random trajectories, and the data consist of a small number of noisy repeated measurements made at irregular times for a sample of subjects. In longitudinal studies, the number of repeated measurements per subject is often small and may be modeled as a discrete random number and, accordingly, only a finite and asymptotically nonincreasing number of measurements are available for each subject or experimental unit. We propose a functional regression approach for this situation, using functional principal component analysis, where we estimate the functional principal component scores through conditional expectations. This allows the prediction of an unobserved response trajectory from sparse measurements of a predictor trajectory. The resulting technique is flexible and allows for different patterns regarding the timing of the measurements obtained for predictor and response trajectories. Asymptotic properties for a sample of n subjects are investigated under mild conditions, as n→∞, and we obtain consistent estimation for the regression function. Besides convergence results for the components of functional linear regression, such as the regression parameter function, we construct asymptotic pointwise confidence bands for the predicted trajectories. A functional coefficient of determination as a measure of the variance explained by the functional regression model is introduced, extending the standard R2 to the functional case. The proposed methods are illustrated with a simulation study, longitudinal primary biliary liver cirrhosis data and an analysis of the longitudinal relationship between blood pressure and body mass index.
Ann. Statist. Volume 33, Number 6 (2005), 2873-2903.
First available in Project Euclid: 17 February 2006
Permanent link to this document
Digital Object Identifier
Mathematical Reviews number (MathSciNet)
Zentralblatt MATH identifier
Primary: 62M20: Prediction [See also 60G25]; filtering [See also 60G35, 93E10, 93E11]
Secondary: 60G15: Gaussian processes 62G05: Estimation
Yao, Fang; Müller, Hans-Georg; Wang, Jane-Ling. Functional linear regression analysis for longitudinal data. Ann. Statist. 33 (2005), no. 6, 2873--2903. doi:10.1214/009053605000000660. http://projecteuclid.org/euclid.aos/1140191677.