Annals of Statistics

Functional linear regression analysis for longitudinal data

Fang Yao, Hans-Georg Müller, and Jane-Ling Wang

Full-text: Open access


We propose nonparametric methods for functional linear regression which are designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time. Predictor and response processes have smooth random trajectories, and the data consist of a small number of noisy repeated measurements made at irregular times for a sample of subjects. In longitudinal studies, the number of repeated measurements per subject is often small and may be modeled as a discrete random number and, accordingly, only a finite and asymptotically nonincreasing number of measurements are available for each subject or experimental unit. We propose a functional regression approach for this situation, using functional principal component analysis, where we estimate the functional principal component scores through conditional expectations. This allows the prediction of an unobserved response trajectory from sparse measurements of a predictor trajectory. The resulting technique is flexible and allows for different patterns regarding the timing of the measurements obtained for predictor and response trajectories. Asymptotic properties for a sample of n subjects are investigated under mild conditions, as n→∞, and we obtain consistent estimation for the regression function. Besides convergence results for the components of functional linear regression, such as the regression parameter function, we construct asymptotic pointwise confidence bands for the predicted trajectories. A functional coefficient of determination as a measure of the variance explained by the functional regression model is introduced, extending the standard R² to the functional case. The proposed methods are illustrated with a simulation study, longitudinal primary biliary liver cirrhosis data and an analysis of the longitudinal relationship between blood pressure and body mass index.
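
The conditional-expectation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that the scores and measurement errors are jointly Gaussian and that the mean function, eigenfunctions, eigenvalues and noise variance have already been estimated. The function name `conditional_fpc_scores` and all inputs below are hypothetical and purely illustrative; this is not the authors' code.

```python
import numpy as np


def conditional_fpc_scores(y, mu, phi, lam, sigma2):
    """Conditional expectation of FPC scores given sparse noisy observations.

    Assumed (hypothetical) inputs for ONE subject observed at m time points:
      y      : (m,)   noisy measurements
      mu     : (m,)   mean function evaluated at the subject's time points
      phi    : (m, K) first K eigenfunctions evaluated at those time points
      lam    : (K,)   corresponding eigenvalues
      sigma2 : float  measurement-error variance (must be > 0)
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    phi = np.asarray(phi, dtype=float)
    lam = np.asarray(lam, dtype=float)

    # Covariance of the observed vector: Phi diag(lam) Phi^T + sigma2 * I
    cov_y = (phi * lam) @ phi.T + sigma2 * np.eye(y.shape[0])

    # Under joint Gaussianity: E[xi | y] = diag(lam) Phi^T cov_y^{-1} (y - mu)
    return lam * (phi.T @ np.linalg.solve(cov_y, y - mu))


# Illustrative setup: four irregular time points and two eigenfunctions on [0, 1].
t = np.array([0.10, 0.35, 0.60, 0.90])
phi = np.column_stack([
    np.sqrt(2.0) * np.sin(2.0 * np.pi * t),
    np.sqrt(2.0) * np.cos(2.0 * np.pi * t),
])
lam = np.array([2.0, 0.5])
mu = np.zeros_like(t)

true_scores = np.array([1.0, -0.5])
y_clean = mu + phi @ true_scores  # trajectory values without measurement error

scores_low_noise = conditional_fpc_scores(y_clean, mu, phi, lam, sigma2=1e-6)
scores_high_noise = conditional_fpc_scores(y_clean, mu, phi, lam, sigma2=1e6)

# Predicted trajectory at the observed times: mean plus score-weighted eigenfunctions.
x_hat = mu + phi @ scores_low_noise
```

When the assumed noise variance is small, the estimated scores are close to the true ones; when it is large, they shrink toward zero. This shrinkage is what makes the conditional-expectation approach usable with only a few measurements per subject.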

Article information

Ann. Statist., Volume 33, Number 6 (2005), 2873-2903.

First available in Project Euclid: 17 February 2006

Primary: 62M20: Prediction [See also 60G25]; filtering [See also 60G35, 93E10, 93E11]
Secondary: 60G15: Gaussian processes; 62G05: Estimation

Keywords: Asymptotics; coefficient of determination; confidence band; eigenfunctions; functional data analysis; prediction; repeated measurements; smoothing; stochastic process


Yao, Fang; Müller, Hans-Georg; Wang, Jane-Ling. Functional linear regression analysis for longitudinal data. Ann. Statist. 33 (2005), no. 6, 2873–2903. doi:10.1214/009053605000000660.


