The Annals of Statistics

Prediction in functional linear regression

T. Tony Cai and Peter Hall

Full-text: Open access

Abstract

There has been substantial recent work on methods for estimating the slope function in linear regression for functional data analysis. However, as in the case of more conventional finite-dimensional regression, much of the practical interest in the slope centers on its application for the purpose of prediction, rather than on its significance in its own right. We show that the problems of slope-function estimation, and of prediction from an estimator of the slope function, have very different characteristics. While the former is intrinsically nonparametric, the latter can be either nonparametric or semiparametric. In particular, the optimal mean-square convergence rate of predictors is n^{-1}, where n denotes sample size, if the predictand is a sufficiently smooth function. In other cases, convergence occurs at a polynomial rate that is strictly slower than n^{-1}. At the boundary between these two regimes, the mean-square convergence rate is less than n^{-1} by only a logarithmic factor. More generally, the rate of convergence of the predicted value of the mean response in the regression model, given a particular value of the explanatory variable, is determined by a subtle interaction among the smoothness of the predictand, of the slope function in the model, and of the autocovariance function for the distribution of explanatory variables.
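
For orientation, the setting described in the abstract can be written out as follows. This is only a sketch in assumed notation: the symbols a, b, X_i, \epsilon_i, the interval I and the function x do not appear in the abstract itself, and the display uses the standard form of the functional linear model.

```latex
% Assumed notation: a = intercept, b = slope function, X_i = random
% explanatory function on an interval I, \epsilon_i = zero-mean error.
Y_i = a + \int_{\mathcal{I}} b(t)\, X_i(t)\, dt + \epsilon_i ,
\qquad i = 1, \dots, n .

% Slope-function estimation targets b itself. Prediction targets the mean
% response at a given function x (the "predictand" of the abstract):
p(x) = E(Y \mid X = x) = a + \int_{\mathcal{I}} b(t)\, x(t)\, dt .
```

In this notation, p(x) depends on b only through an integral against x, which is one way to see why prediction can be an easier problem than recovering b itself, and why the smoothness of x enters the convergence rate.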

Article information

Source
Ann. Statist., Volume 34, Number 5 (2006), 2159-2179.

Dates
First available in Project Euclid: 23 January 2007

Permanent link to this document
https://projecteuclid.org/euclid.aos/1169571793

Digital Object Identifier
doi:10.1214/009053606000000830

Mathematical Reviews number (MathSciNet)
MR2291496

Zentralblatt MATH identifier
1106.62036

Subjects
Primary: 62J05: Linear regression
Secondary: 62G20: Asymptotic properties

Keywords
Bootstrap; covariance; dimension reduction; eigenfunction; eigenvalue; eigenvector; functional data analysis; intercept; minimax; optimal convergence rate; principal components analysis; rate of convergence; slope; smoothing; spectral decomposition

Citation

Cai, T. Tony; Hall, Peter. Prediction in functional linear regression. Ann. Statist. 34 (2006), no. 5, 2159--2179. doi:10.1214/009053606000000830. https://projecteuclid.org/euclid.aos/1169571793
