The Annals of Statistics

Smoothing splines estimators for functional linear regression

Christophe Crambes, Alois Kneip, and Pascal Sarda


Abstract

The paper considers functional linear regression, where scalar responses Y_1, …, Y_n are modeled as depending on random functions X_1, …, X_n. We propose a smoothing splines estimator for the functional slope parameter based on a slight modification of the usual penalty. The theoretical analysis concentrates on the error in an out-of-sample prediction of the response for a new random function X_{n+1}. It is shown that the rates of convergence of the prediction error depend on the smoothness of the slope function and on the structure of the predictors. We then prove that these rates are optimal in the sense that they are minimax over large classes of possible slope functions and distributions of the predictive curves. For models with errors-in-variables, the smoothing spline estimator is modified by using a denoising correction of the covariance matrix of the discretized curves. The methodology is then applied to a real case study in which the aim is to predict the maximum ozone concentration from the concentration curve measured on the preceding day.
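As a rough illustration of the kind of estimator the abstract describes, the following minimal sketch fits a roughness-penalized slope function on an equispaced grid of [0, 1] and uses it for out-of-sample prediction. It is not the authors' estimator: the second-difference approximation of the derivative penalty, the extra penalty on the linear component (added here only to keep the linear system well-posed), and the names fit_penalized_slope, predict, and rho are all assumptions made for this example. The abstract does not spell out the paper's modified penalty.

```python
import numpy as np

def fit_penalized_slope(X, y, rho):
    """Illustrative penalized least squares for a discretized functional linear model.

    X   : (n, p) array, each row a curve observed on an equispaced grid of [0, 1]
    y   : (n,) array of scalar responses
    rho : positive smoothing parameter (hypothetical name)
    Returns the slope values on the grid and the means used for centering.
    """
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    # Riemann-sum approximation: integral of beta(t) X(t) dt ~ (1/p) * sum_j beta_j X_j
    A = (X - x_mean) / p
    # Second-difference operator scaled to approximate beta'' on the grid
    D = np.diff(np.eye(p), n=2, axis=0) * (p - 1) ** 2
    roughness = D.T @ D / p  # approximates the integral of beta''(t)^2
    # Assumed extra term: penalize the projection of beta onto linear functions,
    # which the roughness term alone leaves unpenalized
    t = np.linspace(0.0, 1.0, p)
    T = np.column_stack([np.ones(p), t])
    linear_part = T @ np.linalg.solve(T.T @ T, T.T) / p
    lhs = A.T @ A / n + rho * (roughness + linear_part)
    beta = np.linalg.solve(lhs, A.T @ (y - y_mean) / n)
    return beta, x_mean, y_mean

def predict(X_new, beta, x_mean, y_mean):
    """Predict responses for new curves observed on the same grid."""
    p = beta.shape[0]
    return y_mean + (X_new - x_mean) @ beta / p
```

In practice the smoothing parameter would be chosen from the data, for example by cross-validation; the sketch leaves that choice to the caller.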

Article information

Source
Ann. Statist., Volume 37, Number 1 (2009), 35-72.

Dates
First available in Project Euclid: 16 January 2009

Permanent link to this document
https://projecteuclid.org/euclid.aos/1232115927

Digital Object Identifier
doi:10.1214/07-AOS563

Mathematical Reviews number (MathSciNet)
MR2488344

Zentralblatt MATH identifier
1169.62027

Subjects
Primary:
  62G05: Estimation
  62G20: Asymptotic properties
Secondary:
  60G12: General second-order processes
  62M20: Prediction [See also 60G25]; filtering [See also 60G35, 93E10, 93E11]

Keywords
Functional linear regression; functional parameter; functional variable; smoothing splines

Citation

Crambes, Christophe; Kneip, Alois; Sarda, Pascal. Smoothing splines estimators for functional linear regression. Ann. Statist. 37 (2009), no. 1, 35–72. doi:10.1214/07-AOS563. https://projecteuclid.org/euclid.aos/1232115927


