Electronic Journal of Statistics

Identifiability in penalized function-on-function regression models

Fabian Scheipl and Sonja Greven


Abstract

Regression models with functional responses and covariates constitute a powerful and increasingly important model class. However, regression with functional data poses well-known and challenging problems of non-identifiability. This non-identifiability can manifest itself in arbitrarily large errors for coefficient surface estimates despite accurate predictions of the responses, thus invalidating substantive interpretations of the fitted models. We offer an accessible rephrasing of these identifiability issues in realistic applications of penalized linear function-on-function regression and delimit the set of circumstances under which they are likely to occur in practice. Specifically, non-identifiability that persists under smoothness assumptions on the coefficient surface can occur if the functional covariate’s empirical covariance has a kernel which overlaps that of the roughness penalty of the spline estimator. Extensive simulation studies validate the theoretical insights, explore the extent of the problem and allow us to evaluate its practical consequences under varying assumptions about the data generating processes. A case study illustrates the practical significance of the problem. Based on theoretical considerations and our empirical evaluation, we provide immediately applicable diagnostics for lack of identifiability and give recommendations for avoiding estimation artifacts in practice.
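
The kernel-overlap condition can be made concrete with a small numerical sketch. The code below is an illustration under simplifying assumptions and is not the diagnostic proposed in the article: it works in one dimension on an evaluation grid, uses a second-order difference matrix as a stand-in for the spline roughness penalty, and the names `null_space` and `kernel_overlap_dim` are hypothetical helpers introduced here. It relies on the fact that the kernel of the empirical covariance equals the null space of the centered data matrix, so the intersection with the penalty kernel is the null space of the two matrices stacked.

```python
import numpy as np

def null_space(A, tol=1e-8):
    """Orthonormal basis of the null space of A, computed from the SVD."""
    _, s, vt = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(s > tol * max(s.max(), 1.0)))
    return vt[rank:].T

def kernel_overlap_dim(X, D, tol=1e-8):
    """Dimension of ker(empirical covariance of X) intersected with ker(D'D).

    ker(Xc'Xc) = ker(Xc) and ker(D'D) = ker(D), so the intersection is the
    null space of the stacked matrix [Xc; D].
    """
    Xc = X - X.mean(axis=0, keepdims=True)  # center the curves pointwise
    return null_space(np.vstack([Xc, D]), tol).shape[1]

rng = np.random.default_rng(0)
n, p = 20, 30                      # number of curves, grid points (assumed)
s_grid = np.arange(p) / p          # equidistant grid on [0, 1)

# Second-order difference penalty; its kernel is spanned by the constant
# and the linear function on the grid.
D = np.diff(np.eye(p), n=2, axis=0)

# Covariate curves spanned by three sine functions that each sum to zero
# over the grid: constant coefficient functions are invisible to the data.
basis = np.stack([np.sin(2 * np.pi * k * s_grid) for k in (1, 2, 3)])
X_bad = rng.normal(size=(n, 3)) @ basis

# Same curves plus a random vertical shift per curve: the constant
# direction is no longer in the kernel of the covariance.
X_ok = X_bad + rng.normal(size=(n, 1))

overlap_bad = kernel_overlap_dim(X_bad, D)
overlap_ok = kernel_overlap_dim(X_ok, D)
print("overlap dimension, zero-mean sine curves:", overlap_bad)
print("overlap dimension, curves with random level:", overlap_ok)
```

A non-zero overlap dimension means that some coefficient function is neither penalized nor informed by the data, so the penalized least-squares criterion in this simplified setting has no unique minimizer. In the first construction the constant function is such a direction; adding curve-specific levels removes it.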

Article information

Source
Electron. J. Statist. Volume 10, Number 1 (2016), 495-526.

Dates
Received: June 2015
First available in Project Euclid: 4 March 2016

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1457123504

Digital Object Identifier
doi:10.1214/16-EJS1123

Mathematical Reviews number (MathSciNet)
MR3471986

Zentralblatt MATH identifier
1332.62249

Subjects
Primary: 62J07: Ridge regression; shrinkage estimators
Secondary: 62J20: Diagnostics

Keywords
Functional data; penalized splines

Citation

Scheipl, Fabian; Greven, Sonja. Identifiability in penalized function-on-function regression models. Electron. J. Statist. 10 (2016), no. 1, 495–526. doi:10.1214/16-EJS1123. https://projecteuclid.org/euclid.ejs/1457123504.


