Electronic Journal of Statistics

Structured penalties for functional linear models—partially empirical eigenvectors for regression

Timothy W. Randolph, Jaroslaw Harezlak, and Ziding Feng

Abstract

One of the challenges with functional data is incorporating geometric structure, or local correlation, into the analysis. This structure is inherent in the output from an increasing number of biomedical technologies, and a functional linear model is often used to estimate the relationship between the predictor functions and scalar responses. Common approaches to the problem of estimating a coefficient function typically involve two stages: regularization and estimation. Regularization is usually done via dimension reduction, projecting onto a predefined span of basis functions or a reduced set of eigenvectors (principal components). In contrast, we present a unified approach that directly incorporates geometric structure into the estimation process by exploiting the joint eigenproperties of the predictors and a linear penalty operator. In this sense, the components in the regression are ‘partially empirical’ and the framework is provided by the generalized singular value decomposition (GSVD). The form of the penalized estimation is not new, but the GSVD clarifies the process and informs the choice of penalty by making explicit the joint influence of the penalty and predictors on the bias, variance and performance of the estimated coefficient function. Laboratory spectroscopy data and simulations are used to illustrate the concepts.
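
The penalized estimator the abstract refers to has the general Tikhonov form: minimize ||y − Xβ||² + λ||Lβ||², where L is a structured penalty operator whose joint eigenproperties with X (via the GSVD) are the paper's focus. The following is a minimal illustrative sketch, not the authors' code: it assumes simulated curve predictors and a second-difference penalty operator, and solves the penalized problem by augmented least squares rather than computing the GSVD explicitly.

```python
# Minimal sketch of general-form penalized estimation for a functional
# linear model: minimize ||y - X b||^2 + lam * ||L b||^2 with a structured
# (second-difference) penalty L. All data below are simulated/illustrative.
import numpy as np

rng = np.random.default_rng(0)

n, p = 60, 100                       # n subjects, p sampling points per curve
t = np.linspace(0, 1, p)

# Simulated smooth predictor curves and a smooth "true" coefficient function
X = np.array([np.sin(2 * np.pi * (t + rng.uniform())) +
              0.1 * rng.standard_normal(p) for _ in range(n)])
beta_true = np.exp(-(t - 0.5) ** 2 / 0.02)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# Second-difference penalty operator L, shape (p-2, p), encoding smoothness
L = np.diff(np.eye(p), n=2, axis=0)

def penalized_fit(X, y, L, lam):
    """Solve min ||y - X b||^2 + lam ||L b||^2 via augmented least squares."""
    X_aug = np.vstack([X, np.sqrt(lam) * L])
    y_aug = np.concatenate([y, np.zeros(L.shape[0])])
    beta_hat, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return beta_hat

beta_hat = penalized_fit(X, y, L, lam=1.0)
print("correlation of estimate with true coefficient:",
      np.corrcoef(beta_hat, beta_true)[0, 1])
```

Choosing L as a discrete derivative is only one option; the paper's point is that the GSVD of the pair (X, L) makes explicit how any structured choice of L interacts with the predictors to determine the bias and variance of the estimate.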

Article information

Source
Electron. J. Statist. Volume 6 (2012), 323–353.

Dates
First available: 8 March 2012

Permanent link to this document
http://projecteuclid.org/euclid.ejs/1331216629

Digital Object Identifier
doi:10.1214/12-EJS676

Mathematical Reviews number (MathSciNet)
MR2988411

Zentralblatt MATH identifier
06166960

Subjects
Primary: 62J05: Linear regression; 62J07: Ridge regression; shrinkage estimators
Secondary: 65F22: Ill-posedness, regularization

Keywords
Penalized regression, generalized singular value decomposition, regularization, functional data

Citation

Randolph, Timothy W.; Harezlak, Jaroslaw; Feng, Ziding. Structured penalties for functional linear models—partially empirical eigenvectors for regression. Electronic Journal of Statistics 6 (2012), 323–353. doi:10.1214/12-EJS676. http://projecteuclid.org/euclid.ejs/1331216629.

