Statistical Science

Spline Adaptation in Extended Linear Models (with comments and a rejoinder by the authors)

Mark H. Hansen and Charles Kooperberg
Source: Statist. Sci. Volume 17, Number 1 (2002), 2-51.

Abstract

In many statistical applications, nonparametric modeling can provide insights into the features of a dataset that are not obtainable by other means. One successful approach involves the use of (univariate or multivariate) spline spaces. As a class, these methods have inherited much from classical tools for parametric modeling. For example, stepwise variable selection with spline basis terms is a simple scheme for locating knots (breakpoints) in regions where the data exhibit strong, local features. Similarly, candidate knot configurations (generated by this or some other search technique) are routinely evaluated with traditional selection criteria like AIC or BIC. In short, strategies typically applied in parametric model selection have proved useful in constructing flexible, low-dimensional models for nonparametric problems.

Until recently, greedy, stepwise procedures were most frequently suggested in the literature. Research into Bayesian variable selection, however, has given rise to a number of new spline-based methods that primarily rely on some form of Markov chain Monte Carlo to identify promising knot locations. In this paper, we consider various alternatives to greedy, deterministic schemes, and present a Bayesian framework for studying adaptation in the context of an extended linear model (ELM). Our major test cases are Logspline density estimation and (bivariate) Triogram regression models. We selected these because they illustrate a number of computational and methodological issues concerning model adaptation that arise in ELMs.
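
The greedy, stepwise scheme mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' Logspline or Triogram procedure: it assumes ordinary Gaussian least-squares regression, a truncated-linear (hinge) basis, forward addition only, and BIC as the selection criterion. The names `bic`, `design` and `stepwise_knots`, the candidate knot grid, and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def design(x, knots):
    """Linear spline basis: intercept, x, and one hinge (x - t)_+ per knot."""
    cols = [np.ones_like(x), x] + [np.maximum(x - t, 0.0) for t in knots]
    return np.column_stack(cols)

def bic(y, X):
    """BIC of a Gaussian linear model fit by least squares (up to a constant)."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + p * np.log(n)

def stepwise_knots(x, y, candidates, max_knots=10):
    """Greedy forward search: repeatedly add the candidate knot that lowers
    BIC the most, and stop as soon as no addition improves the criterion."""
    knots = []
    best = bic(y, design(x, knots))
    while len(knots) < max_knots:
        scores = [(bic(y, design(x, knots + [t])), t)
                  for t in candidates if t not in knots]
        if not scores:
            break
        score, t = min(scores)
        if score >= best:
            break
        knots.append(t)
        best = score
    return sorted(knots), best

# Simulated data (an assumption for illustration): a kink at x = 0.5 plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.abs(x - 0.5) + rng.normal(scale=0.02, size=x.size)

candidates = list(np.linspace(0.1, 0.9, 9))
knots, best = stepwise_knots(x, y, candidates)
print("selected knots:", knots)
```

Because each step commits to the single best addition, the search is deterministic and can miss knot configurations that only pay off jointly, which is the kind of limitation that motivates the alternatives to greedy schemes considered in the paper.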

Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.ss/1023798997
Digital Object Identifier: doi:10.1214/ss/1023798997
Mathematical Reviews number (MathSciNet): MR1910073


2013 © Institute of Mathematical Statistics
