The Annals of Statistics

Rodeo: Sparse, greedy nonparametric regression

John Lafferty and Larry Wasserman
Source: Ann. Statist. Volume 36, Number 1 (2008), 28-63.

Abstract

We present a greedy method for simultaneously performing local bandwidth selection and variable selection in nonparametric regression. The method starts with a local linear estimator with large bandwidths, and incrementally decreases the bandwidth of variables for which the gradient of the estimator with respect to bandwidth is large. The method—called rodeo (regularization of derivative expectation operator)—conducts a sequence of hypothesis tests to threshold derivatives, and is easy to implement. Under certain assumptions on the regression function and sampling density, it is shown that the rodeo applied to local linear smoothing avoids the curse of dimensionality, achieving near optimal minimax rates of convergence in the number of relevant variables, as if these variables were isolated in advance.
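
The abstract does not spell out the update rule, so the following is only a minimal sketch of how the procedure it describes might look at a single query point. Several details are assumptions of the sketch rather than statements from the abstract: a Gaussian product kernel, a noise standard deviation `sigma` that is known in advance, a hard threshold of the form s_j * sqrt(2 log n) on each bandwidth derivative, a multiplicative shrink factor `beta`, and a safeguard floor `h_min`. The function name `rodeo_bandwidths` and all parameter names are hypothetical.

```python
import numpy as np


def rodeo_bandwidths(X, Y, x, sigma, h0=1.0, beta=0.8, h_min=0.01):
    """Greedy bandwidth selection at a query point x (hard-threshold sketch).

    X: (n, d) covariates, Y: (n,) responses, sigma: noise std (assumed known).
    Returns the per-variable bandwidths h (d,) and the local linear fit at x.
    """
    n, d = X.shape
    D = X - x                                  # covariates centred at x
    Xd = np.hstack([np.ones((n, 1)), D])       # local linear design matrix
    h = np.full(d, float(h0))                  # start with large bandwidths
    active = list(range(d))                    # every variable starts active
    thresh_scale = np.sqrt(2.0 * np.log(n))    # assumed threshold multiplier

    def smoother_rows(h):
        # Gaussian product kernel weights and B = (X'WX)^{-1} X'W
        w = np.exp(-0.5 * np.sum((D / h) ** 2, axis=1))
        XtW = Xd.T * w
        return np.linalg.solve(XtW @ Xd, XtW)  # shape (d + 1, n)

    while active:
        for j in list(active):
            B = smoother_rows(h)
            # d m_hat(x) / d h_j = G @ Y with G = e1' B L_j (I - Xd B),
            # where L_j = diag((X_ij - x_j)^2 / h_j^3) for the Gaussian kernel.
            v = B[0] * D[:, j] ** 2 / h[j] ** 3
            G = v - (v @ Xd) @ B
            Z = G @ Y                          # estimated derivative
            s = sigma * np.linalg.norm(G)      # its standard deviation
            if abs(Z) > s * thresh_scale and h[j] * beta >= h_min:
                h[j] *= beta                   # significant derivative: shrink
            else:
                active.remove(j)               # freeze this bandwidth
    m_hat = smoother_rows(h)[0] @ Y            # local linear estimate at x
    return h, m_hat
```

On data where the response depends nonlinearly on only one covariate, a sketch like this would be expected to shrink that covariate's bandwidth while leaving the others near their initial value, which is the sense in which bandwidth selection and variable selection happen together. In practice the noise level would have to be estimated rather than supplied.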

Primary Subjects: 62G08
Secondary Subjects: 62G20
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1201877293
Digital Object Identifier: doi:10.1214/009053607000000811
Mathematical Reviews number (MathSciNet): MR2387963
Zentralblatt MATH identifier: 1132.62026


2013 © Institute of Mathematical Statistics
