The Annals of Statistics

Rodeo: Sparse, greedy nonparametric regression

John Lafferty and Larry Wasserman

Source: Ann. Statist. Volume 36, Number 1 (2008), 28-63.

Abstract

We present a greedy method for simultaneously performing local bandwidth selection and variable selection in nonparametric regression. The method starts with a local linear estimator with large bandwidths, and incrementally decreases the bandwidth of variables for which the gradient of the estimator with respect to bandwidth is large. The method, called rodeo (regularization of derivative expectation operator), conducts a sequence of hypothesis tests to threshold derivatives and is easy to implement. Under certain assumptions on the regression function and sampling density, the rodeo applied to local linear smoothing is shown to avoid the curse of dimensionality, achieving near-optimal minimax rates of convergence in the number of relevant variables, as if these variables had been isolated in advance.
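
The greedy procedure described in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not state: a Gaussian product kernel, a known noise level `sigma`, a threshold of the form s_j * sqrt(2 log n), a shrink factor `beta`, a floor `h_min`, a finite-difference approximation to the bandwidth derivative, and the function names themselves.

```python
import numpy as np

def local_linear_weights(X, x, h):
    """Weights G such that the local linear fit at x equals G @ y.
    Assumes a Gaussian product kernel (an illustrative choice)."""
    D = X - x                                         # design centred at x, shape (n, d)
    w = np.exp(-0.5 * np.sum((D / h) ** 2, axis=1))   # kernel weights, shape (n,)
    A = np.hstack([np.ones((len(X), 1)), D])          # intercept plus linear terms
    AtW = A.T * w                                     # A^T W, shape (d+1, n)
    M = AtW @ A + 1e-10 * np.eye(A.shape[1])          # tiny ridge for numerical stability
    return np.linalg.solve(M, AtW)[0]                 # row giving the intercept estimate

def rodeo_bandwidths(X, y, x, sigma, h0=1.0, beta=0.9, h_min=1e-3, eps=1e-4):
    """Greedy bandwidth selection at the point x (hypothetical sketch).
    Start with large bandwidths; shrink h[j] while the derivative of the
    estimate with respect to h[j] exceeds a noise-based threshold."""
    n, d = X.shape
    h = np.full(d, float(h0))
    active = set(range(d))
    scale = np.sqrt(2.0 * np.log(n))                  # assumed threshold multiplier
    while active:
        for j in list(active):
            hp, hm = h.copy(), h.copy()
            step = eps * h[j]
            hp[j] += step
            hm[j] -= step
            # Central finite difference of the weight vector with respect to h[j].
            dG = (local_linear_weights(X, x, hp)
                  - local_linear_weights(X, x, hm)) / (2.0 * step)
            Z = dG @ y                                # derivative of the estimate
            s = sigma * np.linalg.norm(dG)            # its standard deviation given X
            if abs(Z) > s * scale and h[j] * beta >= h_min:
                h[j] *= beta                          # derivative is large: shrink
            else:
                active.discard(j)                     # derivative is small: freeze
    return h
```

In this sketch, a variable whose bandwidth is never reduced from `h0` is effectively treated as irrelevant at the point `x`, which is how bandwidth selection and variable selection are carried out together.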

Primary Subjects: 62G08
Secondary Subjects: 62G20
Keywords: Nonparametric regression; sparsity; local linear smoothing; bandwidth estimation; variable selection; minimax rates of convergence

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1201877293
Digital Object Identifier: doi:10.1214/009053607000000811
Mathematical Reviews number (MathSciNet): MR2387963
Zentralblatt MATH identifier: 1132.62026

2010 © Institute of Mathematical Statistics