Institute of Mathematical Statistics Collections

Consistent selection via the Lasso for high dimensional approximating regression models

Florentina Bunea

Abstract

In this article we investigate consistency of selection in regression models via the popular Lasso method. Here we depart from the traditional linear regression assumption and consider approximations of the regression function $f$ with elements of a given dictionary of $M$ functions. The target for consistency is the index set of those functions from this dictionary that realize the most parsimonious approximation to $f$ among all linear combinations belonging to an $L_2$ ball centered at $f$ and of radius $r_{n,M}^2$. In this framework we show that a consistent estimate of this index set can be derived via $\ell_1$ penalized least squares, with a data dependent penalty and with tuning sequence $r_{n,M} > \sqrt{\log(Mn)/n}$, where $n$ is the sample size. Our results hold for any $1 \le M \le n^{\gamma}$, for any $\gamma > 0$.
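
To make the selection step in the abstract concrete, the following is a minimal sketch of index-set selection by weighted $\ell_1$ penalized least squares. It is an illustration under stated assumptions, not the article's exact procedure: the objective $\frac{1}{n}\sum_i (Y_i - \sum_j \beta_j f_j(X_i))^2 + 2\sum_j \omega_j |\beta_j|$, the weights $\omega_j = r\,\|f_j\|_n$ (one possible reading of "data dependent penalty"), the constant `A` in $r = A\sqrt{\log(Mn)/n}$, the coordinate descent solver, and the names `soft_threshold` and `weighted_lasso_select` are all hypothetical choices made for this example.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def weighted_lasso_select(F, y, A=2.0, n_sweeps=200):
    """Select dictionary indices by weighted l1 penalized least squares.

    F is the n-by-M matrix of dictionary functions evaluated at the
    sample points. The objective form, the weights, and the constant A
    are assumptions made for this sketch.
    """
    n, M = F.shape
    # Tuning sequence of order sqrt(log(M n) / n); A is an assumed constant.
    r = A * np.sqrt(np.log(M * n) / n)
    col_norm_sq = (F ** 2).mean(axis=0)   # squared empirical norms ||f_j||_n^2
    weights = r * np.sqrt(col_norm_sq)    # assumed data dependent weights
    beta = np.zeros(M)
    resid = y.astype(float).copy()        # residual y - F @ beta
    for _ in range(n_sweeps):
        for j in range(M):
            if col_norm_sq[j] == 0.0:
                continue                  # skip identically zero columns
            resid += F[:, j] * beta[j]    # remove coordinate j from the fit
            rho = F[:, j] @ resid / n     # empirical inner product with f_j
            beta[j] = soft_threshold(rho, weights[j]) / col_norm_sq[j]
            resid -= F[:, j] * beta[j]    # put the updated coordinate back
    # The estimated index set is the support of the fitted coefficients.
    selected = [j for j in range(M) if beta[j] != 0.0]
    return selected, beta


# Synthetic check: three active dictionary elements out of M = 50.
rng = np.random.default_rng(0)
n, M = 200, 50
F = rng.standard_normal((n, M))
beta_true = np.zeros(M)
beta_true[[0, 3, 7]] = [3.0, -2.0, 2.5]
y = F @ beta_true + 0.5 * rng.standard_normal(n)
selected, beta_hat = weighted_lasso_select(F, y)
print(selected)
```

In this sketch the threshold applied to each coordinate scales with the empirical norm of the corresponding dictionary element, so rescaling a column of `F` does not change which indices are selected.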

Primary Subjects: 62G08
Secondary Subjects: 62C20, 62G05, 62G20
Keywords: consistency; high dimension; Lasso; l_1 regularization; regression; penalty; selection
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.imsc/1209398465
Digital Object Identifier: doi:10.1214/074921708000000101

2012 © Institute of Mathematical Statistics
