One-step sparse estimates in nonconcave penalized likelihood models



The Annals of Statistics

Hui Zou and Runze Li

Source: Ann. Statist. Volume 36, Number 4 (2008), 1509-1533.

Abstract

Fan and Li propose a family of variable selection methods via penalized likelihood using concave penalty functions. The nonconcave penalized likelihood estimators enjoy the oracle properties, but maximizing the penalized likelihood function is computationally challenging, because the objective function is nondifferentiable and nonconcave. In this article, we propose a new unified algorithm based on the local linear approximation (LLA) for maximizing the penalized likelihood for a broad class of concave penalty functions. Convergence and other theoretical properties of the LLA algorithm are established. A distinguishing feature of the LLA algorithm is that at each LLA step, the LLA estimator can naturally adopt a sparse representation. Thus, we suggest using the one-step LLA estimator from the LLA algorithm as the final estimate. Statistically, we show that if the regularization parameter is appropriately chosen, the one-step LLA estimates enjoy the oracle properties with good initial estimators. Computationally, the one-step LLA estimation methods dramatically reduce the computational cost of maximizing the nonconcave penalized likelihood. We conduct Monte Carlo simulations to assess the finite sample performance of the one-step sparse estimation methods. The results are very encouraging.
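To make the one-step idea concrete, the following is a minimal sketch for the special case of a linear model with the SCAD penalty; it is an illustration under stated assumptions, not the authors' implementation. It assumes an ordinary least squares fit as the initial estimator, the scaling (1/(2n))||y - Xb||^2 for the loss, the conventional SCAD constant a = 3.7, and a plain coordinate descent solver for the resulting weighted L1 problem. The function names (scad_derivative, weighted_lasso_cd, one_step_lla) and the simulated data are hypothetical and chosen only for this example.

```python
import numpy as np


def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty at |theta| (a = 3.7 is the usual default)."""
    theta = np.abs(theta)
    return lam * np.where(
        theta <= lam,
        1.0,
        np.maximum(a * lam - theta, 0.0) / ((a - 1.0) * lam),
    )


def weighted_lasso_cd(X, y, weights, n_sweeps=200, tol=1e-10):
    """Minimize (1/(2n))||y - X b||^2 + sum_j weights[j] * |b_j|
    by cyclic coordinate descent (an assumed solver, chosen for brevity)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_scale = (X ** 2).sum(axis=0) / n
    residual = y.astype(float)  # equals y - X @ beta because beta starts at zero
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            old = beta[j]
            # Inner product of column j with the partial residual excluding b_j.
            z = X[:, j] @ residual / n + col_scale[j] * old
            # Soft-threshold at the coordinate-specific weight.
            new = np.sign(z) * max(abs(z) - weights[j], 0.0) / col_scale[j]
            if new != old:
                residual -= X[:, j] * (new - old)
                beta[j] = new
                max_change = max(max_change, abs(new - old))
        if max_change < tol:
            break
    return beta


def one_step_lla(X, y, lam, a=3.7):
    """One LLA step: linearize the penalty at an initial estimate (OLS here),
    then solve the resulting weighted L1-penalized least squares problem."""
    beta_init, *_ = np.linalg.lstsq(X, y, rcond=None)
    weights = scad_derivative(beta_init, lam, a)
    return weighted_lasso_cd(X, y, weights)


# Simulated example (hypothetical data): three nonzero coefficients out of eight.
rng = np.random.default_rng(0)
n, p = 200, 8
beta_true = np.array([3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0])
X = rng.standard_normal((n, p))
y = X @ beta_true + rng.standard_normal(n)
beta_hat = one_step_lla(X, y, lam=0.5)
print(np.round(beta_hat, 3))
```

Coefficients whose initial estimates are large receive a weight of zero and are left unpenalized, while coefficients with small initial estimates receive the full weight lam and can be set exactly to zero by the soft-thresholding step. This is how a single LLA step can yield a sparse estimate. In practice the regularization parameter would be tuned, for example by a criterion such as BIC, rather than fixed as in this sketch.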

Primary Subjects: 62J05, 62J07
Keywords: AIC; BIC; LASSO; one-step estimator; oracle properties; SCAD

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1216237287
Digital Object Identifier: doi:10.1214/009053607000000802

References

[1] Antoniadis, A. and Fan, J. (2001). Regularization of wavelet approximations. J. Amer. Statist. Assoc. 96 939–967.
Mathematical Reviews (MathSciNet): MR1946364
Digital Object Identifier: doi:10.1198/016214501753208942
[2] Bickel, P. J. (1975). One-step Huber estimates in the linear model. J. Amer. Statist. Assoc. 70 428–434.
Mathematical Reviews (MathSciNet): MR386168
Digital Object Identifier: doi:10.2307/2285834
[3] Blake, A. and Zisserman, A. (1987). Visual Reconstruction. MIT Press, Cambridge, MA.
Mathematical Reviews (MathSciNet): MR919733
[4] Breiman, L. (1996). Heuristics of instability and stabilization in model selection. Ann. Statist. 24 2350–2383.
Mathematical Reviews (MathSciNet): MR1425957
Digital Object Identifier: doi:10.1214/aos/1032181158
Project Euclid: euclid.aos/1032181158
[5] Cai, J., Fan, J., Li, R. and Zhou, H. (2005). Variable selection for multivariate failure time data. Biometrika 92 303–316.
Mathematical Reviews (MathSciNet): MR2201361
Zentralblatt MATH: 05039580
Digital Object Identifier: doi:10.1093/biomet/92.2.303
[6] Cai, J., Fan, J., Zhou, H. and Zhou, Y. (2007). Marginal hazard models with varying-coefficients for multivariate failure time data. Ann. Statist. 35 324–354.
Mathematical Reviews (MathSciNet): MR2332278
Digital Object Identifier: doi:10.1214/009053606000001145
Project Euclid: euclid.aos/1181100190
[7] Cai, Z., Fan, J. and Li, R. (2000). Efficient estimation and inferences for varying-coefficient models. J. Amer. Statist. Assoc. 95 888–902.
Mathematical Reviews (MathSciNet): MR1804446
Digital Object Identifier: doi:10.2307/2669472
[8] Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression. Ann. Statist. 32 407–499.
Mathematical Reviews (MathSciNet): MR2060166
Digital Object Identifier: doi:10.1214/009053604000000067
Project Euclid: euclid.aos/1083178935
[9] Fan, J. and Chen, J. (1999). One-step local quasi-likelihood estimation. J. Roy. Statist. Soc. Ser. B 61 927–943.
Mathematical Reviews (MathSciNet): MR1722248
Digital Object Identifier: doi:10.1111/1467-9868.00211
[10] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348–1360.
Mathematical Reviews (MathSciNet): MR1946581
Digital Object Identifier: doi:10.1198/016214501753382273
[11] Fan, J. and Li, R. (2002). Variable selection for Cox’s proportional hazards model and frailty model. Ann. Statist. 30 74–99.
Mathematical Reviews (MathSciNet): MR1892656
Digital Object Identifier: doi:10.1214/aos/1015362185
Project Euclid: euclid.aos/1015362185
[12] Fan, J. and Li, R. (2004). New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis. J. Amer. Statist. Assoc. 99 710–723.
Mathematical Reviews (MathSciNet): MR2090905
Digital Object Identifier: doi:10.1198/016214504000001060
[13] Fan, J. and Li, R. (2006). Statistical challenges with high dimensionality: Feature selection in knowledge discovery. In Proceedings of the Madrid International Congress of Mathematicians 2006 3 595–622. EMS, Zürich.
[14] Fan, J., Lin, H. and Zhou, Y. (2006). Local partial likelihood estimation for life time data. Ann. Statist. 34 290–325.
Mathematical Reviews (MathSciNet): MR2275243
Digital Object Identifier: doi:10.1214/009053605000000796
Project Euclid: euclid.aos/1146576264
[15] Fan, J. and Peng, H. (2004). On non-concave penalized likelihood with diverging number of parameters. Ann. Statist. 32 928–961.
Mathematical Reviews (MathSciNet): MR2065194
Digital Object Identifier: doi:10.1214/009053604000000256
Project Euclid: euclid.aos/1085408491
[16] Frank, I. and Friedman, J. (1993). A statistical view of some chemometrics regression tools. Technometrics 35 109–148.
[17] Fu, W. (1998). Penalized regression: The bridge versus the lasso. J. Comput. Graph. Statist. 7 397–416.
Mathematical Reviews (MathSciNet): MR1646710
Digital Object Identifier: doi:10.2307/1390712
[18] Geyer, C. (1994). On the asymptotics of constrained M-estimation. Ann. Statist. 22 1993–2010.
Mathematical Reviews (MathSciNet): MR1329179
Digital Object Identifier: doi:10.1214/aos/1176325768
Project Euclid: euclid.aos/1176325768
[19] Heiser, W. (1995). Convergent Computation by Iterative Majorization: Theory and Applications in Multidimensional Data Analysis. Clarendon Press, Oxford.
Mathematical Reviews (MathSciNet): MR1380319
[20] Hunter, D. and Li, R. (2005). Variable selection using MM algorithms. Ann. Statist. 33 1617–1642.
Mathematical Reviews (MathSciNet): MR2166557
Digital Object Identifier: doi:10.1214/009053605000000200
Project Euclid: euclid.aos/1123250224
[21] Knight, K. and Fu, W. (2000). Asymptotics for lasso-type estimators. Ann. Statist. 28 1356–1378.
Mathematical Reviews (MathSciNet): MR1805787
Digital Object Identifier: doi:10.1214/aos/1015957397
Project Euclid: euclid.aos/1015957397
[22] Lange, K. (1995). A gradient algorithm locally equivalent to the EM algorithm. J. Roy. Statist. Soc. Ser. B 57 425–437.
Mathematical Reviews (MathSciNet): MR1323348
[23] Lange, K., Hunter, D. and Yang, I. (2000). Optimization transfer using surrogate objective functions (with discussion). J. Comput. Graph. Statist. 9 1–59.
Mathematical Reviews (MathSciNet): MR1819865
Digital Object Identifier: doi:10.2307/1390605
[24] Lehmann, E. and Casella, G. (1998). Theory of Point Estimation, 2nd ed. Springer, Berlin.
Mathematical Reviews (MathSciNet): MR1639875
Zentralblatt MATH: 0916.62017
[25] Leng, C., Lin, Y. and Wahba, G. (2006). A note on the lasso and related procedures in model selection. Statist. Sinica 16 1273–1284.
Mathematical Reviews (MathSciNet): MR2327490
[26] Li, R. and Liang, H. (2008). Variable selection in semiparametric regression modeling. Ann. Statist. 36 261–286.
Mathematical Reviews (MathSciNet): MR2387971
Digital Object Identifier: doi:10.1214/009053607000000604
Project Euclid: euclid.aos/1201877301
[27] West, M. (1984). Outlier models and prior distributions in Bayesian linear regression. J. Roy. Statist. Soc. Ser. B 46 431–439.
Mathematical Reviews (MathSciNet): MR790630
[28] Miller, A. (2002). Subset Selection in Regression, 2nd ed. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR2001193
Zentralblatt MATH: 1051.62060
[29] Osborne, M., Presnell, B. and Turlach, B. (2000). A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20 389–403.
Mathematical Reviews (MathSciNet): MR1773265
Digital Object Identifier: doi:10.1093/imanum/20.3.389
[30] Robinson, P. (1988). The stochastic difference between econometric statistics. Econometrica 56 531–547.
Mathematical Reviews (MathSciNet): MR946120
Digital Object Identifier: doi:10.2307/1911699
[31] Rosset, S. and Zhu, J. (2007). Piecewise linear regularized solution paths. Ann. Statist. 35 1012–1030.
Mathematical Reviews (MathSciNet): MR2341696
Digital Object Identifier: doi:10.1214/009053606000001370
Project Euclid: euclid.aos/1185303996
[32] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267–288.
Mathematical Reviews (MathSciNet): MR1379242
[33] Wu, Y. (2000). Optimization transfer using surrogate objective functions: Discussion. J. Comput. Graph. Statist. 9 32–34.
Mathematical Reviews (MathSciNet): MR1819865
Digital Object Identifier: doi:10.2307/1390605
[34] Yuan, M. and Lin, Y. (2005). Efficient empirical Bayes variable selection and estimation in linear models. J. Amer. Statist. Assoc. 100 1215–1225.
Mathematical Reviews (MathSciNet): MR2236436
Digital Object Identifier: doi:10.1198/016214505000000367
[35] Yuan, M. and Lin, Y. (2006). Model selection and estimation in regression with grouped variables. J. Roy. Statist. Soc. Ser. B 68 49–67.
Mathematical Reviews (MathSciNet): MR2212574
Digital Object Identifier: doi:10.1111/j.1467-9868.2005.00532.x
[36] Zou, H. and Hastie, T. (2005). Regularization and variable selection via the elastic net. J. Roy. Statist. Soc. Ser. B 67 301–320.
Mathematical Reviews (MathSciNet): MR2137327
Digital Object Identifier: doi:10.1111/j.1467-9868.2005.00503.x

2008 © Institute of Mathematical Statistics