## The Annals of Statistics

### Consistent model selection criteria for quadratically supported risks

#### Abstract

In this paper, we study asymptotic properties of model selection criteria for high-dimensional regression models, where the number of covariates is much larger than the sample size. In particular, we consider a class of loss functions, the quadratically supported risks, which is large enough to include the quadratic loss, Huber loss, quantile loss and logistic loss. We provide sufficient conditions for consistency of model selection criteria that are applicable to the whole class of quadratically supported risks; our results extend most previous sufficient conditions for model selection consistency. In addition, sufficient conditions for path-consistency of the Lasso and nonconvex penalized estimators are presented, where path-consistency means that the probability that the solution path includes the true model converges to 1. Path-consistency makes it practically feasible to apply consistent model selection criteria to high-dimensional data. A data-adaptive model selection procedure that is selection consistent and performs well for finite samples is proposed. Results of simulation studies and a real data analysis are presented, comparing the finite-sample performance of the proposed data-adaptive model selection criterion with that of its competitors.
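The overall strategy the abstract describes, computing a path of candidate models and then applying a consistent criterion along that path, can be sketched in a few lines. The following is an illustrative toy example, not the paper's algorithm: it uses greedy forward selection as a stand-in for a penalized solution path, and a BIC-type criterion with a `log(n) * log(p)` penalty per selected variable, in the spirit of high-dimensional BIC variants such as [18]. All constants and names here are assumptions for illustration only.

```python
import numpy as np

# Toy setting: n << p with a strong sparse signal on the first s covariates.
rng = np.random.default_rng(0)
n, p, s = 100, 200, 3
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = 3.0                             # true coefficients (illustrative)
y = X @ beta + rng.standard_normal(n)

def rss(active):
    """Residual sum of squares of the least-squares fit on columns `active`."""
    if not active:
        return float(y @ y)
    Xa = X[:, active]
    coef, *_ = np.linalg.lstsq(Xa, y, rcond=None)
    resid = y - Xa @ coef
    return float(resid @ resid)

# Nested candidate models from a greedy forward path (a crude stand-in for
# a Lasso-type solution path): at each step, add the covariate that most
# reduces the residual sum of squares.
path, active = [], []
for _ in range(10):                        # candidate models of sizes 1..10
    best_j = min((j for j in range(p) if j not in active),
                 key=lambda j: rss(active + [j]))
    active = active + [best_j]
    path.append(list(active))

# BIC-type criterion with a penalty that grows in both n and p:
#     n * log(RSS / n) + |model| * log(n) * log(p)
def criterion(model):
    return n * np.log(rss(model) / n) + len(model) * np.log(n) * np.log(p)

best_model = min([[]] + path, key=criterion)
print(sorted(best_model))
```

Because the penalty grows with `log(p)` as well as `log(n)`, the criterion guards against overfitting among the many spurious covariates available when `p >> n`; with an ordinary BIC penalty of `log(n)` per variable, the selected model would tend to be too large in this regime.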

#### Article information

Source
Ann. Statist., Volume 44, Number 6 (2016), 2467-2496.

Dates
Revised: November 2015
First available in Project Euclid: 23 November 2016

https://projecteuclid.org/euclid.aos/1479891625

Digital Object Identifier
doi:10.1214/15-AOS1413

Mathematical Reviews number (MathSciNet)
MR3576551

Zentralblatt MATH identifier
1365.60030

#### Citation

Kim, Yongdai; Jeon, Jong-June. Consistent model selection criteria for quadratically supported risks. Ann. Statist. 44 (2016), no. 6, 2467--2496. doi:10.1214/15-AOS1413. https://projecteuclid.org/euclid.aos/1479891625

#### References

• [1] Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Second International Symposium on Information Theory (Tsahkadsor, 1971) 267–281. Akadémiai Kiadó, Budapest.
• [2] Belloni, A. and Chernozhukov, V. (2011). $\ell_{1}$-penalized quantile regression in high-dimensional sparse models. Ann. Statist. 39 82–130.
• [3] Broman, K. W. and Speed, T. P. (2002). A model selection approach for the identification of quantitative trait loci in experimental crosses. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 641–656.
• [4] Bühlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer Series in Statistics. Springer, Heidelberg.
• [5] Casella, G., Girón, F. J., Martínez, M. L. and Moreno, E. (2009). Consistency of Bayesian procedures for variable selection. Ann. Statist. 37 1207–1228.
• [6] Chen, J. and Chen, Z. (2008). Extended Bayesian information criteria for model selection with large model spaces. Biometrika 95 759–771.
• [7] Chen, J. and Chen, Z. (2012). Extended BIC for small-$n$-large-$P$ sparse GLM. Statist. Sinica 22 555–574.
• [8] Craven, P. and Wahba, G. (1978/79). Smoothing noisy data with spline functions. Estimating the correct degree of smoothing by the method of generalized cross-validation. Numer. Math. 31 377–403.
• [9] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348–1360.
• [10] Fan, J. and Peng, H. (2004). Nonconcave penalized likelihood with a diverging number of parameters. Ann. Statist. 32 928–961.
• [11] Fan, Y. and Tang, C. Y. (2013). Tuning parameter selection in high dimensional penalized likelihood. J. R. Stat. Soc. Ser. B. Stat. Methodol. 75 531–552.
• [12] Foster, D. P. and George, E. I. (1994). The risk inflation criterion for multiple regression. Ann. Statist. 22 1947–1975.
• [13] van de Geer, S. A. (2000). Empirical Processes in M-Estimation. Cambridge Series in Statistical and Probabilistic Mathematics 6. Cambridge Univ. Press, Cambridge.
• [14] He, X. and Shi, P. (1994). Convergence rate of $B$-spline estimators of nonparametric conditional quantile functions. J. Nonparametr. Statist. 3 299–308.
• [15] Huang, J., Ma, S. and Zhang, C.-H. (2008). Adaptive Lasso for sparse high-dimensional regression models. Statist. Sinica 18 1603–1618.
• [16] Kim, Y., Choi, H. and Oh, H.-S. (2008). Smoothly clipped absolute deviation on high dimensions. J. Amer. Statist. Assoc. 103 1665–1673.
• [17] Kim, Y. and Kwon, S. (2012). Global optimality of nonconvex penalized estimators. Biometrika 99 315–325.
• [18] Kim, Y., Kwon, S. and Choi, H. (2012). Consistent model selection criteria on high dimensions. J. Mach. Learn. Res. 13 1037–1057.
• [19] Koenker, R. (2005). Quantile Regression. Econometric Society Monographs 38. Cambridge Univ. Press, Cambridge.
• [20] Lee, E. R., Noh, H. and Park, B. U. (2014). Model selection via Bayesian information criterion for quantile regression models. J. Amer. Statist. Assoc. 109 216–229.
• [21] Li, K.-C. and Duan, N. (1989). Regression analysis under link violation. Ann. Statist. 17 1009–1052.
• [22] Portnoy, S. (1985). Asymptotic behavior of $M$ estimators of $p$ regression parameters when $p^{2}/n$ is large. II. Normal approximation. Ann. Statist. 13 1403–1417.
• [23] Rosset, S. and Zhu, J. (2007). Piecewise linear regularized solution paths. Ann. Statist. 35 1012–1030.
• [24] Scheetz, T. E., Kim, K.-Y. A., Swiderski, R. E., Philp, A. R., Braun, T. A., Knudtson, K. L., Dorrance, A. M., DiBona, G. F., Huang, J., Casavant, T. L., Sheffield, V. C. and Stone, E. M. (2006). Regulation of gene expression in the mammalian eye and its relevance to eye disease. Proc. Natl. Acad. Sci. USA 103 14429–14434.
• [25] Schwarz, G. (1978). Estimating the dimension of a model. Ann. Statist. 6 461–464.
• [26] Shao, J. (1997). An asymptotic theory for linear model selection. Statist. Sinica 7 221–264.
• [27] Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. J. Roy. Statist. Soc. Ser. B 36 111–147.
• [28] Wang, H., Li, B. and Leng, C. (2009). Shrinkage tuning parameter selection with a diverging number of parameters. J. R. Stat. Soc. Ser. B Stat. Methodol. 71 671–683.
• [29] Wang, L., Kim, Y. and Li, R. (2013). Calibrating nonconvex penalized regression in ultra-high dimension. Ann. Statist. 41 2505–2536.
• [30] Wang, L., Wu, Y. and Li, R. (2012). Quantile regression for analyzing heterogeneity in ultra-high dimension. J. Amer. Statist. Assoc. 107 214–222.
• [31] Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. Ann. Statist. 38 894–942.
• [32] Zhang, Y. and Shen, X. (2010). Model selection procedure for high-dimensional data. Stat. Anal. Data Min. 3 350–358.