Abstract
This paper is concerned with the selection of explanatory variables in multivariate linear regression. Akaike's information criterion and the $C_p$ criterion cannot be used in high-dimensional situations in which the dimension of the vector of response variables exceeds the sample size. To overcome this, we consider two variable selection criteria based on an $L_2$ squared distance with a weight matrix, namely the scalar-type generalized $C_p$ criterion and the ridge-type generalized $C_p$ criterion. We clarify conditions for their consistency under a hybrid-ultra-high-dimensional asymptotic framework in which the sample size always goes to infinity but the number of response variables need not go to infinity. Numerical experiments show that criteria satisfying the consistency conditions select the true subset with high probability even when the dimension is larger than the sample size. Finally, we illustrate the practical utility of these criteria using empirical data.
Funding Statement
The author is supported financially by Research Fellowships of the Japan Society for the Promotion of Science for Young Scientists.
Citation
Ryoya Oda. "Consistent variable selection criteria in multivariate linear regression even when dimension exceeds sample size." Hiroshima Math. J. 50 (3) 339 - 374, November 2020. https://doi.org/10.32917/hmj/1607396493
Information