The success of the Lasso in the era of high-dimensional data can be attributed to its performing implicit model selection, that is, zeroing out regression coefficients that are not significant. By contrast, classical ridge regression cannot reveal potential sparsity of the parameters, and may also introduce a large bias in the high-dimensional setting. Nevertheless, recent work on the Lasso involves debiasing and thresholding, the latter in order to further enhance model selection. As a consequence, ridge regression may be worth another look: after debiasing and thresholding, it may offer some advantages over the Lasso, for example, it can be easily computed via a closed-form expression. In this paper, we define a debiased and thresholded ridge regression method, and prove a consistency result and a Gaussian approximation theorem. We further introduce a wild bootstrap algorithm to construct confidence regions and perform hypothesis tests for linear combinations of the parameters. In addition to estimation, we consider the problem of prediction, and present a novel, hybrid bootstrap algorithm tailored to prediction intervals. Extensive numerical simulations further show that debiased and thresholded ridge regression has favorable finite-sample performance and may be preferable in some settings.
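To illustrate the closed-form computability mentioned above, here is a minimal sketch of a ridge estimator followed by hard thresholding. This is only an illustration of the two ingredients, not the paper's actual procedure: the debiasing step, the choice of the penalty `lam`, and the threshold `tau` used in the paper are omitted, and the values below are arbitrary assumptions.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam * I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def hard_threshold(beta, tau):
    """Zero out coefficients with absolute value at most tau."""
    return np.where(np.abs(beta) > tau, beta, 0.0)

# Toy sparse regression problem (synthetic data, for illustration only)
rng = np.random.default_rng(0)
n, p = 100, 20
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]          # only three active coefficients
X = rng.standard_normal((n, p))
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# Ridge fit, then hard thresholding to recover the sparsity pattern
beta_hat = hard_threshold(ridge_closed_form(X, y, lam=1.0), tau=0.5)
```

Unlike the Lasso, the ridge step requires no iterative solver; thresholding then supplies the model selection that plain ridge regression lacks.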
This research was partially supported by NSF Grant DMS-19-14556.
The authors are grateful to the anonymous referees for their valuable suggestions, which significantly improved the content of this article.
"Ridge regression revisited: Debiasing, thresholding and bootstrap." Ann. Statist. 50 (3) 1401 - 1422, June 2022. https://doi.org/10.1214/21-AOS2156