Electronic Journal of Statistics
- Electron. J. Statist.
- Volume 11, Number 2 (2017), 3226-3250.
Recovery of weak signal in high dimensional linear regression by data perturbation
Recovering weak signals (i.e., small nonzero regression coefficients) is a difficult task in high dimensional feature selection problems. Both convex and nonconvex regularization methods fail to fully recover the true model whenever the design matrix has strong columnwise correlations or some nonzero coefficients fall below a certain threshold. To address these two challenges, we propose a procedure, Perturbed LASSO (PLA), that weakens correlations in the design matrix and strengthens signals by adding random perturbations to the design matrix. Moreover, we derive a quantitative relationship between the selection accuracy and the computing cost of PLA. We prove theoretically and demonstrate through simulations that PLA substantially improves the chance of recovering weak signals and outperforms comparable methods at a limited computational cost.
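The abstract does not spell out the perturbation scheme, so the following is only an illustrative sketch of why adding random perturbations can weaken columnwise correlations. It assumes (this is not taken from the paper) that two distinct design columns x_j and x_k, with variances \sigma_j^2, \sigma_k^2, covariance \sigma_{jk} and correlation \rho_{jk}, each receive additive mean-zero noise z_j, z_k with common variance \tau^2, uncorrelated with each other and with the design. The symbols z, \tau and the tilde notation are hypothetical.

```latex
% Illustration only: assumes additive, mean-zero perturbations z_j, z_k with
% common variance \tau^2, uncorrelated with each other and with x_j, x_k.
% The actual PLA construction in the paper may differ.
\[
\tilde{x}_j = x_j + z_j, \qquad \tilde{x}_k = x_k + z_k, \qquad
\operatorname{Var}(z_j) = \operatorname{Var}(z_k) = \tau^2 .
\]
% The noise leaves the covariance unchanged but inflates each variance.
\[
\operatorname{Cov}(\tilde{x}_j, \tilde{x}_k) = \sigma_{jk}, \qquad
\operatorname{Var}(\tilde{x}_j) = \sigma_j^2 + \tau^2, \qquad
\operatorname{Var}(\tilde{x}_k) = \sigma_k^2 + \tau^2 .
\]
% Hence the correlation is shrunk by a factor strictly below one.
\[
\operatorname{Corr}(\tilde{x}_j, \tilde{x}_k)
  = \frac{\sigma_{jk}}{\sqrt{(\sigma_j^2 + \tau^2)(\sigma_k^2 + \tau^2)}}
  = \rho_{jk}\,
    \frac{\sigma_j \sigma_k}{\sqrt{(\sigma_j^2 + \tau^2)(\sigma_k^2 + \tau^2)}},
\qquad
\bigl|\operatorname{Corr}(\tilde{x}_j, \tilde{x}_k)\bigr| < |\rho_{jk}|
\quad \text{for } \tau > 0,\ \rho_{jk} \neq 0 .
\]
```

Under these assumptions a larger \tau gives a stronger attenuation of the correlation. The sketch covers only the correlation-weakening claim; how PLA strengthens signals and how it trades selection accuracy against computing cost are results of the paper that this derivation does not reproduce.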
Received: November 2016
First available in Project Euclid: 25 September 2017
Primary: 62J07: Ridge regression; shrinkage estimators
Zhang, Yongli. Recovery of weak signal in high dimensional linear regression by data perturbation. Electron. J. Statist. 11 (2017), no. 2, 3226-3250. doi:10.1214/17-EJS1320. https://projecteuclid.org/euclid.ejs/1506326416