The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 13, Number 4 (2019), 2065-2090.
Robust elastic net estimators for variable selection and identification of proteomic biomarkers
In large-scale quantitative proteomic studies, scientists measure the abundance of thousands of proteins from the human proteome in search of novel biomarkers for a given disease. Penalized regression estimators can be used to identify potential biomarkers among a large set of measured molecular features. Yet, the performance and statistical properties of these estimators depend on the loss and penalty functions used to define them. Motivated by a real plasma proteomic biomarker study, we propose a new class of penalized robust estimators based on the elastic net penalty, which can be tuned to keep groups of correlated variables together in the selected model and maintain robustness against possible outliers. We also propose an efficient algorithm to compute our robust penalized estimators and derive a data-driven method to select the penalty term. Our robust penalized estimators have very good robustness properties and are also consistent under certain regularity conditions. Numerical results show that our robust estimators compare favorably to other robust penalized estimators. Using our proposed methodology for the analysis of the proteomics data, we identify new potentially relevant biomarkers of cardiac allograft vasculopathy that are not found with nonrobust alternatives. The selected model is validated in a new set of 52 test samples and achieves an area under the receiver operating characteristic curve (AUC) of 0.85.
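The two ingredients named in the abstract, a loss that bounds the influence of outliers and an elastic net penalty, can be illustrated with a minimal sketch. This is not the authors' PENSE algorithm or its S-type loss: it is a simplified, M-type objective, and the function names, the `lam`/`alpha` parameterization, and the tuning constant `c` are all assumptions chosen for illustration.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Elastic net penalty: lam * (alpha * L1 + (1 - alpha) / 2 * squared L2).

    alpha = 1 gives the lasso penalty, alpha = 0 gives the ridge penalty.
    (This parameterization is an assumption, not taken from the paper.)
    """
    l1 = sum(abs(b) for b in beta)
    l2_sq = sum(b * b for b in beta)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_sq)


def tukey_bisquare_rho(r, c=4.685):
    """Tukey's bisquare loss, scaled to [0, 1].

    The loss is constant (equal to 1) for |r| >= c, so a single huge
    residual cannot dominate the objective. The default c is a
    conventional choice and an assumption here.
    """
    if abs(r) >= c:
        return 1.0
    u = (r / c) ** 2
    return 1.0 - (1.0 - u) ** 3


def penalized_robust_objective(X, y, beta, lam, alpha, c=4.685):
    """Mean bounded loss of the residuals plus the elastic net penalty.

    Simplified illustration only: no intercept and no residual scale
    estimate, both of which a real robust estimator would need.
    """
    residuals = [
        yi - sum(xij * bj for xij, bj in zip(xi, beta))
        for xi, yi in zip(X, y)
    ]
    loss = sum(tukey_bisquare_rho(r, c) for r in residuals) / len(residuals)
    return loss + elastic_net_penalty(beta, lam, alpha)
```

With this bounded loss, an observation with an arbitrarily large residual contributes at most 1 to the sum, whereas under squared-error loss its contribution would grow without limit.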
Received: March 2018
Revised: February 2019
First available in Project Euclid: 28 November 2019
Cohen Freue, Gabriela V.; Kepplinger, David; Salibián-Barrera, Matías; Smucler, Ezequiel. Robust elastic net estimators for variable selection and identification of proteomic biomarkers. Ann. Appl. Stat. 13 (2019), no. 4, 2065--2090. doi:10.1214/19-AOAS1269. https://projecteuclid.org/euclid.aoas/1574910036
- Supplementary material for “Robust elastic net estimators for variable selection and identification of proteomic biomarkers”. We provide additional details on the PENSE algorithm, its properties, and mathematical proofs.