Annals of Applied Statistics
Robust elastic net estimators for variable selection and identification of proteomic biomarkers
Gabriela V. Cohen Freue, David Kepplinger, Matías Salibián-Barrera, and Ezequiel Smucler
Abstract
In large-scale quantitative proteomic studies, scientists measure the abundance of thousands of proteins from the human proteome in search of novel biomarkers for a given disease. Penalized regression estimators can be used to identify potential biomarkers among a large set of molecular features measured. Yet, the performance and statistical properties of these estimators depend on the loss and penalty functions used to define them. Motivated by a real plasma proteomic biomarkers study, we propose a new class of penalized robust estimators based on the elastic net penalty, which can be tuned to keep groups of correlated variables together in the selected model and maintain robustness against possible outliers. We also propose an efficient algorithm to compute our robust penalized estimators and derive a data-driven method to select the penalty term. Our robust penalized estimators have very good robustness properties and are also consistent under certain regularity conditions. Numerical results show that our robust estimators compare favorably to other robust penalized estimators. Using our proposed methodology for the analysis of the proteomics data, we identify new potentially relevant biomarkers of cardiac allograft vasculopathy that are not found with nonrobust alternatives. The selected model is validated in a new set of 52 test samples and achieves an area under the receiver operating characteristic curve (AUC) of 0.85.
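As a rough illustration of the idea in the abstract that a penalized estimator is defined by a loss plus a penalty, the sketch below evaluates an elastic net penalty and adds it to an average loss over residuals. The abstract does not specify the robust loss used by the authors, so the Huber loss here is only a stand-in chosen for illustration, and every name (`elastic_net_penalty`, `huber_loss`, `penalized_objective`, `lam`, `alpha`, `delta`) is hypothetical rather than taken from the paper or its software.

```python
from typing import Callable, Sequence


def elastic_net_penalty(beta: Sequence[float], lam: float, alpha: float) -> float:
    """Elastic net penalty: lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2).

    alpha = 1 gives the lasso penalty, alpha = 0 gives the ridge penalty.
    """
    l1 = sum(abs(b) for b in beta)
    l2_sq = sum(b * b for b in beta)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_sq)


def squared_loss(r: float) -> float:
    """Classical (nonrobust) loss: grows quadratically with the residual."""
    return 0.5 * r * r


def huber_loss(r: float, delta: float = 1.345) -> float:
    """Huber loss, used here only as a stand-in for a robust loss (an assumption).

    Quadratic near zero, linear in the tails, so large residuals count less
    than under the squared loss.
    """
    a = abs(r)
    return 0.5 * r * r if a <= delta else delta * (a - 0.5 * delta)


def penalized_objective(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    intercept: float,
    beta: Sequence[float],
    lam: float,
    alpha: float,
    loss: Callable[[float], float],
) -> float:
    """Average loss of the residuals plus the elastic net penalty on beta."""
    residuals = [
        yi - intercept - sum(xij * bj for xij, bj in zip(xi, beta))
        for xi, yi in zip(X, y)
    ]
    return sum(loss(r) for r in residuals) / len(y) + elastic_net_penalty(beta, lam, alpha)


# Toy data: the last response is a gross outlier.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [1.0, 2.0, 3.0, 14.0]
beta = [1.0]

classical = penalized_objective(X, y, 0.0, beta, lam=0.1, alpha=0.5, loss=squared_loss)
robust = penalized_objective(X, y, 0.0, beta, lam=0.1, alpha=0.5, loss=huber_loss)
```

On this toy data the single outlying residual dominates the squared-loss objective far more than the Huber-based one, which is the general motivation for pairing a robust loss with the elastic net penalty. The estimators proposed in the paper may use a different loss and scaling than this sketch.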
Article information
Source
Ann. Appl. Stat., Volume 13, Number 4 (2019), 2065-2090.
Dates
Received: March 2018
Revised: February 2019
First available in Project Euclid: 28 November 2019
Permanent link to this document
https://projecteuclid.org/euclid.aoas/1574910036
Digital Object Identifier
doi:10.1214/19-AOAS1269
Mathematical Reviews number (MathSciNet)
MR4037422
Zentralblatt MATH identifier
07160931
Keywords
Robust estimation; regularized estimation; penalized estimation; elastic net penalty; proteomics biomarkers
Citation
Cohen Freue, Gabriela V.; Kepplinger, David; Salibián-Barrera, Matías; Smucler, Ezequiel. Robust elastic net estimators for variable selection and identification of proteomic biomarkers. Ann. Appl. Stat. 13 (2019), no. 4, 2065-2090. doi:10.1214/19-AOAS1269. https://projecteuclid.org/euclid.aoas/1574910036
Supplemental materials
- Supplementary material for “Robust elastic net estimators for variable selection and identification of proteomic biomarkers”. We provide additional details on the PENSE algorithm, its properties, and mathematical proofs. Digital Object Identifier: doi:10.1214/19-AOAS1269SUPP

