Open Access
2021 A MOM-based ensemble method for robustness, subsampling and hyperparameter tuning
Joon Kwon, Guillaume Lecué, Matthieu Lerasle
Electron. J. Statist. 15(1): 1202-1227 (2021). DOI: 10.1214/21-EJS1814

Abstract

Hyperparameter tuning and model selection are important steps in machine learning. Unfortunately, classical hyperparameter calibration and model selection procedures are sensitive to outliers and heavy-tailed data. In this work, we construct a selection procedure which is based on a median-of-means principle and can be seen as a robust alternative to cross-validation. Using this procedure, we also build an ensemble method which, given a set of algorithms and corrupted heavy-tailed data, selects an algorithm, trains it with a large uncorrupted subsample and automatically tunes its hyperparameters. In particular, the approach can transform any procedure into one that is robust to outliers and to heavy-tailed data while automatically tuning its hyperparameters.

The construction relies on a divide-and-conquer methodology, which makes the method easily scalable even on a corrupted dataset. The method is tested with the LASSO, which is known to be highly sensitive to outliers.
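To make the median-of-means principle mentioned in the abstract concrete, here is a minimal Python sketch. It is not the selection procedure of the paper, whose details are not given on this page; it only illustrates the generic idea of splitting the data into blocks, averaging within each block, and taking the median of the block averages. The function names `median_of_means` and `select_by_mom`, the random block assignment, and the toy loss values are all assumptions made for illustration.

```python
import random
import statistics


def median_of_means(values, n_blocks, seed=0):
    """Median-of-means estimate of the mean of `values`.

    Shuffle the values, split them into `n_blocks` blocks of nearly
    equal size, average each block, and return the median of the
    block averages. A few outliers can spoil only a few blocks, so
    the median is unaffected as long as most blocks stay clean.
    """
    if not 1 <= n_blocks <= len(values):
        raise ValueError("n_blocks must be between 1 and len(values)")
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    # Block k takes every n_blocks-th element, starting at index k.
    blocks = [shuffled[k::n_blocks] for k in range(n_blocks)]
    return statistics.median(statistics.fmean(b) for b in blocks)


def select_by_mom(losses_by_candidate, n_blocks, seed=0):
    """Hypothetical helper: pick the candidate whose per-sample
    validation losses have the smallest median-of-means estimate."""
    return min(
        losses_by_candidate,
        key=lambda name: median_of_means(losses_by_candidate[name], n_blocks, seed),
    )


# Toy losses: candidate "a" is better on clean points but has one
# corrupted observation; candidate "b" is uniformly worse.
losses = {
    "a": [1.0] * 99 + [1e6],
    "b": [2.0] * 100,
}
mom_choice = select_by_mom(losses, n_blocks=5)
mean_choice = min(losses, key=lambda name: statistics.fmean(losses[name]))
```

In this toy setting the single corrupted loss lands in exactly one of the five blocks, so the median of the block averages for candidate "a" is unchanged, whereas the plain average is dominated by the outlier and would favour candidate "b". Because each block is processed independently, the block averages could also be computed in parallel, which is in the spirit of the divide-and-conquer remark above.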

Funding Statement

The authors gratefully acknowledge financial support from Labex ECODEC (ANR-11-LABEX-0047).

Citation

Joon Kwon, Guillaume Lecué, Matthieu Lerasle. "A MOM-based ensemble method for robustness, subsampling and hyperparameter tuning." Electron. J. Statist. 15(1): 1202-1227, 2021. https://doi.org/10.1214/21-EJS1814

Information

Received: 1 September 2019; Published: 2021
First available in Project Euclid: 16 March 2021

Digital Object Identifier: 10.1214/21-EJS1814

Subjects:
Primary: 60K35, 62F35

Keywords: heavy-tailed, robustness
