Open Access
Reweighting samples under covariate shift using a Wasserstein distance criterion (2022)
Julien Reygner, Adrien Touboul
Electron. J. Statist. 16(1): 3278-3314 (2022). DOI: 10.1214/21-EJS1974

Abstract

Considering two random variables with different laws, to which we only have access through finite-size i.i.d. samples, we address how to reweight the first sample so that its empirical distribution converges towards the true law of the second sample as the size of both samples goes to infinity. We study an optimal reweighting that minimizes the Wasserstein distance between the empirical measures of the two samples and leads to an expression of the weights in terms of nearest neighbors. Consistency and some asymptotic convergence rates in terms of expected Wasserstein distance are derived; they do not require absolute continuity of the law of one random variable with respect to that of the other. These results have applications in uncertainty quantification for decoupled estimation and in bounding the generalization error of nearest neighbor regression under covariate shift.
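
The nearest-neighbor form of the weights mentioned in the abstract can be illustrated with a short sketch. It assumes the Euclidean distance, breaks ties by taking the lowest index, and uses hypothetical function names (`nn_weights`, `nn_wasserstein`) that do not come from the article. Under these assumptions, each point of the first sample receives a weight equal to the fraction of points of the second sample for which it is the nearest neighbor, and the resulting Wasserstein distance of order q is the q-th root of the mean q-th power of the nearest-neighbor distances. This is an illustration of the idea, not the authors' implementation.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def nn_weights(xs: List[Point], ys: List[Point]) -> List[float]:
    """Weight of xs[i] = fraction of ys whose nearest neighbor in xs is xs[i].

    Assumption: Euclidean distance; ties go to the lowest index.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    counts = [0] * len(xs)
    for y in ys:
        # min() returns the first minimizer, so ties go to the lowest index.
        nearest = min(range(len(xs)), key=lambda i: math.dist(xs[i], y))
        counts[nearest] += 1
    return [c / len(ys) for c in counts]


def nn_wasserstein(xs: List[Point], ys: List[Point], q: float = 1.0) -> float:
    """Order-q Wasserstein distance between the reweighted xs and uniform ys.

    Each y is transported to its nearest x, so the cost is the q-th root
    of the mean q-th power of the nearest-neighbor distances.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    total = sum(min(math.dist(x, y) for x in xs) ** q for y in ys)
    return (total / len(ys)) ** (1.0 / q)


# Toy one-dimensional samples (made up for illustration).
xs = [(0.0,), (1.0,), (5.0,)]
ys = [(0.1,), (0.9,), (1.2,), (1.1,)]
weights = nn_weights(xs, ys)  # [0.25, 0.75, 0.0]
```

In this toy example the point at 5.0 receives zero weight because no point of the second sample has it as its nearest neighbor. The brute-force search costs O(nm) distance evaluations; a spatial index such as a k-d tree would be a natural replacement for larger samples.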

Funding Statement

The research work of AT was carried out under the leadership of the Technological Research Institute SystemX, and was therefore supported by public funds within the scope of the French program ‘Investissements d’Avenir’.

Acknowledgments

This work was motivated by a collaboration with P. Benjamin, F. Mangeant and M. Yagoubi. We also benefited from fruitful discussions with G. Biau and A. Guyader. Lastly, we thank two anonymous referees for their careful reading of the article and their numerous suggestions, which allowed us to greatly improve the presentation of this work.

Citation

Julien Reygner, Adrien Touboul. "Reweighting samples under covariate shift using a Wasserstein distance criterion." Electron. J. Statist. 16 (1) 3278-3314, 2022. https://doi.org/10.1214/21-EJS1974

Information

Received: 1 October 2020; Published: 2022
First available in Project Euclid: 16 May 2022

MathSciNet: MR4421628
zbMATH: 07556932
Digital Object Identifier: 10.1214/21-EJS1974

Keywords: covariate shift, nearest neighbor distance, nearest neighbor regression, reweighting, uncertainty quantification, Wasserstein distance

Vol. 16 • No. 1 • 2022