Open Access
May 2012
Similarity of samples and trimming
Pedro C. Álvarez-Esteban, Eustasio del Barrio, Juan A. Cuesta-Albertos, Carlos Matrán
Bernoulli 18(2): 606-634 (May 2012). DOI: 10.3150/11-BEJ351

Abstract

We say that two probabilities are similar at level $\alpha$ if they are contaminated versions (up to an $\alpha$ fraction) of the same common probability. We show how this model is related to minimal distances between sets of trimmed probabilities. Empirical versions turn out to present an overfitting effect in the sense that trimming beyond the similarity level results in trimmed samples that are closer than expected to each other. We show how this can be combined with a bootstrap approach to assess similarity from two data samples.
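As an illustration of the similarity model only (not of the paper's trimmed-distance or bootstrap procedure), the sketch below treats two finitely supported distributions given as dictionaries of exact probabilities. It relies on a standard fact: two probabilities can be written as $\alpha$-contaminated versions of a common probability exactly when their total variation distance is at most $\alpha$, and the normalized pointwise minimum of the two then serves as a common part. The function names `common_part` and `similar_at_level` are hypothetical and chosen for this example.

```python
from fractions import Fraction


def common_part(p, q):
    """Return (d, p0) for two finitely supported distributions p and q.

    d is the total variation distance, computed as 1 minus the mass of the
    pointwise minimum of p and q. p0 is that minimum renormalized to a
    probability (None when p and q share no mass).
    """
    keys = set(p) | set(q)
    overlap = {k: min(p.get(k, 0), q.get(k, 0)) for k in keys}
    mass = sum(overlap.values())
    d = 1 - mass
    if mass == 0:
        return d, None
    p0 = {k: v / mass for k, v in overlap.items() if v > 0}
    return d, p0


def similar_at_level(p, q, alpha):
    """True when p and q are similar at level alpha (illustrative check)."""
    d, _ = common_part(p, q)
    return d <= alpha


# Toy data, made up for this example.
p = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
q = {"a": Fraction(1, 2), "b": Fraction(3, 10), "c": Fraction(1, 5)}
d, p0 = common_part(p, q)
```

With this toy data the shared mass is 4/5, so `p` and `q` are similar at level 1/5 but not at level 1/10. Exact fractions are used so that the boundary case `d == alpha` is not affected by floating-point rounding. Assessing similarity from samples, as the abstract describes, requires the trimming and bootstrap machinery of the paper, which this sketch does not attempt.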

Citation


Pedro C. Álvarez-Esteban, Eustasio del Barrio, Juan A. Cuesta-Albertos, Carlos Matrán. "Similarity of samples and trimming." Bernoulli 18 (2) 606-634, May 2012. https://doi.org/10.3150/11-BEJ351

Information

Published: May 2012
First available in Project Euclid: 16 April 2012

zbMATH: 1239.62005
MathSciNet: MR2922463
Digital Object Identifier: 10.3150/11-BEJ351

Keywords: asymptotics, bootstrap, consistency, mass transportation problem, over-fitting, robustness, similarity of distributions, trimmed probability, Wasserstein distance

Rights: Copyright © 2012 Bernoulli Society for Mathematical Statistics and Probability
