Inverse statistical learning

Sébastien Loustau

doi:10.1214/13-EJS838

Electron. J. Statist. 7: 2065-2097 (2013). DOI: 10.1214/13-EJS838

Abstract

Let $(X,Y)\in\mathcal{X}\times\mathcal{Y}$ be a random couple with unknown distribution $P$. Let $\mathcal{G}$ be a class of measurable functions and $\ell$ a loss function. The problem of statistical learning deals with the estimation of the Bayes rule: \[g^{*}=\arg\min_{g\in\mathcal{G}}\mathbb{E}_{P}\ell(g,(X,Y)).\] In this paper, we study this problem when only a contaminated sample $(Z_{1},Y_{1}),\dots,(Z_{n},Y_{n})$ of i.i.d. indirect observations is available. Each input $Z_{i}$, $i=1,\dots,n$, has density $Af$, where $A$ is a known compact linear operator and $f$ is the density of the direct input $X$.

We derive fast rates of convergence for the excess risk of empirical risk minimizers based on regularization methods, such as deconvolution kernel density estimators or spectral cut-off. These results are comparable to the existing fast rates in Koltchinskii (2006) for the direct case. They give some insight into the effect of indirect measurements in the presence of fast rates of convergence.
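
As an illustration of the deconvolution kernel density estimators mentioned above, the following minimal sketch treats one special case that is an assumption here, not a setting taken from the paper: $A$ is convolution with a known noise density, so that $Z=X+\epsilon$, with $\epsilon$ Laplace distributed with known scale $b$. With a Gaussian base kernel $\phi$ and bandwidth $h$, the deconvolution kernel then has the closed form $\phi(u)\,(1+(b/h)^{2}(1-u^{2}))$. The function name `deconvolution_kde`, the simulated data, and the values of $n$, $b$ and $h$ are all hypothetical.

```python
import numpy as np


def deconvolution_kde(x, z, h, b):
    """Estimate the density f of X at points x from noisy inputs z.

    Assumes Z = X + eps with eps ~ Laplace(0, b), b known, and a
    Gaussian base kernel. Under these assumptions the deconvolution
    kernel is phi(u) * (1 + (b / h)**2 * (1 - u**2)).
    """
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    # Pairwise scaled differences, shape (len(x), len(z)).
    u = (x[:, None] - z[None, :]) / h
    phi = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    # Correction term that undoes the Laplace smoothing.
    kernel = phi * (1.0 + (b / h) ** 2 * (1.0 - u**2))
    return kernel.mean(axis=1) / h


# Simulated example (assumed setup): X ~ N(0, 1), Laplace noise of scale b.
rng = np.random.default_rng(0)
n, b = 2000, 0.3
x_direct = rng.normal(0.0, 1.0, size=n)
z = x_direct + rng.laplace(0.0, b, size=n)

grid = np.linspace(-8.0, 8.0, 801)
f_hat = deconvolution_kde(grid, z, h=0.4, b=b)

# Riemann-sum approximation of the total mass of the estimate.
mass = float(np.sum(f_hat) * (grid[1] - grid[0]))
```

The estimate integrates to one but may take small negative values, since the deconvolution kernel is not nonnegative. An empirical risk built on such an estimator would replace the unobserved direct inputs by this smoothed, noise-corrected surrogate.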