Electronic Journal of Statistics

Inverse statistical learning

Sébastien Loustau

Full-text: Open access

Abstract

Let $(X,Y)\in\mathcal{X}\times\mathcal{Y}$ be a random couple with unknown distribution $P$. Let $\mathcal{G}$ be a class of measurable functions and $\ell$ a loss function. The problem of statistical learning deals with the estimation of the Bayes rule: \[g^{*}=\arg\min_{g\in\mathcal{G}}\mathbb{E}_{P}\ell(g,(X,Y)).\] In this paper, we study this problem for a contaminated sample $(Z_{1},Y_{1}),\dots,(Z_{n},Y_{n})$ of i.i.d. indirect observations. Each input $Z_{i}$, $i=1,\dots,n$, is drawn from the density $Af$, where $A$ is a known compact linear operator and $f$ is the density of the direct input $X$.

We derive fast rates of convergence for the excess risk of empirical risk minimizers based on regularization methods, such as deconvolution kernel density estimators or spectral cut-off. These rates are comparable to the existing fast rates of Koltchinskii (2006) for the direct case, and they shed light on the effect of indirect measurements on fast rates of convergence.
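To make the setting concrete, the sketch below implements one of the regularization methods mentioned above, a deconvolution kernel density estimator, in the special case where $A$ is convolution with a known Gaussian noise density (so $Z_i = X_i + \varepsilon_i$). It estimates $f$ from the contaminated inputs $Z_i$ by dividing the empirical characteristic function of the sample by that of the noise and inverting the Fourier transform with a sinc (spectral cut-off) kernel. The function name `deconv_kde`, the Gaussian noise assumption, and the bandwidth choice are illustrative, not taken from the paper.

```python
import numpy as np

def deconv_kde(z, sigma, x_grid, h):
    """Deconvolution kernel density estimate of the direct density f,
    assuming Z = X + eps with eps ~ N(0, sigma^2) known (Gaussian noise).
    Uses the sinc kernel, whose Fourier transform is 1 on [-1/h, 1/h]."""
    # frequency grid: the sinc kernel cuts off frequencies beyond 1/h
    t = np.linspace(-1.0 / h, 1.0 / h, 1001)
    # empirical characteristic function of the contaminated sample Z
    phi_Z = np.exp(1j * np.outer(t, z)).mean(axis=1)
    # known characteristic function of the Gaussian noise
    phi_eps = np.exp(-0.5 * (sigma * t) ** 2)
    ratio = phi_Z / phi_eps
    # Fourier inversion: f_hat(x) = (1/2pi) * integral of e^{-itx} ratio(t) dt
    dt = t[1] - t[0]
    f_hat = (np.exp(-1j * np.outer(x_grid, t)) @ ratio).real * dt / (2 * np.pi)
    return np.clip(f_hat, 0.0, None)  # a density cannot be negative
```

The bandwidth $h$ plays the role of the regularization parameter: a smaller $h$ admits higher frequencies, where $\phi_\varepsilon$ is small and the noise in $\hat\phi_Z$ is amplified, which is the usual bias-variance trade-off of an ill-posed inverse problem.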

Article information

Source
Electron. J. Statist., Volume 7 (2013), 2065-2097.

Dates
First available in Project Euclid: 20 August 2013

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1377005820

Digital Object Identifier
doi:10.1214/13-EJS838

Mathematical Reviews number (MathSciNet)
MR3091617

Zentralblatt MATH identifier
1349.62102

Subjects
Primary: 62G05: Estimation
Secondary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]

Keywords
Statistical learning; inverse problem; classification; deconvolution; fast rates

Citation

Loustau, Sébastien. Inverse statistical learning. Electron. J. Statist. 7 (2013), 2065–2097. doi:10.1214/13-EJS838. https://projecteuclid.org/euclid.ejs/1377005820

References

  • Audibert, J.-Y. and Tsybakov, A.B. Fast learning rates for plug-in classifiers., The Annals of Statistics, 35: 608–633, 2007.
  • Bartlett, P.L., Bousquet, O., and Mendelson, S. Local Rademacher complexities., The Annals of Statistics, 33(4): 1497–1537, 2005.
  • Bartlett, P.L. and Mendelson, S. Empirical minimization., Probability Theory and Related Fields, 135(3): 311–334, 2006.
  • Blanchard, G., Bousquet, O., and Massart, P. Statistical performance of support vector machines., The Annals of Statistics, 36(2): 489–531, 2008.
  • Bousquet, O. A Bennett concentration inequality and its application to suprema of empirical processes., C. R. Acad. Sci. Paris Sér. I Math., 334: 495–500, 2002.
  • Butucea, C. Goodness-of-fit testing and quadratic functional estimation from indirect observations., The Annals of Statistics, 35: 1907–1930, 2007.
  • Cavalier, L. Nonparametric statistical inverse problems., Inverse Problems, 24: 1–19, 2008.
  • Delaigle, A., Hall, P., and Meister, A. On deconvolution with repeated measurements., The Annals of Statistics, 36(2): 665–685, 2008.
  • Devroye, L., Györfi, L., and Lugosi, G., A Probabilistic Theory of Pattern Recognition. Springer-Verlag, 1996.
  • Engl, H.W., Hanke, M., and Neubauer, A., Regularization of Inverse Problems. Kluwer Academic Publishers Group, Dordrecht, 1996.
  • Fan, J. On the optimal rates of convergence for nonparametric deconvolution problems., The Annals of Statistics, 19: 1257–1272, 1991.
  • Koltchinskii, V. Local Rademacher complexities and oracle inequalities in risk minimization., The Annals of Statistics, 34(6): 2593–2656, 2006.
  • Lecué, G. and Mendelson, S. General non-exact oracle inequalities for classes with a subexponential envelope., The Annals of Statistics, 40(2): 832–860, 2012.
  • Lederer, J. and van de Geer, S. New concentration inequalities for suprema of empirical processes. Submitted, 2012.
  • Loustau, S. Penalized empirical risk minimization over Besov spaces., Electronic Journal of Statistics, 3: 824–850, 2009.
  • Loustau, S. Fast rates for noisy clustering., http://hal.archives-ouvertes.fr/hal-00695258, 2012.
  • Loustau, S. and Marteau, C. Discriminant analysis with errors in variables., http://hal.archives-ouvertes.fr/hal-00660383, 2012.
  • Loustau, S. and Marteau, C. Minimax fast rates in discriminant analysis with errors in variables., In revision to Bernoulli, 2013.
  • Mammen, E. and Tsybakov, A.B. Smooth discrimination analysis., The Annals of Statistics, 27(6): 1808–1829, 1999.
  • Massart, P. Some applications of concentration inequalities to statistics., Ann. Fac. Sci. Toulouse Math., 9(2): 245–303, 2000.
  • Massart, P. and Nédélec, E. Risk bounds for statistical learning., The Annals of Statistics, 34(5): 2326–2366, 2006.
  • Meister, A., Deconvolution Problems in Nonparametric Statistics. Springer-Verlag, 2009.
  • Polonik, W. Measuring mass concentrations and estimating density contour clusters – An excess mass approach., The Annals of Statistics, 23(3): 855–881, 1995.
  • Talagrand, M. New concentration inequalities in product spaces., Invent. Math., 126: 505–563, 1996.
  • Tsybakov, A.B., Introduction à l’estimation non-paramétrique. Springer-Verlag, 2004a.
  • Tsybakov, A.B. Optimal aggregation of classifiers in statistical learning., The Annals of Statistics, 32(1): 135–166, 2004b.
  • Tsybakov, A.B. and van de Geer, S. Square root penalty: Adaptation to the margin in classification and in edge estimation., The Annals of Statistics, 33(3): 1203–1224, 2005.
  • van de Geer, S., Empirical Processes in M-estimation. Cambridge University Press, 2000.
  • van der Vaart, A.W. and Wellner, J.A., Weak Convergence and Empirical Processes. With Applications to Statistics. Springer-Verlag, 1996.
  • Vapnik, V., Estimation of Dependences Based on Empirical Data. Springer-Verlag, 1982.
  • Vapnik, V., The Nature of Statistical Learning Theory. Statistics for Engineering and Information Science, Springer, 2000.
  • Williamson, R.C., Smola, A.J., and Schölkopf, B. Generalization performance of regularization networks and support vector machines via entropy numbers of compact operators., IEEE Transactions on Information Theory, 47(6): 2516–2532, 2001.