The Annals of Statistics

Empirical risk minimization in inverse problems

Jussi Klemelä and Enno Mammen

Full-text: Open access


We study estimation of a multivariate function f : R^d → R when the observations are available from the function Af, where A is a known linear operator. Both the Gaussian white noise model and density estimation are studied. We define an L2-empirical risk functional which is used to define a δ-net minimizer and a dense empirical risk minimizer. Upper bounds for the mean integrated squared error of the estimators are given. The upper bounds show how the difficulty of the estimation depends on the operator through the norm of the adjoint of the inverse of the operator and on the underlying function class through the entropy of the class. Corresponding lower bounds are also derived. As examples, we consider convolution operators and the Radon transform. In these examples, the estimators achieve the optimal rates of convergence. Furthermore, a new type of oracle inequality is given for inverse problems in additive models.
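As a rough illustration of the idea, the following sketch minimizes an L2-type empirical risk over a finite net in a toy diagonal (sequence-space) model. Everything concrete here is an assumption made for illustration and is not taken from the paper: the risk is written as ||g||^2 - 2<Qg, y> with Q the adjoint of the inverse of A, the operator is taken to be diagonal with hypothetical singular values `b`, and the names `empirical_risk`, `net_minimizer` and `demo` are invented. With noise-free data this risk equals ||g - f||^2 - ||f||^2, so its minimizer over the net is the net member closest to f.

```python
import math
import random


def empirical_risk(g, y, b):
    """L2-type empirical risk ||g||^2 - 2 <Q g, y> in a diagonal model.

    Assumption: A multiplies coordinate k by b[k], so (Q g)[k] = g[k] / b[k].
    """
    norm_sq = sum(gk * gk for gk in g)
    cross = sum((gk / bk) * yk for gk, bk, yk in zip(g, b, y))
    return norm_sq - 2.0 * cross


def net_minimizer(net, y, b):
    """Return the member of a finite net with the smallest empirical risk."""
    return min(net, key=lambda g: empirical_risk(g, y, b))


def demo(n=10_000, seed=0):
    rng = random.Random(seed)
    b = [1.0, 0.5, 0.25]   # hypothetical decaying singular values of A
    f = [1.0, -1.0, 0.5]   # target coefficients, unknown to the estimator
    net = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], f, [0.0, -1.0, 0.0]]
    # Observe A f plus Gaussian noise of level n^{-1/2} in each coordinate.
    y = [bk * fk + rng.gauss(0.0, 1.0) / math.sqrt(n) for bk, fk in zip(b, f)]
    return f, net_minimizer(net, y, b)
```

Dividing by `b[k]` inflates the noise in coordinates where the operator is small, which is one way to see why the bounds depend on the norm of the adjoint of the inverse of the operator.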

Article information

Ann. Statist., Volume 38, Number 1 (2010), 482-511.

First available in Project Euclid: 31 December 2009

Primary: 62G07: Density estimation

Keywords: Deconvolution; empirical risk minimization; multivariate density estimation; nonparametric function estimation; Radon transform; tomography


Klemelä, Jussi; Mammen, Enno. Empirical risk minimization in inverse problems. Ann. Statist. 38 (2010), no. 1, 482–511. doi:10.1214/09-AOS726.
