Statistical Science

Minimax and Adaptive Inference in Nonparametric Function Estimation

T. Tony Cai


Abstract

Since Stein’s 1956 seminal paper, shrinkage has played a fundamental role in both parametric and nonparametric inference. This article discusses minimaxity and adaptive minimaxity in nonparametric function estimation. Three interrelated problems, function estimation under global integrated squared error, estimation under pointwise squared error, and nonparametric confidence intervals, are considered. Shrinkage is pivotal in the development of both the minimax theory and the adaptation theory.

While the three problems are closely connected and the minimax theories bear some similarities, the adaptation theories are strikingly different. For example, in sharp contrast to adaptive point estimation, in many common settings there do not exist nonparametric confidence intervals that adapt to the unknown smoothness of the underlying function. A concise account of these theories is given. The connections as well as differences among these problems are discussed and illustrated through examples.

Article information

Source
Statist. Sci., Volume 27, Number 1 (2012), 31-50.

Dates
First available in Project Euclid: 14 March 2012

Permanent link to this document
https://projecteuclid.org/euclid.ss/1331729981

Digital Object Identifier
doi:10.1214/11-STS355

Mathematical Reviews number (MathSciNet)
MR2953494

Zentralblatt MATH identifier
1330.62059

Keywords
Adaptation; adaptive estimation; Bayes minimax; Besov ball; block thresholding; confidence interval; ellipsoid; information pooling; linear functional; linear minimaxity; minimax; nonparametric regression; oracle; separable rules; sequence model; shrinkage; thresholding; wavelet; white noise model

Citation

Cai, T. Tony. Minimax and Adaptive Inference in Nonparametric Function Estimation. Statist. Sci. 27 (2012), no. 1, 31–50. doi:10.1214/11-STS355. https://projecteuclid.org/euclid.ss/1331729981



References

  • Abramovich, F., Benjamini, Y., Donoho, D. L. and Johnstone, I. M. (2006). Adapting to unknown sparsity by controlling the false discovery rate. Ann. Statist. 34 584–653.
  • Beran, R. and Dümbgen, L. (1998). Modulation of estimators and confidence sets. Ann. Statist. 26 1826–1856.
  • Birgé, L. and Massart, P. (1995). Estimation of integral functionals of a density. Ann. Statist. 23 11–29.
  • Brown, L. D. and Low, M. G. (1991). Information inequality bounds on the minimax risk (with an application to nonparametric regression). Ann. Statist. 19 329–337.
  • Brown, L. D. and Low, M. G. (1996a). Asymptotic equivalence of nonparametric regression and white noise. Ann. Statist. 24 2384–2398.
  • Brown, L. D. and Low, M. G. (1996b). A constrained risk inequality with applications to nonparametric functional estimation. Ann. Statist. 24 2524–2535.
  • Brown, L. D., Low, M. G. and Zhao, L. H. (1997). Superefficiency in nonparametric function estimation. Ann. Statist. 25 2607–2625.
  • Brown, L. D., Cai, T. T., Low, M. G. and Zhang, C.-H. (2002). Asymptotic equivalence theory for nonparametric regression with random design. Ann. Statist. 30 688–707.
  • Brown, L. D., Carter, A. V., Low, M. G. and Zhang, C.-H. (2004). Equivalence theory for density estimation, Poisson processes and Gaussian white noise with drift. Ann. Statist. 32 2074–2097.
  • Cai, T. T. (1999). Adaptive wavelet estimation: A block thresholding and oracle inequality approach. Ann. Statist. 27 898–924.
  • Cai, T. T. (2003). Rates of convergence and adaptation over Besov spaces under pointwise risk. Statist. Sinica 13 881–902.
  • Cai, T. T. (2008). On information pooling, adaptability and superefficiency in nonparametric function estimation. J. Multivariate Anal. 99 421–436.
  • Cai, T. and Low, M. (2007). Adaptive estimation and confidence intervals for convex functions. Technical report, Dept. Statistics, Univ. Pennsylvania.
  • Cai, T. T. and Low, M. G. (2003). A note on nonparametric estimation of linear functionals. Ann. Statist. 31 1140–1153.
  • Cai, T. T. and Low, M. G. (2004a). Minimax estimation of linear functionals over nonconvex parameter spaces. Ann. Statist. 32 552–576.
  • Cai, T. T. and Low, M. G. (2004b). An adaptation theory for nonparametric confidence intervals. Ann. Statist. 32 1805–1840.
  • Cai, T. T. and Low, M. G. (2005a). On adaptive estimation of linear functionals. Ann. Statist. 33 2311–2343.
  • Cai, T. T. and Low, M. G. (2005b). Nonquadratic estimators of a quadratic functional. Ann. Statist. 33 2930–2956.
  • Cai, T. T. and Low, M. G. (2005c). Adaptive estimation of linear functionals under different performance measures. Bernoulli 11 341–358.
  • Cai, T. T. and Low, M. G. (2006a). Adaptive confidence balls. Ann. Statist. 34 202–228.
  • Cai, T. T. and Low, M. G. (2006b). Optimal adaptive estimation of a quadratic functional. Ann. Statist. 34 2298–2325.
  • Cai, T. T., Low, M. G. and Zhao, L. H. (2007). Trade-offs between global and local risks in nonparametric function estimation. Bernoulli 13 1–19.
  • Cai, T. T., Low, M. G. and Zhao, L. H. (2009). Sharp adaptive estimation by a blockwise method. J. Nonparametr. Stat. 21 839–850.
  • Cai, T. T. and Silverman, B. W. (2001). Incorporating information on neighbouring coefficients into wavelet estimation. Sankhyā Ser. B 63 127–148.
  • Cai, T. T. and Zhou, H. H. (2009). A data-driven block thresholding approach to wavelet estimation. Ann. Statist. 37 569–595.
  • Cavalier, L. and Tsybakov, A. (2002). Sharp adaptation for inverse problems with random noise. Probab. Theory Related Fields 123 323–354.
  • Cavalier, L., Golubev, Y., Lepski, O. and Tsybakov, A. (2003). Block thresholding and sharp adaptive estimation in severely ill-posed inverse problems. Theory Probab. Appl. 48 534–556.
  • Daubechies, I. (1992). Ten Lectures on Wavelets. CBMS-NSF Regional Conference Series in Applied Mathematics 61. SIAM, Philadelphia, PA.
  • Delyon, B. and Juditsky, A. (1996). On minimax wavelet estimators. Appl. Comput. Harmon. Anal. 3 215–228.
  • DeVore, R. A. and Lorentz, G. G. (1993). Constructive Approximation. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] 303. Springer, Berlin.
  • Donoho, D. L. (1994). Statistical estimation and optimal recovery. Ann. Statist. 22 238–270.
  • Donoho, D. L. and Johnstone, I. M. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika 81 425–455.
  • Donoho, D. L. and Johnstone, I. M. (1995). Adapting to unknown smoothness via wavelet shrinkage. J. Amer. Statist. Assoc. 90 1200–1224.
  • Donoho, D. L. and Johnstone, I. M. (1998). Minimax estimation via wavelet shrinkage. Ann. Statist. 26 879–921.
  • Donoho, D. L. and Liu, R. C. (1987). Geometrizing rates of convergence I. Technical Report 137, Dept. Statistics, Univ. California, Berkeley.
  • Donoho, D. L. and Liu, R. C. (1991). Geometrizing rates of convergence. III. Ann. Statist. 19 668–701.
  • Donoho, D. L., Liu, R. C. and MacGibbon, B. (1990). Minimax risk over hyperrectangles, and implications. Ann. Statist. 18 1416–1437.
  • Donoho, D. L. and Low, M. G. (1992). Renormalization exponents and optimal pointwise rates of convergence. Ann. Statist. 20 944–970.
  • Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. and Picard, D. (1995). Wavelet shrinkage: Asymptopia? (with discussion). J. Roy. Statist. Soc. Ser. B 57 301–369.
  • Dümbgen, L. (1998). New goodness-of-fit tests and their application to nonparametric confidence sets. Ann. Statist. 26 288–314.
  • Efromovich, S. Y. (1985). Nonparametric estimation of a density of unknown smoothness. Theory Probab. Appl. 30 557–661.
  • Efromovich, S. Y. and Pinsker, M. S. (1982). Estimation of square-integrable probability density of a random variable. Probl. Inf. Transm. 18 19–38.
  • Efromovich, S. (1997a). Robust and efficient recovery of a signal passed through a filter and then contaminated by non-Gaussian noise. IEEE Trans. Inform. Theory 43 1184–1191.
  • Efromovich, S. (1997b). Density estimation for the case of supersmooth measurement error. J. Amer. Statist. Assoc. 92 526–535.
  • Efromovich, S. (2000). On sharp adaptive estimation of multivariate curves. Math. Methods Statist. 9 117–139.
  • Efromovich, S. and Koltchinskii, V. (2001). On inverse problems with unknown operators. IEEE Trans. Inform. Theory 47 2876–2894.
  • Efromovich, S. and Low, M. G. (1994). Adaptive estimates of linear functionals. Probab. Theory Related Fields 98 261–275.
  • Efromovich, S. Y. and Pinsker, M. S. (1984). Learning algorithm for nonparametric filtering. Autom. Remote Control 11 1434–1440.
  • Efron, B. and Morris, C. (1973). Stein’s estimation rule and its competitors—an empirical Bayes approach. J. Amer. Statist. Assoc. 68 117–130.
  • Farrell, R. H. (1972). On the best obtainable asymptotic rates of convergence in estimation of a density function at a point. Ann. Math. Statist. 43 170–180.
  • Foster, D. P. and George, E. I. (1994). The risk inflation criterion for multiple regression. Ann. Statist. 22 1947–1975.
  • Gao, H.-Y. (1998). Wavelet shrinkage denoising using the non-negative garrote. J. Comput. Graph. Statist. 7 469–488.
  • Genovese, C. R. and Wasserman, L. (2005). Confidence sets for nonparametric wavelet regression. Ann. Statist. 33 698–729.
  • Genovese, C. and Wasserman, L. (2008). Adaptive confidence bands. Ann. Statist. 36 875–905.
  • Hall, P., Kerkyacharian, G. and Picard, D. (1998). Block threshold rules for curve estimation using kernel and wavelet methods. Ann. Statist. 26 922–942.
  • Hall, P., Kerkyacharian, G. and Picard, D. (1999). On the minimax optimality of block thresholded wavelet estimators. Statist. Sinica 9 33–49.
  • Has’minskiĭ, R. Z. (1979). Lower bound for the risks of nonparametric estimates of the mode. In Contributions to Statistics (J. Jureckova, ed.) 91–97. Reidel, Dordrecht.
  • Hengartner, N. W. and Stark, P. B. (1995). Finite-sample confidence envelopes for shape-restricted densities. Ann. Statist. 23 525–550.
  • Ibragimov, I. A. and Hasminskii, R. Z. (1984). Nonparametric estimation of the values of a linear functional in Gaussian white noise. Theory Probab. Appl. 31 391–406.
  • Johnstone, I. M. (2002). Function estimation and Gaussian sequence model. Unpublished manuscript.
  • Johnstone, I. M. and Silverman, B. W. (2005). Empirical Bayes selection of wavelet thresholds. Ann. Statist. 33 1700–1752.
  • Kang, Y.-G. and Low, M. G. (2002). Estimating monotone functions. Statist. Probab. Lett. 56 361–367.
  • Kerkyacharian, G., Picard, D. and Tribouley, K. (1996). Lp adaptive density estimation. Bernoulli 2 229–247.
  • Klemelä, J. and Nussbaum, M. (1999). Constructive asymptotic equivalence of density estimation and Gaussian white noise. Discussion Paper No. 53, Sonderforschungsbereich 373, Humboldt Univ., Berlin.
  • LeCam, L. (1953). On some asymptotic properties of maximum likelihood estimates and related Bayes’ estimates. Univ. California Publ. Statist. 1 277–329.
  • Lepski, O. V. and Levit, B. Y. (1998). Adaptive minimax estimation of infinitely differentiable functions. Math. Methods Statist. 7 123–156.
  • Lepskiĭ, O. V. (1990). A problem of adaptive estimation in Gaussian white noise. Theory Probab. Appl. 35 454–466.
  • Li, K.-C. (1989). Honest confidence regions for nonparametric regression. Ann. Statist. 17 1001–1008.
  • Low, M. G. (1992). Renormalization and white noise approximation for nonparametric functional estimation problems. Ann. Statist. 20 545–554.
  • Low, M. G. (1997). On nonparametric confidence intervals. Ann. Statist. 25 2547–2554.
  • Meyer, Y. (1992). Wavelets and Operators. Cambridge Studies in Advanced Mathematics 37. Cambridge Univ. Press, Cambridge.
  • Nussbaum, M. (1985). Spline smoothing in regression models and asymptotic efficiency in L2. Ann. Statist. 13 984–997.
  • Nussbaum, M. (1996). Asymptotic equivalence of density estimation and Gaussian white noise. Ann. Statist. 24 2399–2430.
  • Pinsker, M. S. (1980). Optimal filtration of square-integrable signals in Gaussian noise. Problems Inform. Transmission 16 53–68.
  • Robins, J. and van der Vaart, A. (2006). Adaptive nonparametric confidence sets. Ann. Statist. 34 229–253.
  • Stein, C. (1956). Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, 1954–1955, Vol. I 197–206. Univ. California Press, Berkeley.
  • Stein, C. M. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Statist. 9 1135–1151.
  • Stone, C. J. (1980). Optimal rates of convergence for nonparametric estimators. Ann. Statist. 8 1348–1360.
  • Triebel, H. (1992). Theory of Function Spaces. II. Monographs in Mathematics 84. Birkhäuser, Basel.
  • Tsybakov, A. B. (1998). Pointwise and sup-norm sharp adaptive estimation of functions on the Sobolev classes. Ann. Statist. 26 2420–2469.
  • Zhang, C.-H. (2005). General empirical Bayes wavelet methods and exactly adaptive minimax estimation. Ann. Statist. 33 54–100.