The Annals of Statistics

On Bayesian supremum norm contraction rates

Ismaël Castillo

Full-text: Open access

Abstract

Building on ideas from Castillo and Nickl [Ann. Statist. 41 (2013) 1999–2028], a method is provided to study nonparametric Bayesian posterior convergence rates when “strong” measures of distances, such as the sup-norm, are considered. In particular, we show that likelihood methods can achieve optimal minimax sup-norm rates in density estimation on the unit interval. The introduced methodology is used to prove that commonly used families of prior distributions on densities, namely log-density priors and dyadic random density histograms, can indeed achieve optimal sup-norm rates of convergence. New results are also derived in the Gaussian white noise model as a further illustration of the presented techniques.
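
For orientation, the sup-norm statement summarized above can be written out in standard notation. This is a sketch, not the paper's exact formulation: the symbols $f_0$ (true density), $\Pi(\cdot \mid X_1,\dots,X_n)$ (posterior distribution), $\alpha$ (Hölder smoothness of $f_0$), $M$ (a sufficiently large constant) and $\varepsilon_n$ (the rate) are assumptions introduced here for illustration, and the precise conditions in the paper may differ.

```latex
% Posterior contraction at rate \varepsilon_n in the supremum norm
% (notation assumed for illustration, not taken from the paper):
\[
  E_{f_0}\, \Pi\bigl( f : \|f - f_0\|_\infty > M \varepsilon_n \,\big|\, X_1,\dots,X_n \bigr)
  \longrightarrow 0 \quad (n \to \infty),
  \qquad
  \|f - f_0\|_\infty = \sup_{x \in [0,1]} |f(x) - f_0(x)| .
\]

% Classical minimax rate in sup-norm for \alpha-Hölder densities on [0,1]:
\[
  \varepsilon_n^{*} = \left( \frac{\log n}{n} \right)^{\alpha/(2\alpha+1)} .
\]
```

The logarithmic factor is what separates the sup-norm rate from the corresponding $L^2$ rate $n^{-\alpha/(2\alpha+1)}$.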

Article information

Source
Ann. Statist., Volume 42, Number 5 (2014), 2058–2091.

Dates
First available in Project Euclid: 11 September 2014

Permanent link to this document
https://projecteuclid.org/euclid.aos/1410440634

Digital Object Identifier
doi:10.1214/14-AOS1253

Mathematical Reviews number (MathSciNet)
MR3262477

Zentralblatt MATH identifier
1305.62189

Subjects
Primary: 62G20: Asymptotic properties
Secondary: 62G05: Estimation; 62G07: Density estimation

Keywords
Bayesian nonparametrics; contraction rates; supremum norm

Citation

Castillo, Ismaël. On Bayesian supremum norm contraction rates. Ann. Statist. 42 (2014), no. 5, 2058–2091. doi:10.1214/14-AOS1253. https://projecteuclid.org/euclid.aos/1410440634


References

  • [1] Barron, A. (1988). Convergence of Bayes estimators. Technical Report 7, Univ. Illinois at Urbana–Champaign.
  • [2] Barron, A., Schervish, M. J. and Wasserman, L. (1999). The consistency of posterior distributions in nonparametric problems. Ann. Statist. 27 536–561.
  • [3] Castillo, I. (2008). Lower bounds for posterior rates with Gaussian process priors. Electron. J. Stat. 2 1281–1299.
  • [4] Castillo, I. (2012). A semiparametric Bernstein–von Mises theorem for Gaussian process priors. Probab. Theory Related Fields 152 53–99.
  • [5] Castillo, I. (2012). Semiparametric Bernstein–von Mises theorem and bias, illustrated with Gaussian process priors. Sankhya 74 194–221.
  • [6] Castillo, I. and Nickl, R. (2013). Nonparametric Bernstein–von Mises theorems in Gaussian white noise. Ann. Statist. 41 1999–2028.
  • [7] Castillo, I. and Nickl, R. (2014). On the Bernstein–von Mises phenomenon for nonparametric Bayes procedures. Ann. Statist. 42 1941–1969.
  • [8] Castillo, I. and Rousseau, J. (2013). A general Bernstein–von Mises theorem in semiparametric models. Preprint. Available at arXiv:1305.4482.
  • [9] Choy, S. T. B. and Smith, A. F. M. (1997). On robust analysis of a normal location parameter. J. Roy. Statist. Soc. Ser. B 59 463–474.
  • [10] Cohen, A., Daubechies, I. and Vial, P. (1993). Wavelets on the interval and fast wavelet transforms. Appl. Comput. Harmon. Anal. 1 54–81.
  • [11] Gasparini, M. (1996). Bayesian density estimation via Dirichlet density processes. J. Nonparametr. Stat. 6 355–366.
  • [12] Ghosal, S. (2001). Convergence rates for density estimation with Bernstein polynomials. Ann. Statist. 29 1264–1280.
  • [13] Ghosal, S., Ghosh, J. K. and van der Vaart, A. W. (2000). Convergence rates of posterior distributions. Ann. Statist. 28 500–531.
  • [14] Ghosal, S. and van der Vaart, A. (2007). Posterior convergence rates of Dirichlet mixtures at smooth densities. Ann. Statist. 35 697–723.
  • [15] Ghosal, S. and van der Vaart, A. (2007). Convergence rates of posterior distributions for non-i.i.d. observations. Ann. Statist. 35 192–223.
  • [16] Giné, E. and Nickl, R. (2011). Rates of contraction for posterior distributions in $L^r$-metrics, $1\leq r\leq\infty$. Ann. Statist. 39 2883–2911.
  • [17] Goldenshluger, A. and Lepski, O. (2014). On adaptive minimax density estimation on $\mathbb{R}^d$. Probab. Theory Related Fields 159 479–543.
  • [18] Härdle, W., Kerkyacharian, G., Picard, D. and Tsybakov, A. (1998). Wavelets, Approximation, and Statistical Applications. Lecture Notes in Statistics 129. Springer, New York.
  • [19] Has’minskiĭ, R. Z. (1978). A lower bound for risks of nonparametric density estimates in the uniform metric. Teor. Veroyatn. Primen. 23 824–828.
  • [20] Hoffmann, M., Rousseau, J. and Schmidt-Hieber, J. (2013). On adaptive posterior concentration rates. Preprint. Available at arXiv:1305.5270.
  • [21] Ibragimov, I. A. and Has’minskiĭ, R. Z. (1980). An estimate of the density of a distribution. Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 98 61–85, 161–162, 166.
  • [22] Lavine, M. (1992). Some aspects of Pólya tree distributions for statistical modelling. Ann. Statist. 20 1222–1235.
  • [23] Lenk, P. J. (1988). The logistic normal distribution for Bayesian, nonparametric, predictive densities. J. Amer. Statist. Assoc. 83 509–516.
  • [24] Leonard, T. (1973). A Bayesian method for histograms. Biometrika 60 297–308.
  • [25] Nadarajah, S. (2003). The Kotz-type distribution with applications. Statistics 37 341–358.
  • [26] Nickl, R. (2007). Donsker-type theorems for nonparametric maximum likelihood estimators. Probab. Theory Related Fields 138 411–449.
  • [27] Rivoirard, V. and Rousseau, J. (2012). Bernstein–von Mises theorem for linear functionals of the density. Ann. Statist. 40 1489–1523.
  • [28] Rivoirard, V. and Rousseau, J. (2012). Posterior concentration rates for infinite dimensional exponential families. Bayesian Anal. 7 311–333.
  • [29] Scricciolo, C. (2006). Convergence rates for Bayesian density estimation of infinite-dimensional exponential families. Ann. Statist. 34 2897–2920.
  • [30] Scricciolo, C. (2007). On rates of convergence for Bayesian density estimation. Scand. J. Stat. 34 626–642.
  • [31] Scricciolo, C. (2014). Adaptive Bayesian density estimation in $L^p$ metrics with Pitman–Yor or normalized inverse-Gaussian process kernel mixtures. Bayesian Anal. 9 475–520.
  • [32] Shen, X. and Wasserman, L. (2001). Rates of convergence of posterior distributions. Ann. Statist. 29 687–714.
  • [33] Stone, C. J. (1982). Optimal global rates of convergence for nonparametric regression. Ann. Statist. 10 1040–1053.
  • [34] Tokdar, S. T. (2007). Towards a faster implementation of density estimation with logistic Gaussian process priors. J. Comput. Graph. Statist. 16 633–655.
  • [35] Tokdar, S. T. and Ghosh, J. K. (2007). Posterior consistency of logistic Gaussian process priors in density estimation. J. Statist. Plann. Inference 137 34–42.
  • [36] Triebel, H. (1983). Theory of Function Spaces. Monographs in Mathematics 78. Birkhäuser, Basel.
  • [37] van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics 3. Cambridge Univ. Press, Cambridge.
  • [38] van der Vaart, A. W. and van Zanten, J. H. (2008). Rates of contraction of posterior distributions based on Gaussian process priors. Ann. Statist. 36 1435–1463.
  • [39] Walker, S. (2004). New approaches to Bayesian consistency. Ann. Statist. 32 2028–2043.
  • [40] Walker, S. G. and Gutiérrez-Peña, E. (1999). Robustifying Bayesian procedures. In Bayesian Statistics, 6 (Alcoceber, 1998) 685–710. Oxford Univ. Press, New York.