Electronic Journal of Statistics

Nonparametric Bayesian model selection and averaging

Subhashis Ghosal, Jüri Lember, and Aad van der Vaart



We consider nonparametric Bayesian estimation of a probability density p based on a random sample of size n from this density using a hierarchical prior. The prior consists, for instance, of prior weights on the regularity of the unknown density combined with priors that are appropriate given that the density has this regularity. More generally, the hierarchy consists of prior weights on an abstract model index and a prior on a density model for each model index. We present a general theorem on the rate of contraction of the resulting posterior distribution as n → ∞, which gives conditions under which the rate of contraction is the one attached to the model that best approximates the true density of the observations. This shows that, for instance, the posterior distribution can adapt to the smoothness of the underlying density. We also study the posterior distribution of the model index, and find that under the same conditions the posterior distribution gives negligible weight to models that are bigger than the optimal one, and thus selects the optimal model or smaller models that also approximate the true density well. We apply these results to log spline density models, where we show that the prior weights on the regularity index interact with the priors on the models, making the exact rates depend in a complicated way on the priors, but also that the rate is fairly robust to specification of the prior weights.
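As a rough illustration of the hierarchy described above (prior weights on a model index, plus a prior on a density model for each index), the following sketch computes the posterior distribution of the model index in a deliberately simple setting. It is not the log spline construction studied in the paper: as an assumption made only for this example, model k is taken to be the set of histogram densities on [0, 1] with k equal-width bins, with a uniform Dirichlet prior on the bin probabilities, so that the marginal likelihood is available in closed form. The function names `log_marginal_histogram` and `posterior_model_weights` are hypothetical.

```python
import math
from collections import Counter


def log_marginal_histogram(data, k):
    """Log marginal likelihood of data in [0, 1) under model k.

    Assumption for this sketch: model k is a histogram density with k
    equal-width bins and a Dirichlet(1, ..., 1) prior on bin probabilities.
    The marginal likelihood is then
        k^n * Gamma(k) / Gamma(n + k) * prod_j Gamma(n_j + 1),
    where n_j is the number of observations in bin j.
    """
    n = len(data)
    # Bin index of each observation; min() guards against x == 1.0.
    counts = Counter(min(int(x * k), k - 1) for x in data)
    log_ml = n * math.log(k) + math.lgamma(k) - math.lgamma(n + k)
    # Empty bins contribute lgamma(1) = 0, so they can be skipped.
    log_ml += sum(math.lgamma(c + 1) for c in counts.values())
    return log_ml


def posterior_model_weights(data, prior_weights):
    """Posterior probability of each model index.

    prior_weights maps a model index k to its prior weight; the posterior
    weight is proportional to prior weight times marginal likelihood.
    """
    log_post = {
        k: math.log(w) + log_marginal_histogram(data, k)
        for k, w in prior_weights.items()
    }
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_post.values())
    unnorm = {k: math.exp(v - m) for k, v in log_post.items()}
    total = sum(unnorm.values())
    return {k: v / total for k, v in unnorm.items()}


if __name__ == "__main__":
    # Evenly spaced points stand in for a sample from the uniform density.
    sample = [(i + 0.5) / 120 for i in range(120)]
    weights = posterior_model_weights(sample, {k: 0.25 for k in (1, 2, 3, 4)})
    for k, w in sorted(weights.items()):
        print(k, round(w, 4))
```

In this toy setting the smallest model already contains the uniform density, so one would expect the posterior to put little weight on the larger models, in line with the qualitative statement in the abstract. The sketch says nothing about rates of contraction.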

Article information

Electron. J. Statist., Volume 2 (2008), 63-89.

First available in Project Euclid: 1 February 2008

Primary:
62G07: Density estimation
62G20: Asymptotic properties
62C10: Bayesian problems; characterization of Bayes procedures
Secondary:
65U05
68T05: Learning and adaptive systems [See also 68Q32, 91E40]

Keywords: Adaptation, rate of convergence, Bayes factor, rate of contraction


Ghosal, Subhashis; Lember, Jüri; van der Vaart, Aad. Nonparametric Bayesian model selection and averaging. Electron. J. Statist. 2 (2008), 63--89. doi:10.1214/07-EJS090. https://projecteuclid.org/euclid.ejs/1201877208
