The Annals of Applied Probability

Distances between nested densities and a measure of the impact of the prior in Bayesian statistics

Christophe Ley, Gesine Reinert, and Yvik Swan


In this paper, we propose tight upper and lower bounds for the Wasserstein distance between any two univariate continuous distributions with probability densities $p_{1}$ and $p_{2}$ having nested supports. These explicit bounds are expressed in terms of the derivative of the likelihood ratio $p_{1}/p_{2}$ as well as the Stein kernel $\tau_{1}$ of $p_{1}$. The method of proof relies on a new variant of Stein’s method which manipulates Stein operators.
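For orientation, the two objects named above have the following standard definitions. The abstract does not state them, so the notation here ($X_i \sim p_i$, distribution functions $F_i$, mean $\mu_1$ of $p_1$) is an assumption and may differ from the paper's:

$$
d_{\mathcal{W}}(p_1, p_2) \;=\; \sup_{h \in \mathrm{Lip}(1)} \bigl| \mathbb{E}\,h(X_1) - \mathbb{E}\,h(X_2) \bigr| \;=\; \int_{-\infty}^{\infty} \bigl| F_1(x) - F_2(x) \bigr| \, dx ,
\qquad
\tau_1(x) \;=\; \frac{1}{p_1(x)} \int_x^{\infty} (y - \mu_1)\, p_1(y)\, dy .
$$

The second equality for $d_{\mathcal{W}}$ holds for univariate distributions with finite first moments. As a quick check of the Stein kernel, a normal density with variance $\sigma^2$ has the constant kernel $\tau_1(x) = \sigma^2$.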

We give several applications of these bounds. Our main application is in Bayesian statistics: we derive explicit data-driven bounds on the Wasserstein distance between the posterior distribution based on a given prior and the no-prior posterior based solely on the sampling distribution. This is the first finite-sample result confirming the well-known fact that, with well-identified parameters and large sample sizes, reasonable choices of prior distribution have only minor effects on posterior inferences if the data are benign.
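The effect described above can be illustrated numerically in a toy setting. The sketch below does not implement the paper's bounds. It assumes a normal sampling model with known variance and a conjugate normal prior (both choices are ours, as are the function names), and computes the Wasserstein distance between the flat-prior and normal-prior posteriors directly from the univariate identity $\int |F_1 - F_2|\,dx$.

```python
import math
import numpy as np

def posteriors(xbar, n, sigma=1.0, prior_mean=0.0, prior_sd=1.0):
    """Return (mean, sd) of the flat-prior and normal-prior posteriors.

    Assumed model (illustrative only): X_i ~ N(theta, sigma^2), sigma known.
    """
    flat = (xbar, sigma / math.sqrt(n))
    precision = n / sigma**2 + 1.0 / prior_sd**2
    mean = (n * xbar / sigma**2 + prior_mean / prior_sd**2) / precision
    informed = (mean, 1.0 / math.sqrt(precision))
    return flat, informed

_erf = np.vectorize(math.erf)

def normal_cdf(x, mean, sd):
    """Normal distribution function evaluated on a NumPy array."""
    return 0.5 * (1.0 + _erf((x - mean) / (sd * math.sqrt(2.0))))

def wasserstein1(a, b, points=20001, width=10.0):
    """Approximate W1 between two normals given as (mean, sd) pairs.

    Integrates |F_a - F_b| with the trapezoidal rule on a grid that
    extends `width` standard deviations beyond both means.
    """
    (m1, s1), (m2, s2) = a, b
    pad = width * max(s1, s2)
    x = np.linspace(min(m1, m2) - pad, max(m1, m2) + pad, points)
    gap = np.abs(normal_cdf(x, m1, s1) - normal_cdf(x, m2, s2))
    dx = x[1] - x[0]
    return float(np.sum(0.5 * (gap[1:] + gap[:-1])) * dx)

if __name__ == "__main__":
    # Hold the sample mean fixed and let the sample size grow.
    for n in (10, 100, 1000):
        flat, informed = posteriors(xbar=1.0, n=n)
        print(n, wasserstein1(flat, informed))
```

With the sample mean held fixed, the printed distance should shrink as `n` grows, which is the qualitative behaviour the abstract refers to. The paper's contribution is explicit bounds on this distance for general priors, which this sketch does not attempt to reproduce.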

Article information

Ann. Appl. Probab., Volume 27, Number 1 (2017), 216-241.

Received: October 2015
Revised: April 2016
First available in Project Euclid: 6 March 2017

Primary: 60E15: Inequalities; stochastic orderings
Secondary: 62F15: Bayesian inference

Keywords: Stein’s method; Bayesian analysis; prior distribution; posterior distribution


Ley, Christophe; Reinert, Gesine; Swan, Yvik. Distances between nested densities and a measure of the impact of the prior in Bayesian statistics. Ann. Appl. Probab. 27 (2017), no. 1, 216–241. doi:10.1214/16-AAP1202.

