## Bernoulli

• Volume 22, Number 3 (2016), 1535-1571.

### Borrowing strength in hierarchical Bayes: Posterior concentration of the Dirichlet base measure

XuanLong Nguyen

#### Abstract

This paper studies posterior concentration behavior of the base probability measure of a Dirichlet measure, given observations associated with the sampled Dirichlet processes, as the number of observations tends to infinity. The base measure itself is endowed with another Dirichlet prior, a construction known as the hierarchical Dirichlet process (Teh et al. [J. Amer. Statist. Assoc. 101 (2006) 1566–1581]). Convergence rates are established in transportation distances (i.e., Wasserstein metrics) under various conditions on the geometry of the support of the true base measure. As a consequence of the theory, we demonstrate the benefit of “borrowing strength” in the inference of multiple groups of data, a powerful insight often invoked to motivate hierarchical modeling. In certain settings, the gain in efficiency due to the latent hierarchy can be dramatic, improving from a standard nonparametric rate to a parametric rate of convergence. Tools developed include transportation distances for nonparametric Bayesian hierarchies of random measures, the existence of tests for Dirichlet measures, and geometric properties of the support of Dirichlet measures.
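For orientation, the hierarchy and the metric named in the abstract can be written out roughly as follows. This is only a sketch in the standard hierarchical Dirichlet process notation; the symbols $\gamma$, $\alpha$, $H$, $m$, $n$, the kernel $f$, and the coupling set $\mathcal{T}(G,G')$ are assumptions introduced here for illustration and are not taken from the paper itself.

```latex
% Hierarchical Dirichlet process (notation assumed for illustration)
\begin{align*}
G \mid \gamma, H &\sim \mathrm{DP}(\gamma, H)
  && \text{base measure with its own Dirichlet prior} \\
Q_1,\dots,Q_m \mid \alpha, G &\overset{\text{iid}}{\sim} \mathrm{DP}(\alpha, G)
  && \text{one random measure per group} \\
\theta_{ij} \mid Q_i &\overset{\text{iid}}{\sim} Q_i, \quad j = 1,\dots,n
  && \text{latent parameters in group } i \\
X_{ij} \mid \theta_{ij} &\sim f(\cdot \mid \theta_{ij})
  && \text{observations}
\end{align*}

% Wasserstein (transportation) distance of order r >= 1 between two
% probability measures G and G' on a common metric space, where
% \mathcal{T}(G, G') denotes the set of couplings with marginals G and G'
\[
W_r(G, G') \;=\;
\Bigl( \inf_{\kappa \in \mathcal{T}(G, G')}
  \int \|\theta - \theta'\|^{r} \, d\kappa(\theta, \theta') \Bigr)^{1/r}
\]
```

In this notation, posterior concentration of the base measure means that the posterior of $G$ places vanishing mass outside shrinking $W_r$-neighborhoods of the true base measure as the number of observations grows.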

#### Article information

Source
Bernoulli, Volume 22, Number 3 (2016), 1535-1571.

Dates
Revised: October 2014
First available in Project Euclid: 16 March 2016

https://projecteuclid.org/euclid.bj/1458132991

Digital Object Identifier
doi:10.3150/15-BEJ703

Mathematical Reviews number (MathSciNet)
MR3474825

Zentralblatt MATH identifier
1360.62103

#### Citation

Nguyen, XuanLong. Borrowing strength in hierarchical Bayes: Posterior concentration of the Dirichlet base measure. Bernoulli 22 (2016), no. 3, 1535–1571. doi:10.3150/15-BEJ703. https://projecteuclid.org/euclid.bj/1458132991

#### References

• [1] Barron, A., Schervish, M.J. and Wasserman, L. (1999). The consistency of posterior distributions in nonparametric problems. Ann. Statist. 27 536–561.
• [2] Berger, J.O. (1993). Statistical Decision Theory and Bayesian Analysis. Springer Series in Statistics. New York: Springer.
• [3] Blackwell, D. and MacQueen, J.B. (1973). Ferguson distributions via Pólya urn schemes. Ann. Statist. 1 353–355.
• [4] Carroll, R.J. and Hall, P. (1988). Optimal rates of convergence for deconvolving a density. J. Amer. Statist. Assoc. 83 1184–1186.
• [5] Doss, H. and Sellke, T. (1982). The tails of probabilities chosen from a Dirichlet prior. Ann. Statist. 10 1302–1305.
• [6] Falconer, K.J. (1986). The Geometry of Fractal Sets. Cambridge Tracts in Mathematics 85. Cambridge: Cambridge Univ. Press.
• [7] Fan, J. (1991). On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist. 19 1257–1272.
• [8] Ferguson, T.S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Statist. 1 209–230.
• [9] Garcia, I., Molter, U. and Scotto, R. (2007). Dimension functions of Cantor sets. Proc. Amer. Math. Soc. 135 3151–3161.
• [10] Gassiat, E. and Rousseau, J. (2014). About the posterior distribution in hidden Markov models with unknown number of states. Bernoulli 20 2039–2075.
• [11] Gassiat, E. and van Handel, R. (2014). The local geometry of finite mixtures. Trans. Amer. Math. Soc. 366 1047–1072.
• [12] Ghosal, S. (2010). The Dirichlet process, related priors and posterior asymptotics. In Bayesian Nonparametrics. Camb. Ser. Stat. Probab. Math. 35–79. Cambridge: Cambridge Univ. Press.
• [13] Ghosal, S., Ghosh, J.K. and van der Vaart, A.W. (2000). Convergence rates of posterior distributions. Ann. Statist. 28 500–531.
• [14] Ghosh, J.K. and Ramamoorthi, R.V. (2003). Bayesian Nonparametrics. Springer Series in Statistics. New York: Springer.
• [15] Giné, E. and Nickl, R. (2011). Rates of contraction for posterior distributions in $L^{r}$-metrics, $1\leq r\leq\infty$. Ann. Statist. 39 2883–2911.
• [16] Hjort, N.L., Holmes, C., Müller, P. and Walker, S.G., eds. (2010). Bayesian Nonparametrics. Cambridge Series in Statistical and Probabilistic Mathematics 28. Cambridge: Cambridge Univ. Press.
• [17] Korwar, R.M. and Hollander, M. (1973). Contributions to the theory of Dirichlet processes. Ann. Probab. 1 705–711.
• [18] Lehmann, E.L. and Casella, G. (1998). Theory of Point Estimation, 2nd ed. Springer Texts in Statistics. New York: Springer.
• [19] Nguyen, X. (2013). Convergence of latent mixing measures in finite and infinite mixture models. Ann. Statist. 41 370–400.
• [20] Nguyen, X. (2015). Posterior contraction of the population polytope in finite admixture models. Bernoulli 21 618–646.
• [21] Nguyen, X. (2015). Supplement to “Borrowing strength in hierarchical Bayes: Posterior concentration of the Dirichlet base measure.” DOI:10.3150/15-BEJ703SUPP.
• [22] Rousseau, J. and Mengersen, K. (2011). Asymptotic behaviour of the posterior distribution in overfitted mixture models. J. R. Stat. Soc. Ser. B. Stat. Methodol. 73 689–710.
• [23] Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Statist. Sinica 4 639–650.
• [24] Shen, X. and Wasserman, L. (2001). Rates of convergence of posterior distributions. Ann. Statist. 29 687–714.
• [25] Teh, Y.W. and Jordan, M.I. (2010). Hierarchical Bayesian nonparametric models with applications. In Bayesian Nonparametrics. Camb. Ser. Stat. Probab. Math. 158–207. Cambridge: Cambridge Univ. Press.
• [26] Teh, Y.W., Jordan, M.I., Beal, M.J. and Blei, D.M. (2006). Hierarchical Dirichlet processes. J. Amer. Statist. Assoc. 101 1566–1581.
• [27] van der Vaart, A.W. and van Zanten, J.H. (2008). Rates of contraction of posterior distributions based on Gaussian process priors. Ann. Statist. 36 1435–1463.
• [28] van der Vaart, A.W. and Wellner, J.A. (1996). Weak Convergence and Empirical Processes. Springer Series in Statistics. New York: Springer.
• [29] Villani, C. (2009). Optimal Transport: Old and New. Grundlehren der Mathematischen Wissenschaften 338. Berlin: Springer.
• [30] Walker, S. (2004). New approaches to Bayesian consistency. Ann. Statist. 32 2028–2043.
• [31] Walker, S.G., Lijoi, A. and Prünster, I. (2007). On rates of convergence for posterior distributions in infinite-dimensional models. Ann. Statist. 35 738–746.
• [32] Wong, W.H. and Shen, X. (1995). Probability inequalities for likelihood ratios and convergence rates of sieve MLEs. Ann. Statist. 23 339–362.
• [33] Zhang, C.-H. (1990). Fourier methods for estimating mixing densities and distributions. Ann. Statist. 18 806–831.