## Electronic Communications in Probability

### On the sub-Gaussianity of the Beta and Dirichlet distributions

#### Abstract

We obtain the optimal proxy variance for the sub-Gaussianity of the Beta distribution, thus proving upper bounds recently conjectured by Elder (2016). We provide different proof techniques for the case where the distribution is symmetric around its mean and for the non-symmetric case. The technique in the latter case relies on studying the ordinary differential equation satisfied by the Beta moment-generating function, known as the confluent hypergeometric function. As a consequence, we derive the optimal proxy variance for the Dirichlet distribution, which is apparently a novel result. We also provide a new proof of the optimal proxy variance for the Bernoulli distribution and discuss, in this context, the relation of the proxy variance to log-Sobolev inequalities and transport inequalities.
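For readers unfamiliar with the terminology, a random variable X is sub-Gaussian with proxy variance σ² when E[exp(λ(X − E[X]))] ≤ exp(λ²σ²/2) for every real λ. The sketch below is a numerical illustration only, not the proof technique of the paper. It evaluates the Beta moment-generating function as the confluent hypergeometric series 1F1(α; α+β; λ) and compares its centered logarithm with the quadratic bound for σ² = 1/(4(α+β+1)). That explicit value is an assumption taken from the full article as the conjectured upper bound; it is not stated in the abstract. The function names are hypothetical.

```python
import math


def hyp1f1_series(a, b, x, tol=1e-16, max_terms=10_000):
    """Power series of Kummer's function 1F1(a; b; x), used here for x >= 0 only."""
    term, total = 1.0, 1.0
    for k in range(max_terms):
        # Ratio of consecutive series terms: (a + k) / (b + k) * x / (k + 1).
        term *= (a + k) / (b + k) * x / (k + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total


def beta_log_mgf_centered(alpha, beta, lam):
    """log E[exp(lam * (X - E[X]))] for X ~ Beta(alpha, beta)."""
    mean = alpha / (alpha + beta)
    if lam >= 0:
        log_mgf = math.log(hyp1f1_series(alpha, alpha + beta, lam))
    else:
        # Kummer's transformation 1F1(a; b; x) = exp(x) * 1F1(b - a; b; -x)
        # keeps all series terms positive and avoids cancellation.
        log_mgf = lam + math.log(hyp1f1_series(beta, alpha + beta, -lam))
    return log_mgf - lam * mean


def conjectured_bound(alpha, beta, lam):
    """lam^2 * sigma^2 / 2 with the assumed sigma^2 = 1 / (4 * (alpha + beta + 1))."""
    return lam ** 2 / (8.0 * (alpha + beta + 1.0))


if __name__ == "__main__":
    for alpha, beta in [(1.0, 1.0), (2.0, 5.0), (0.3, 4.0)]:
        # Largest gap between the centered log-MGF and the bound on a grid of lam values.
        worst = max(
            beta_log_mgf_centered(alpha, beta, i / 10) - conjectured_bound(alpha, beta, i / 10)
            for i in range(-200, 201)
        )
        print(f"Beta({alpha}, {beta}): max(log-MGF - bound) on grid = {worst:.3e}")
```

A grid check of this kind can only fail to find a violation; it does not establish the inequality for all λ, which is what the paper proves.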

#### Article information

Source
Electron. Commun. Probab., Volume 22 (2017), paper no. 54, 14 pp.

Dates
Accepted: 4 October 2017
First available in Project Euclid: 13 October 2017

Permanent link to this document
https://projecteuclid.org/euclid.ecp/1507860211

Digital Object Identifier
doi:10.1214/17-ECP92

Mathematical Reviews number (MathSciNet)
MR3718704

Zentralblatt MATH identifier
06797807

Subjects
Primary: 97K50: Probability theory

#### Citation

Marchal, Olivier; Arbel, Julyan. On the sub-Gaussianity of the Beta and Dirichlet distributions. Electron. Commun. Probab. 22 (2017), paper no. 54, 14 pp. doi:10.1214/17-ECP92. https://projecteuclid.org/euclid.ecp/1507860211

#### References

• Arbel, J., Favaro, S., Nipoti, B., and Teh, Y. W. (2017). Bayesian nonparametric inference for discovery probabilities: credible intervals and large sample asymptotics. Statistica Sinica, 27:839–858.
• Ben-Hamou, A., Boucheron, S., and Ohannessian, M. I. (2017). Concentration inequalities in the infinite urn scheme for occupancy counts and the missing mass, with applications. Bernoulli, 23(1):249–287.
• Berend, D. and Kontorovich, A. (2013). On the concentration of the missing mass. Electronic Communications in Probability, 18(3):1–7.
• Birkhoff, G. and Rota, G.-C. (1989). Ordinary Differential Equations. John Wiley and Sons Editions.
• Bobkov, S. G. and Götze, F. (1999). Exponential integrability and transportation cost related to logarithmic Sobolev inequalities. Journal of Functional Analysis, 163(1):1–28.
• Boucheron, S., Lugosi, G., and Massart, P. (2013). Concentration inequalities: A nonasymptotic theory of independence. Oxford University Press.
• Buldygin, V. V. and Kozachenko, Y. V. (1980). Sub-Gaussian random variables. Ukrainian Mathematical Journal, 32(6):483–489.
• Buldygin, V. V. and Kozachenko, Y. V. (2000). Metric characterization of random variables and random processes, volume 188. American Mathematical Society, Providence, Rhode Island.
• Buldygin, V. V. and Moskvichova, K. (2013). The sub-Gaussian norm of a binary random variable. Theory of Probability and Mathematical Statistics, 86:33–49.
• Castillo, I. (2016). Pólya tree posterior distributions on densities. Annales de l’Institut Henri Poincaré, to appear.
• Elder, S. (2016). Bayesian adaptive data analysis guarantees from subgaussianity. arXiv preprint: arXiv:1611.00065.
• Gross, L. (1975). Logarithmic Sobolev inequalities. American Journal of Mathematics, 97(4):1061–1083.
• Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30.
• Kearns, M. and Saul, L. (1998). Large deviation methods for approximate probabilistic inference. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, pages 311–319.
• Ledoux, M. (1999). Concentration of measure and logarithmic Sobolev inequalities. Lecture Notes in Mathematics, Springer-Verlag, pages 120–216.
• McAllester, D. A. and Ortiz, L. (2003). Concentration inequalities for the missing mass and for histogram rule error. Journal of Machine Learning Research, 4:895–911.
• McAllester, D. A. and Schapire, R. E. (2000). On the convergence rate of Good-Turing estimators. In COLT, pages 1–6.
• Ordentlich, E. and Weinberger, M. J. (2005). A distribution dependent refinement of Pinsker’s inequality. IEEE Transactions on Information Theory, 51(5):1836–1840.
• Perry, A., Wein, A. S., and Bandeira, A. S. (2016). Statistical limits of spiked tensor models. arXiv preprint: arXiv:1612.07728.
• Pisier, G. (2016). Subgaussian sequences in probability and Fourier analysis. arXiv preprint: arXiv:1607.01053.
• Raginsky, M. and Sason, I. (2013). Concentration of measure inequalities in information theory, communications, and coding. Foundations and Trends in Communications and Information Theory, 10(1-2):1–246.
• Robinson, J. C. (2004). An introduction to Ordinary Differential Equations. Cambridge University Press.