Electronic Communications in Probability

On the sub-Gaussianity of the Beta and Dirichlet distributions

Olivier Marchal and Julyan Arbel

Full-text: Open access

Abstract

We obtain the optimal proxy variance for the sub-Gaussianity of the Beta distribution, thus proving upper bounds recently conjectured by Elder (2016). We provide different proof techniques for the symmetrical (around its mean) case and the non-symmetrical case. The technique in the latter case relies on studying the ordinary differential equation satisfied by the Beta moment-generating function, known as the confluent hypergeometric function. As a consequence, we derive the optimal proxy variance for the Dirichlet distribution, which is apparently a novel result. We also provide a new proof of the optimal proxy variance for the Bernoulli distribution, and discuss in this context the relation of the proxy variance to log-Sobolev inequalities and transport inequalities.
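
For readers who want to experiment with the quantities named in the abstract, the following is a minimal numerical sketch, not the authors' proof technique. It assumes the standard definition (not spelled out in the abstract) that X is sub-Gaussian with proxy variance s2 when E[exp(lam (X - E X))] <= exp(lam^2 s2 / 2) for all real lam, so that the optimal proxy variance is the supremum over lam != 0 of 2 log E[exp(lam (X - E X))] / lam^2. It uses the fact that the Beta(a, b) moment-generating function is the confluent hypergeometric function 1F1(a; a+b; lam), summed here by its power series. The function names, the grid search, and the defaults `lam_max` and `steps` are illustrative assumptions; the grid only approximates the supremum from below.

```python
import math


def kummer_series(a, c, x, tol=1e-17, max_terms=10_000):
    """Sum 1F1(a; c; x) = sum_k (a)_k / (c)_k * x**k / k! for x >= 0."""
    term, total = 1.0, 1.0
    for k in range(max_terms):
        # Ratio of consecutive terms of the Kummer series.
        term *= (a + k) / (c + k) * x / (k + 1)
        total += term
        if term <= tol * total:
            break
    return total


def beta_log_mgf(a, b, lam):
    """log E[exp(lam * X)] for X ~ Beta(a, b)."""
    if lam >= 0:
        return math.log(kummer_series(a, a + b, lam))
    # 1 - X ~ Beta(b, a), so E[exp(lam X)] = exp(lam) * E[exp(-lam (1 - X))].
    # This keeps every term of the series positive (no cancellation).
    return lam + math.log(kummer_series(b, a + b, -lam))


def proxy_ratio(a, b, lam):
    """2 * log E[exp(lam * (X - mean))] / lam**2 for X ~ Beta(a, b)."""
    mean = a / (a + b)
    return 2.0 * (beta_log_mgf(a, b, lam) - lam * mean) / lam ** 2


def estimate_optimal_proxy_variance(a, b, lam_max=40.0, steps=800):
    """Grid-search lower estimate of the optimal proxy variance (assumed grid)."""
    step = lam_max / steps
    grid = [sign * step * i for i in range(1, steps + 1) for sign in (-1.0, 1.0)]
    return max(proxy_ratio(a, b, lam) for lam in grid)


if __name__ == "__main__":
    for a, b in [(2.0, 2.0), (1.0, 3.0)]:
        variance = a * b / ((a + b) ** 2 * (a + b + 1))
        upper = 1.0 / (4.0 * (a + b + 1))
        estimate = estimate_optimal_proxy_variance(a, b)
        print(f"Beta({a}, {b}): variance={variance:.6f}, "
              f"estimate={estimate:.6f}, 1/(4(a+b+1))={upper:.6f}")
```

In this sketch the symmetric case Beta(2, 2) should give an estimate very close to its variance 0.05, which coincides with 1/(4(a+b+1)), the upper bound that, to our understanding, is the one conjectured by Elder (2016). For the non-symmetric case Beta(1, 3) the estimate should fall strictly between the variance and that bound.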

Article information

Source
Electron. Commun. Probab. Volume 22 (2017), paper no. 54, 14 pp.

Dates
Received: 19 June 2017
Accepted: 4 October 2017
First available in Project Euclid: 13 October 2017

Permanent link to this document
https://projecteuclid.org/euclid.ecp/1507860211

Digital Object Identifier
doi:10.1214/17-ECP92

Zentralblatt MATH identifier
06797807

Subjects
Primary: 97K50: Probability theory

Keywords
sub-Gaussian; Beta distribution; Dirichlet distribution; concentration inequality; transport inequality; log-Sobolev inequality

Rights
Creative Commons Attribution 4.0 International License.

Citation

Marchal, Olivier; Arbel, Julyan. On the sub-Gaussianity of the Beta and Dirichlet distributions. Electron. Commun. Probab. 22 (2017), paper no. 54, 14 pp. doi:10.1214/17-ECP92. https://projecteuclid.org/euclid.ecp/1507860211



References

  • Arbel, J., Favaro, S., Nipoti, B., and Teh, Y. W. (2017). Bayesian nonparametric inference for discovery probabilities: credible intervals and large sample asymptotics. Statistica Sinica, 27:839–858.
  • Ben-Hamou, A., Boucheron, S., and Ohannessian, M. I. (2017). Concentration inequalities in the infinite urn scheme for occupancy counts and the missing mass, with applications. Bernoulli, 23(1):249–287.
  • Berend, D. and Kontorovich, A. (2013). On the concentration of the missing mass. Electronic Communications in Probability, 18(3):1–7.
  • Birkhoff, G. and Rota, G.-C. (1989). Ordinary Differential Equations. John Wiley and Sons Editions.
  • Bobkov, S. G. and Götze, F. (1999). Exponential integrability and transportation cost related to logarithmic Sobolev inequalities. Journal of Functional Analysis, 163(1):1–28.
  • Boucheron, S., Lugosi, G., and Massart, P. (2013). Concentration inequalities: A nonasymptotic theory of independence. Oxford University Press.
  • Buldygin, V. V. and Kozachenko, Y. V. (1980). Sub-Gaussian random variables. Ukrainian Mathematical Journal, 32(6):483–489.
  • Buldygin, V. V. and Kozachenko, Y. V. (2000). Metric characterization of random variables and random processes, volume 188. American Mathematical Society, Providence, Rhode Island.
  • Buldygin, V. V. and Moskvichova, K. (2013). The sub-Gaussian norm of a binary random variable. Theory of Probability and Mathematical Statistics, 86:33–49.
  • Castillo, I. (2016). Pólya tree posterior distributions on densities. Annales de l’Institut Henri Poincaré, to appear.
  • Elder, S. (2016). Bayesian adaptive data analysis guarantees from subgaussianity. arXiv preprint arXiv:1611.00065.
  • Gross, L. (1975). Logarithmic Sobolev inequalities. American Journal of Mathematics, 97(4):1061–1083.
  • Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30.
  • Kearns, M. and Saul, L. (1998). Large deviation methods for approximate probabilistic inference. In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, pages 311–319.
  • Ledoux, M. (1999). Concentration of measure and logarithmic Sobolev inequalities. Lecture Notes in Mathematics, Springer Verlag, pages 120–216.
  • McAllester, D. A. and Ortiz, L. (2003). Concentration inequalities for the missing mass and for histogram rule error. Journal of Machine Learning Research, 4:895–911.
  • McAllester, D. A. and Schapire, R. E. (2000). On the convergence rate of Good-Turing estimators. In COLT, pages 1–6.
  • Ordentlich, E. and Weinberger, M. J. (2005). A distribution dependent refinement of Pinsker’s inequality. IEEE Transactions on Information Theory, 51(5):1836–1840.
  • Perry, A., Wein, A. S., and Bandeira, A. S. (2016). Statistical limits of spiked tensor models. arXiv preprint arXiv:1612.07728.
  • Pisier, G. (2016). Subgaussian sequences in probability and Fourier analysis. arXiv preprint arXiv:1607.01053.
  • Raginsky, M. and Sason, I. (2013). Concentration of measure inequalities in information theory, communications, and coding. Foundations and Trends in Communications and Information Theory, 10(1-2):1–246.
  • Robinson, J. C. (2004). An introduction to Ordinary Differential Equations. Cambridge University Press.