Annals of Probability

The structure of low-complexity Gibbs measures on product spaces

Tim Austin



Let $K_{1},\ldots,K_{n}$ be bounded, complete, separable metric spaces. Let $\lambda_{i}$ be a Borel probability measure on $K_{i}$ for each $i$. Let $f:\prod_{i}K_{i}\longrightarrow \mathbb{R}$ be a bounded and continuous potential function, and let \begin{equation*}\mu (\mathrm{d}\boldsymbol{x})\ \propto \ \mathrm{e}^{f(\boldsymbol{x})}\lambda_{1}(\mathrm{d}x_{1})\cdots \lambda_{n}(\mathrm{d}x_{n})\end{equation*} be the associated Gibbs distribution.
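As a concrete finite sketch of this definition, the following computes $\mu$ by direct enumeration in a toy instance: each $K_i=\{-1,1\}$, each $\lambda_i$ uniform, and a quadratic potential chosen purely for illustration (the paper allows any bounded continuous $f$).

```python
import itertools, math

# Toy instance: K_i = {-1, 1}, lambda_i uniform, and an assumed quadratic
# potential f(x) = (beta/n) * sum_{i<j} x_i x_j (illustrative only).
n, beta = 4, 1.0

def f(x):
    return (beta / n) * sum(x[i] * x[j]
                            for i in range(n) for j in range(i + 1, n))

points = list(itertools.product([-1, 1], repeat=n))
# The product measure lambda_1 x ... x lambda_n gives each point mass 2^{-n}.
weights = {x: math.exp(f(x)) * 2 ** (-n) for x in points}
Z = sum(weights.values())                    # partition function of f
mu = {x: w / Z for x, w in weights.items()}  # the Gibbs distribution

assert abs(sum(mu.values()) - 1.0) < 1e-12
```

In this finite setting the proportionality in the display is resolved by dividing through by the partition function $Z=\int \mathrm{e}^{f}\,\mathrm{d}\lambda_1\cdots\mathrm{d}\lambda_n$.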

At each point $\boldsymbol{x}\in \prod_{i}K_{i}$, one can define a ‘discrete gradient’ $\nabla f(\boldsymbol{x},\cdot )$ by comparing the values of $f$ at all points which differ from $\boldsymbol{x}$ in at most one coordinate. In case $\prod_{i}K_{i}=\{-1,1\}^{n}\subset \mathbb{R}^{n}$, the discrete gradient $\nabla f(\boldsymbol{x},\cdot )$ is naturally identified with a vector in $\mathbb{R}^{n}$.
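On $\{-1,1\}^n$ this identification can be made concrete. The sketch below uses the symmetric convention $(\nabla f)_i(\boldsymbol{x}) = \tfrac12\big(f(\boldsymbol{x}^{i\leftarrow +1}) - f(\boldsymbol{x}^{i\leftarrow -1})\big)$, which is one common normalisation and an assumption here, not necessarily the paper's exact one.

```python
def discrete_gradient(f, x):
    """Identify nabla f(x, .) with a vector in R^n on the cube {-1,1}^n.

    Entry i compares f at the two points agreeing with x off coordinate i,
    using the symmetric convention (f with x_i=+1 minus f with x_i=-1)/2.
    """
    grad = []
    for i in range(len(x)):
        up = x[:i] + (1,) + x[i + 1:]
        down = x[:i] + (-1,) + x[i + 1:]
        grad.append((f(up) - f(down)) / 2)
    return grad

# Sanity check: for a linear f(x) = sum a_i x_i the gradient is exactly a.
a = [0.5, -1.0, 2.0]
f_lin = lambda x: sum(ai * xi for ai, xi in zip(a, x))
assert discrete_gradient(f_lin, (1, 1, -1)) == a
```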

This paper shows that a ‘low-complexity’ assumption on $\nabla f$ implies that $\mu $ can be approximated by a mixture of other measures, relatively few in number, and most of them close to product measures in the sense of optimal transport. This implies also an approximation to the partition function of $f$ in terms of product measures, along the lines of Chatterjee and Dembo’s theory of ‘nonlinear large deviations’.
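The partition-function approximation "in terms of product measures" can be illustrated by the Gibbs variational principle: for any probability measure $\nu$, $\log Z \ge \mathbb{E}_\nu[f] - D(\nu\,\|\,\lambda)$, and naive mean field restricts the maximisation to product measures. The sketch below checks this lower bound numerically for the same assumed quadratic potential and i.i.d. product measures (both illustrative choices, not the paper's setup).

```python
import itertools, math

n, beta = 4, 1.0

def f(x):
    return (beta / n) * sum(x[i] * x[j]
                            for i in range(n) for j in range(i + 1, n))

# Exact log partition function by enumeration (lambda_i uniform on {-1,1}).
Z = sum(math.exp(f(x)) * 2 ** (-n)
        for x in itertools.product([-1, 1], repeat=n))
logZ = math.log(Z)

def kl_pm1(m):
    # KL divergence of the {-1,1} law with mean m against the uniform law.
    p = (1 + m) / 2
    return sum(q * math.log(q / 0.5) for q in (p, 1 - p) if q > 0)

def mean_field(m):
    # Gibbs variational lower bound restricted to i.i.d. product measures
    # with common mean m: E_nu[f] - D(nu || lambda).
    Ef = (beta / n) * (n * (n - 1) / 2) * m * m  # E[x_i x_j] = m^2, i != j
    return Ef - n * kl_pm1(m)

best = max(mean_field(k / 100) for k in range(-99, 100))
assert best <= logZ + 1e-9   # the bound never exceeds the true value
```

Chatterjee and Dembo's theory gives conditions under which such product-measure expressions approximate $\log Z$ up to lower-order errors; the inequality direction shown here always holds.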

An important precedent for this work is a result of Eldan in the case $\prod_{i}K_{i}=\{-1,1\}^{n}$. Eldan’s assumption is that the discrete gradients $\nabla f(\boldsymbol{x},\cdot )$ all lie in a subset of $\mathbb{R}^{n}$ that has small Gaussian width. His proof is based on the careful construction of a diffusion in $\mathbb{R}^{n}$ which starts at the origin and ends with the desired distribution on the subset $\{-1,1\}^{n}$. Here our assumption is a more naive covering-number bound on the set of gradients $\{\nabla f(\boldsymbol{x},\cdot ):\boldsymbol{x}\in \prod_{i}K_{i}\}$, and our proof relies only on basic inequalities of information theory. As a result, it is shorter, and applies to Gibbs measures on arbitrary product spaces.
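To make the covering-number assumption tangible, note that for the assumed quadratic potential used in the sketches above, every gradient vector lies within sup-norm distance $\beta/n$ of one of the $n+1$ points $(\beta S/n)(1,\ldots,1)$, $S\in\{-n,-n+2,\ldots,n\}$: a covering set of size $n+1$ for a family of $2^n$ gradients. (The symmetric gradient convention and the specific $f$ are illustrative assumptions.)

```python
import itertools

n, beta = 6, 1.0

def grad(x):
    # Symmetric discrete gradient of f(x) = (beta/n) * sum_{i<j} x_i x_j:
    # entry i equals (beta/n) * sum_{j != i} x_j = (beta/n) * (S - x_i).
    S = sum(x)
    return tuple((beta / n) * (S - xi) for xi in x)

grads = {grad(x) for x in itertools.product([-1, 1], repeat=n)}

# Candidate covering centers: one constant vector per coordinate sum S.
centers = [tuple((beta / n) * S for _ in range(n))
           for S in range(-n, n + 1, 2)]
radius = beta / n

# Each of the 2^n gradients is within sup-norm `radius` of some center.
for g in grads:
    assert any(max(abs(gi - ci) for gi, ci in zip(g, c)) <= radius + 1e-12
               for c in centers)
```

So although the raw set of gradients is exponentially large, its covering number at a modest scale is only linear in $n$; bounds of this kind are what the paper's hypothesis quantifies.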

Article information

Ann. Probab., Volume 47, Number 6 (2019), 4002-4023.

Received: December 2018
Revised: January 2019
First available in Project Euclid: 2 December 2019


Primary: 60B99 (None of the above, but in this section)
Secondary: 60G99 (None of the above, but in this section); 82B20 (Lattice systems (Ising, dimer, Potts, etc.) and systems on graphs); 94A17 (Measures of information, entropy)

Keywords: nonlinear large deviations; Gibbs measures; gradient complexity; dual total correlation; mixtures of product measures


Austin, Tim. The structure of low-complexity Gibbs measures on product spaces. Ann. Probab. 47 (2019), no. 6, 4002--4023. doi:10.1214/19-AOP1352.



  • [1] Augeri, F. Nonlinear large deviation bounds with applications to traces of Wigner matrices and cycles counts in Erdős–Rényi graphs. Preprint. Available at arXiv:1810.01558.
  • [2] Austin, T. Multi-variate correlation and mixtures of product measures. Preprint. Available at arXiv:1809.10272.
  • [3] Austin, T. (2018). Measure concentration and the weak Pinsker property. Publ. Math. Inst. Hautes Études Sci. 128 1–119.
  • [4] Bhattacharya, B. B., Ganguly, S., Lubetzky, E. and Zhao, Y. (2017). Upper tails and independence polynomials in random graphs. Adv. Math. 319 313–347.
  • [5] Bhattacharya, B. B., Ganguly, S., Shao, X. and Zhao, Y. Upper tail large deviations for arithmetic progressions in a random set. Preprint. Available at arXiv:1605.02994.
  • [6] Chatterjee, S. (2017). Large Deviations for Random Graphs. Lecture Notes in Math. 2197. Springer, Cham.
  • [7] Chatterjee, S. and Dembo, A. (2016). Nonlinear large deviations. Adv. Math. 299 396–450.
  • [8] Cook, N. and Dembo, A. Large deviations of subgraph counts in Erdős–Rényi random graphs. Preprint. Available at arXiv:1809.11148.
  • [9] Cover, T. M. and Thomas, J. A. (2006). Elements of Information Theory, 2nd ed. Wiley-Interscience, Hoboken, NJ.
  • [10] Csiszár, I. (1975). $I$-divergence geometry of probability distributions and minimization problems. Ann. Probab. 3 146–158.
  • [11] Dembo, A. (1997). Information inequalities and concentration of measure. Ann. Probab. 25 927–939.
  • [12] Dembo, A. and Zeitouni, O. (2010). Large Deviations Techniques and Applications. Stochastic Modelling and Applied Probability 38. Springer, Berlin. Corrected reprint of the second (1998) edition.
  • [13] Dudley, R. M. (2002). Real Analysis and Probability. Cambridge Studies in Advanced Mathematics 74. Cambridge Univ. Press, Cambridge. Revised reprint of the 1989 original.
  • [14] Eldan, R. (2018). Gaussian-width gradient complexity, reverse log-Sobolev inequalities and nonlinear large deviations. Geom. Funct. Anal. 28 1548–1596.
  • [15] Eldan, R. and Gross, R. (2018). Exponential random graphs behave like mixtures of stochastic block models. Ann. Appl. Probab. 28 3698–3735.
  • [16] Eldan, R. and Gross, R. (2018). Decomposition of mean-field Gibbs distributions into product measures. Electron. J. Probab. 23 Paper No. 35, 24 pp.
  • [17] Han, T. S. (1975). Linear dependence structure of the entropy space. Inf. Control 29 337–368.
  • [18] Han, T. S. (1978). Nonnegative entropy measures of multivariate symmetric correlations. Inf. Control 36 133–156.
  • [19] Ledoux, M. and Talagrand, M. (2011). Probability in Banach Spaces: Isoperimetry and Processes. Classics in Mathematics. Springer, Berlin. Reprint of the 1991 edition.
  • [20] Lubetzky, E. and Zhao, Y. (2017). On the variational problem for upper tails in sparse random graphs. Random Structures Algorithms 50 420–436.
  • [21] Marton, K. (1986). A simple proof of the blowing-up lemma. IEEE Trans. Inform. Theory 32 445–446.
  • [22] Marton, K. (1996). Bounding $\overline{d}$-distance by informational divergence: A method to prove measure concentration. Ann. Probab. 24 857–866.
  • [23] Marton, K. (1998). Measure concentration for a class of random processes. Probab. Theory Related Fields 110 427–439.
  • [24] Pinsker, M. S. (1964). Information and Information Stability of Random Variables and Processes. Translated and Edited by Amiel Feinstein. Holden-Day, San Francisco, CA.
  • [25] Samson, P.-M. (2000). Concentration of measure inequalities for Markov chains and $\Phi$-mixing processes. Ann. Probab. 28 416–461.
  • [26] Talagrand, M. (1995). Concentration of measure and isoperimetric inequalities in product spaces. Publ. Math. Inst. Hautes Études Sci. 81 73–205.
  • [27] Talagrand, M. (1996). Transportation cost for Gaussian and other product measures. Geom. Funct. Anal. 6 587–600.
  • [28] Yan, J. Nonlinear large deviations: Beyond the hypercube. Preprint. Available at arXiv:1703.08887.