Bounds on the covariance matrix of the Sherrington–Kirkpatrick model

We consider the Sherrington–Kirkpatrick model with no external field and inverse temperature β < 1 and prove that the expected operator norm of the covariance matrix of the Gibbs measure is bounded by a constant depending only on β. This answers an open question raised by Talagrand, who proved a bound of $C(\beta)(\log n)^8$. Our result follows by establishing an approximate formula for the covariance matrix, which we obtain by differentiating the TAP equations and then optimally controlling the associated error terms. We complement this result by showing diverging lower bounds on the operator norm, both at the critical and low temperatures.


Introduction and main result
We consider the Sherrington–Kirkpatrick (SK) model of spin glasses, a Gibbs distribution over the hypercube $\{-1,+1\}^n$ given by the expression

$$\mu(\sigma) = \frac{1}{Z}\exp\Big(\frac{\beta}{\sqrt{n}}\sum_{i<j} g_{ij}\sigma_i\sigma_j\Big), \qquad \sigma \in \{-1,+1\}^n,$$

where β ≥ 0 is the inverse temperature parameter, $Z$ is the normalizing constant, and $(g_{ij})_{i<j}$ are the random coupling coefficients, assumed to be i.i.d. N(0, 1) random variables. We are interested in the behavior of the n × n covariance matrix of this probability measure,

$$\mathrm{cov}(\mu) = \big(\langle\sigma_i\sigma_j\rangle - \langle\sigma_i\rangle\langle\sigma_j\rangle\big)_{i,j=1}^n,$$

where $\langle\cdot\rangle$ denotes the average with respect to µ.

It is expected that the operator norm of cov(µ) is of constant order in n whenever the model is at high temperature, i.e., for all β < 1, and that it must diverge at the critical and low temperatures β ≥ 1. Talagrand proved that for all β < 1 there exists C(β) < ∞ such that

$$\mathbb{E}\,\|\mathrm{cov}(\mu)\|_{\mathrm{op}} \le C(\beta)(\log n)^8,$$

and conjectured that the logarithmic term can be removed entirely [Tal11, Section 11.5]. His proof relies on the moment method and bounds the expectation of the trace of large powers of the covariance matrix, a method known to be loose by a logarithmic factor for random matrices with i.i.d. entries.

Recently, Bauerschmidt and Bodineau [BB19] proved a decomposition theorem for Ising measures into a log-concave mixture of product measures, provided that their interaction matrix J is positive semidefinite and satisfies the operator norm bound $\|J\|_{\mathrm{op}} < 1$. The authors used this decomposition to prove a log-Sobolev inequality for such measures with respect to a notion of discrete gradient. In the special case of the SK model, their decomposition implies that $\mathbb{E}\,\|\mathrm{cov}(\mu)\|_{\mathrm{op}}$ is bounded for all β < 1/4. See also the work of Eldan, Koehler and Zeitouni [EKZ21], who proved a spectral gap inequality for Glauber dynamics under the same conditions. Such functional inequalities are expected to hold in the entire high-temperature regime; this is, however, still an open problem.

In this paper we show boundedness of the operator norm for all β < 1 (Theorem 1.1): the expectation $\mathbb{E}\,\|\mathrm{cov}(\mu)\|_{\mathrm{op}}$ is bounded by a constant depending only on β. Extensions to a non-zero external field, possibly relevant to the problem of proving Poincaré and log-Sobolev inequalities for µ for all β < 1, are discussed in Section 2.
, where c > 0 is an absolute constant, for all n ≥ 2.
We note that the second bullet in the above theorem lower bounds the second moment of $\|\mathrm{cov}(\mu)\|_{\mathrm{op}}$, which is a weaker statement than a bound on the first moment. In Proposition 4.3 we prove a similar lower bound on the first moment, albeit not at β = 1 but at a temperature approaching criticality. The arguments used to prove Theorem 1.2 follow more or less directly from known results, so we delay their exposition to Section 4 in favor of the high-temperature result.
Our approach to showing Theorem 1.1 is to first establish a TAP equation for the two-point correlations $\langle\sigma_i\sigma_j\rangle$ with an optimal error bound of $1/n^2$. This is the content of Theorem 1.3 (see Eq. (1.3)); in particular, it yields Eq. (1.4) for the entries with i ≠ j.

An analogue of Eq. (1.4) with a weaker error bound was recently proved by Adhikari, Brennecke, von Soosten and Yau [ABvSY21] for a nonzero Gaussian external field, up to an inverse temperature $\beta_0 = \log 2$, using what the authors call the dynamical approach to the TAP equations. The error bound they prove is of the form $C(\beta,\varepsilon)/n^{1+\varepsilon}$ for ε sufficiently small. This bound is unfortunately not enough to imply Theorem 1.1; a bound of order $1/n^2$ seems necessary for the conclusion to follow. We elaborate on a generalization of this result to nonzero external field in Section 2 below.
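As a numerical illustration only (it plays no role in the proofs), the Gibbs measure and its covariance matrix can be computed by exact enumeration when n is small. The sketch below assumes the Hamiltonian normalization $\frac{\beta}{\sqrt{n}}\sum_{i<j} g_{ij}\sigma_i\sigma_j$ and zero external field; the function names `sk_gibbs` and `covariance` are hypothetical helpers introduced here, and such small n says nothing about the asymptotic regime studied in the paper.

```python
import itertools
import numpy as np


def sk_gibbs(n, beta, rng):
    """Enumerate the hypercube and return (configurations, Gibbs weights)
    for one sample of the disorder. Assumed normalization:
    H(sigma) = beta / sqrt(n) * sum_{i<j} g_ij sigma_i sigma_j."""
    g = np.triu(rng.standard_normal((n, n)), k=1)  # couplings g_ij for i < j
    sigma = np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)
    energy = beta / np.sqrt(n) * np.einsum("si,ij,sj->s", sigma, g, sigma)
    w = np.exp(energy - energy.max())  # subtract the max for numerical stability
    return sigma, w / w.sum()


def covariance(sigma, weights):
    """Covariance matrix <s_i s_j> - <s_i><s_j> under the given weights."""
    mean = weights @ sigma
    second = sigma.T @ (weights[:, None] * sigma)
    return second - np.outer(mean, mean)


rng = np.random.default_rng(0)
n, beta = 10, 0.5  # illustrative values, chosen arbitrarily
sigma, weights = sk_gibbs(n, beta, rng)
cov = covariance(sigma, weights)
op_norm = np.linalg.norm(cov, 2)  # largest singular value = operator norm
print(f"n={n}, beta={beta}, operator norm of cov(mu) = {op_norm:.4f}")
```

Averaging `op_norm` over many disorder samples gives a Monte Carlo estimate of the expected operator norm for that fixed n; the enumeration cost grows as $2^n$, so this is only feasible for very small systems.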

Nonzero external field and related work
The result of Theorem 1.3 is inspired by the following heuristic. Let us introduce an external field $y = (y_i)_{i=1}^n \in \mathbb{R}^n$ to the Gibbs measure:

$$\mu_y(\sigma) = \frac{1}{Z(y)}\exp\Big(\frac{\beta}{\sqrt{n}}\sum_{i<j} g_{ij}\sigma_i\sigma_j + \sum_{i=1}^n y_i\sigma_i\Big).$$

For a "typical" external field, the log-partition function of $\mu_y$ is expected to have a TAP representation, Eq. (2.2), as a maximum of a TAP free energy $F_{\mathrm{TAP}}$. This representation was recently proved for Gaussian external fields by Chen, Panchenko and Subag [CPS18]. Taking two derivatives with respect to y on both sides of Eq. (2.2), we expect the approximation (2.5) for $\mathrm{cov}(\mu_y)$, where m is a maximizer in (2.2) and D(m) is the diagonal matrix with entries $D(m)_{ii} = 1/(1 - m_i^2)$. Taking y = 0, we have m = 0 and obtain the approximation (2.6).

Theorem 1.3 makes the above approximation precise in the sense of a bounded Frobenius norm, once the inverse is eliminated from the right-hand side. This operation allows us to perform Gaussian integration by parts with respect to the disorder random variables $g_{ij}$, which creates various overlap terms between independent replicas from the Gibbs measure µ. Theorem 1.3 is then proved by exploiting known asymptotics for these overlaps in the high-temperature regime.
Similarly to the zero external field case, one can attempt to make the approximation (2.5) precise by proving a corresponding bound for β small enough, e.g., below the AT line when y is Gaussian, where $m = (\langle\sigma_i\rangle)_{i=1}^n$ is the mean vector of $\mu_y$. Boundedness of $\mathbb{E}\,\|\mathrm{cov}(\mu_y)\|_{\mathrm{op}}$ would then follow if one can show that $\lambda_{\max}\big(\nabla^2 F_{\mathrm{TAP}}(m)\big) \le -\varepsilon$ for some ε > 0 with probability at least 1 − O(1/n) (see Eq. (1.10)). Celentano [Cel22] recently showed that the TAP free energy $F_{\mathrm{TAP}}$ is locally strongly concave around one of its stationary points with probability $1 - o_n(1)$, by applying a Gaussian comparison theorem carefully conditioned on a sequence of sigma-fields produced by an Approximate Message Passing (AMP) iteration. This stationary point should presumably be close to the mean vector $\langle\sigma\rangle$; see the related works [CT21, CFM21]. The standard theory of AMP used in that paper does not yield any quantitative control on the probability of convergence, and this approach seems to fall short of obtaining the O(1/n) rate needed here.
Following the technique developed in Eldan and Shamir [ES22], which was later generalized in Chen and Eldan [CE22], a bound on the operator norm of $\mathrm{cov}(\mu_y)$ for an external field y given by Eldan's stochastic localization process can be used to prove Poincaré and log-Sobolev inequalities for µ. Our result can be seen as a small step within this larger scope.

Proof of Theorem 1.3
We will in fact evaluate the left-hand side of Eq. (1.3) explicitly, thereby showing that it is truly of constant order. We let P denote the matrix for which the left-hand side of Eq. (1.3) reads $\mathbb{E}\,\|P - I\|_F^2$. We then write

$$\|P - I\|_F^2 = \|P\|_F^2 - 2\,\mathrm{Tr}(P) + n.$$

We treat the above expression term by term and show that $\mathbb{E}\,\|P\|_F^2$ and $\mathbb{E}\,\mathrm{Tr}(P)$ have matching leading terms, so that the terms diverging in n cancel. We first analyze the trace term and then turn to the norm term, which is more delicate.

Let us first record the following simple lemma (Lemma 3.1) for future reference; it computes the derivative of a Gibbs average $\langle f\rangle$ with respect to a coupling $g_{k\ell}$.

Proof. The statement is trivially true for k = ℓ since the Hamiltonian has no dependence on $g_{kk}$, and $\sigma_k^2 = 1$. Assume k ≠ ℓ. Writing $\langle f\rangle$ as a quotient, the product rule implies that the first term can be interpreted as the Gibbs average of $f\sigma_k\sigma_\ell$, while the second term, obtained by differentiating the partition function, yields an independent Gibbs average of $\sigma_k\sigma_\ell$.
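The differentiation rule described in the proof above can be checked numerically. The sketch below assumes that the rule takes the standard form $\partial_{g_{k\ell}}\langle f\rangle = \frac{\beta}{\sqrt{n}}\big(\langle f\sigma_k\sigma_\ell\rangle - \langle f\rangle\langle\sigma_k\sigma_\ell\rangle\big)$ for k < ℓ, which is what the quotient-rule argument gives under the Hamiltonian normalization $\frac{\beta}{\sqrt{n}}\sum_{i<j} g_{ij}\sigma_i\sigma_j$. The test function f, the indices, and the helper `gibbs_average` are hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def gibbs_average(f_vals, sigma, g, beta):
    """Gibbs average <f> by exact enumeration, with the assumed Hamiltonian
    H(sigma) = beta / sqrt(n) * sum_{i<j} g_ij sigma_i sigma_j."""
    n = sigma.shape[1]
    energy = beta / np.sqrt(n) * np.einsum("si,ij,sj->s", sigma, g, sigma)
    w = np.exp(energy - energy.max())
    return (w @ f_vals) / w.sum()


n, beta = 6, 0.7
k, l = 1, 4  # coupling g_kl with k < l (arbitrary choice)
rng = np.random.default_rng(1)
g = np.triu(rng.standard_normal((n, n)), k=1)
sigma = np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)

f_vals = sigma[:, 0] * sigma[:, 2]   # arbitrary test function f = s_0 s_2
pair = sigma[:, k] * sigma[:, l]     # s_k s_l

# Right-hand side: (beta / sqrt(n)) * (<f s_k s_l> - <f><s_k s_l>)
rhs = beta / np.sqrt(n) * (
    gibbs_average(f_vals * pair, sigma, g, beta)
    - gibbs_average(f_vals, sigma, g, beta) * gibbs_average(pair, sigma, g, beta)
)

# Left-hand side: central finite difference of <f> in the coupling g_kl
eps = 1e-5
g_plus, g_minus = g.copy(), g.copy()
g_plus[k, l] += eps
g_minus[k, l] -= eps
lhs = (
    gibbs_average(f_vals, sigma, g_plus, beta)
    - gibbs_average(f_vals, sigma, g_minus, beta)
) / (2 * eps)

print(f"finite difference: {lhs:.8f}, formula: {rhs:.8f}")
```

The two numbers should agree up to the finite-difference error. The second term of the formula is a product of two Gibbs averages, which is how a second, independent replica enters after Gaussian integration by parts.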
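For ℓ independent replicas $\sigma^1, \dots, \sigma^\ell$ drawn from µ, we write $R_{a,b} = \frac{1}{n}\sum_{i=1}^n \sigma^a_i\sigma^b_i$ for the pairwise overlap between $\sigma^a$ and $\sigma^b$.

The trace term: Using Gaussian integration by parts and Lemma 3.1, we express $\mathbb{E}\,\mathrm{Tr}(P)$ in terms of overlaps. From [Tal11, Theorem 11.5.4], the asymptotics of these overlaps are known for β < 1, which allows us to evaluate the resulting expression; this gives Eq. (3.1).

The norm term: We now calculate $\mathbb{E}\,\|P\|_F^2$. We expand $\|P\|_F^2$ as a sum of three terms, I + II + III, the last of which is a sum of terms of the form $g_{ik}g_{i\ell}\langle\sigma_j\sigma_k\rangle\langle\sigma_j\sigma_\ell\rangle$.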
We deal with these terms in order. The first term is computed directly. As for the second term, we apply Lemma 3.1 and sum the resulting expression over i, j, k.

We now turn to the third term. We split the sum into a diagonal and an off-diagonal part, Eq. (3.4). For the diagonal term in Eq. (3.4), taking expectations, applying Gaussian integration by parts, and summing over i, j yields Eq. (3.5). For the off-diagonal term in Eq. (3.4), we use that the random variables $g_{ij}$ are independent and obtain an expression that we call $O_{ijk\ell}$. We sum this expression over all i, j, k, ℓ and only subtract off the diagonal terms (corresponding to k = ℓ) afterwards. From this we deduce a formula for the norm term. Finally, combining this formula with Eq. (3.1) for $\mathbb{E}\,\mathrm{Tr}(P)$, we obtain Eq. (3.17).

Lower bounds at the critical temperature and low temperatures
In this section, we investigate the tightness of our results. In particular, we show that the expected operator norm necessarily diverges as β → 1. In the low-temperature regime where β > 1, we provide a lower bound on the expected operator norm of order $\sqrt{n}$. Finally, we consider the behavior of the operator norm near and at the critical temperature β = 1.
Below, we write $\mathrm{cov}(\mu_{\beta,n})$ to emphasize the dependence on the inverse temperature β and n.

High, near critical temperature
We first provide a simple argument using well-known limit theorems on the overlap $R_{12}$ to show that the expected operator norm necessarily diverges as $\beta \to 1^-$ (Proposition 4.1).

Proof. We simply lower bound the operator norm by the Frobenius norm divided by $\sqrt{n}$, which can be expressed through the overlap $R_{12}$. From Talagrand [Tal11, Theorem 11.5.4], the random variable $(n(1-\beta^2))^{1/2}\cdot R_{12}$ converges in distribution as n → ∞ to a standard Gaussian under the measure $\mathbb{E}\langle\cdot\rangle$. By the Cauchy-Schwarz inequality and the Portmanteau theorem, we obtain a lower bound in terms of $\mathbb{E}|z|$, where z is a standard normal random variable. The inequality follows from the standard fact $\mathbb{E}|z| = \sqrt{2/\pi}$.
Note that the previous result does not appear to immediately have any bearing on the behavior at β = 1 due to the subtlety of interchanging limits. We treat the case β = 1 below using a similar analysis and leveraging some of the few known results in this regime.
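The lower-bound mechanism used in this section can be illustrated numerically. At zero external field the means $\langle\sigma_i\rangle$ vanish by symmetry, so $\|\mathrm{cov}(\mu)\|_F^2 = \sum_{i,j}\langle\sigma_i\sigma_j\rangle^2 = n^2\langle R_{12}^2\rangle$, and hence $\|\mathrm{cov}(\mu)\|_{\mathrm{op}} \ge \|\mathrm{cov}(\mu)\|_F/\sqrt{n} = \sqrt{n}\,\langle R_{12}^2\rangle^{1/2} \ge \sqrt{n}\,\langle|R_{12}|\rangle$, the last step being Cauchy-Schwarz. The sketch below checks this chain by exact enumeration for one disorder sample, assuming the Hamiltonian normalization $\frac{\beta}{\sqrt{n}}\sum_{i<j} g_{ij}\sigma_i\sigma_j$; the values of n and β are arbitrary.

```python
import itertools
import numpy as np

n, beta = 8, 0.9  # illustrative values
rng = np.random.default_rng(2)
g = np.triu(rng.standard_normal((n, n)), k=1)  # couplings g_ij, i < j
sigma = np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)

# Gibbs weights under the assumed normalization beta / sqrt(n)
energy = beta / np.sqrt(n) * np.einsum("si,ij,sj->s", sigma, g, sigma)
w = np.exp(energy - energy.max())
w /= w.sum()

# Zero field: <s_i> = 0 by the symmetry sigma -> -sigma, so the covariance
# matrix coincides with the matrix of second moments <s_i s_j>.
cov = sigma.T @ (w[:, None] * sigma)

# Overlap R_12 between every pair of configurations (two independent replicas)
overlaps = sigma @ sigma.T / n
r2 = w @ (overlaps ** 2) @ w       # <R_12^2>
r_abs = w @ np.abs(overlaps) @ w   # <|R_12|>

fro = np.linalg.norm(cov, "fro")
op = np.linalg.norm(cov, 2)

print(f"||cov||_F^2 = {fro**2:.6f},  n^2 <R_12^2> = {n**2 * r2:.6f}")
print(f"||cov||_op = {op:.6f} >= ||cov||_F / sqrt(n) = {fro / np.sqrt(n):.6f}"
      f" >= sqrt(n) <|R_12|> = {np.sqrt(n) * r_abs:.6f}")
```

This only verifies the deterministic inequalities for a single sample; the divergence as $\beta \to 1^-$ comes from the limit theorem for $R_{12}$, which no fixed small n can exhibit.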

Low temperature
Next, we show that the boundedness of the operator norm cannot extend past β = 1:

Proposition 4.2. For every β > 1, there exists a constant c(β) > 0 such that the expected operator norm $\mathbb{E}\,\|\mathrm{cov}(\mu_{\beta,n})\|_{\mathrm{op}}$ is at least of order $c(\beta)\sqrt{n}$.

Proof. The same argument as in Proposition 4.1 yields a lower bound in terms of the overlap, where we use $|R_{12}| \le 1$ in the last inequality. From Talagrand [Tal11, Equation (14.417)], we have the identity (4.4), whose right-hand side is expressed in terms of $\zeta^*_\beta$, called the Parisi measure, which is the unique minimizer of the Parisi functional; see [AC15] and [Tal11, Pan13] for definitions. It is further known that $\zeta^*_\beta \neq \delta_0$ when β > 1, so the right-hand side in (4.4) is some constant c(β) > 0; see [AMS22, Section 5.2] for more details. The previous two displays yield the desired conclusion.

At the transition
To complete this picture, we now consider the operator norm near and at the critical temperature β = 1. The next result, Proposition 4.3, refines Proposition 4.1 by showing that for a sequence $\beta_n \to 1^-$, one can obtain similar lower bounds on the operator norm with different exponents.

Proof. As usual, we always have the lower bound on the operator norm in terms of the overlap, where the last inequality is by the Cauchy-Schwarz inequality. For a sequence $\beta_n$ satisfying the conditions of [Tal11, Theorem 11.7.1], that theorem implies that $(n(1-\beta_n^2))^{1/2}\cdot R_{12}$ converges in distribution to a standard Gaussian with respect to the measure $\mathbb{E}\langle\cdot\rangle^{\otimes 2}$. An argument using the Portmanteau theorem, analogous to that in Proposition 4.1, then gives a lower bound with some constant c > 0. Combining the two bounds proves the desired inequality.
Remark 1.We remark that this result is essentially the best lower bound one can obtain at the critical point β = 1 using this method of lower bounding the operator norm by the Frobenius norm divided by √ n.
Indeed, it is expected that $\mathbb{E}\langle R_{12}^2\rangle = \Theta(n^{-2/3})$ at β = 1. Assuming this, the lower bound in the above argument would be at most $C\sqrt{n}/n^{1/3} = Cn^{1/6}$ for some constant C > 0.
Finally, we directly consider the behavior at the critical temperature β = 1. While it is expected and intuitive that $\mathbb{E}\langle R_{12}^2\rangle$ is increasing with respect to β, which would imply a lower bound from the previous results, this does not appear to be known. Instead, we show a bound on the second moment of the operator norm (Proposition 4.4), which is somewhat weaker but holds unconditionally by applying known results in this setting.

Proof. We leverage some known results on overlap convergence at β = 1. First, a result of Chatterjee [Tal11, Theorem 11.7.6] gives a lower bound on the third moment of $R_{12}$, valid for some constant L > 0. Next, we need a bound on the decay of the tail probability of $R_{12}$. This is the content of a result of Talagrand [Tal03, Theorem 2.14.5], which applies for β ≤ 1 and for all $x \ge L(\log n/n)^{3/8}$, where L = L(β) < ∞, and where the probability is over the distribution of $R_{12}$ induced by the measure $\mathbb{E}\langle\cdot\rangle^{\otimes 2}$. Taking x equal to $\max\{1, (2/L^3)^{1/4}\}\cdot L$ times a power of $\log n/n$, using (4.9), and rearranging yields

$$\mathbb{E}\langle R_{12}^2\rangle \ge \frac{c}{n^{13/16}(\log n)^{3/16}}, \qquad (4.10)$$

for some constant c > 0. The result follows from the previous display together with the simple lower bound of the operator norm by the Frobenius norm divided by $\sqrt{n}$.

We finally remark that when β < 1, we can prove a constant upper bound on $\mathbb{E}\,\|\mathrm{cov}(\mu_{\beta,n})\|_{\mathrm{op}}^2$ by the same argument used to show our main result, Theorem 1.1. Thus the previous result, Proposition 4.4, shows a quantitative transition in the value of the second moment of the operator norm at β = 1.