LAW OF LARGE NUMBERS FOR THE MAXIMUM OF THE TWO-DIMENSIONAL COULOMB GAS POTENTIAL

We derive the leading order asymptotics of the logarithmic potential of a two-dimensional Coulomb gas at arbitrary positive temperature. The proof is based on precise evaluation of exponential moments, and the theory of Gaussian multiplicative chaos.

1. Introduction

1.1. Setting and main result. We are interested in proving a law of large numbers for the maximal value of the random electrostatic (or logarithmic) potential generated by the particles of a two-dimensional Coulomb gas, sometimes also called a 2d log-gas or "two-dimensional, one-component plasma" (2DOCP).
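The 2DOCP. Let $N \geq 1$, let $X_N := (x_1, \dots, x_N)$ be an $N$-tuple of (distinct) points in $\mathbb{R}^2$, and let

$$H_N(X_N) := \frac{1}{2} \sum_{1 \leq i \neq j \leq N} -\log|x_i - x_j| + N \sum_{i=1}^N \frac{|x_i|^2}{2} \tag{1.1}$$

be the "energy" of $X_N$, given by the sum of all the pairwise logarithmic interactions between points plus the effect of the so-called quadratic confining potential $x \mapsto \frac{|x|^2}{2}$ on each particle. Let $\beta > 0$ be fixed. We will work with the probability measure $\mathbb{P}^\beta_N$ on $(\mathbb{R}^2)^N$ whose density is defined as:

$$d\mathbb{P}^\beta_N(X_N) := \frac{1}{Z_{N,\beta}} \exp\big(-\beta H_N(X_N)\big)\, dX_N, \tag{1.2}$$

where $dX_N := dx_1 \dots dx_N$ is the Lebesgue measure on $(\mathbb{R}^2)^N$ and $Z_{N,\beta}$ is the normalizing constant, or partition function, namely the following integral:

$$Z_{N,\beta} := \int_{(\mathbb{R}^2)^N} \exp\big(-\beta H_N(X_N)\big)\, dX_N.$$

The measure $\mathbb{P}^\beta_N$ is the canonical Gibbs measure of a 2DOCP at inverse temperature $\beta$ and with quadratic confinement, cf. e.g. [For10, Chap. 15].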
Henceforth, we fix an arbitrary value of $\beta > 0$ and $N \geq 2$. We let $X_N = (x_1, \dots, x_N)$ be a random variable in $(\mathbb{R}^2)^N$ distributed according to the Gibbs measure $\mathbb{P}^\beta_N$, and let $\mathbf{X}_N := \sum_{i=1}^N \delta_{x_i}$ be the associated random point measure on $\mathbb{R}^2$. Unless specified otherwise, expectations in what follows are taken under $\mathbb{P}^\beta_N$.
Main result. In the sequel, $D(z, r)$ denotes the closed disk of center $z$ and radius $r$ in $\mathbb{R}^2$; we set $\mathbb{D}_r = D(0, r)$ and $\mathbb{D} = \mathbb{D}_1$. Let $\mathbf{m}_0$ be the uniform probability measure on the unit disk $\mathbb{D}$, and denote its density by $m_0$ (that is, $m_0 = \frac{1}{\pi} \mathbf{1}_{\mathbb{D}}$). As we recall below in Section 2.1, under $\mathbb{P}^\beta_N$ the points $(x_1, \dots, x_N)$ tend to arrange themselves at the macroscopic level according to the so-called equilibrium (or background) measure $\mathbf{m}_0$.
Define the Coulomb gas potential generated by $X_N = (x_1, \dots, x_N)$ (and the background measure) as the following random real-valued field on $\mathbb{R}^2$:

$$\mathrm{Pot}_N(z) := \sum_{i=1}^N \log|z - x_i| - N \int_{\mathbb{R}^2} \log|z - x|\, d\mathbf{m}_0(x). \tag{1.3}$$

In physical terms, $\mathrm{Pot}_N(z)$ corresponds to the value at $z$ of the electrostatic potential generated by the system of charges and the background measure $\mathbf{m}_0$. Obviously $\mathrm{Pot}_N(z)$ is equal to $-\infty$ whenever $z$ coincides with one of the point charges. Our main result is the following description of the maximum of $\mathrm{Pot}_N$ over closed disks in the interior of $\mathbb{D}$:

Theorem 1 (LLN for the max of the 2D Coulomb gas potential). For all $r \in (0, 1)$ we have:

$$\lim_{N \to \infty} \frac{1}{\log N} \max_{z \in \mathbb{D}_r} \mathrm{Pot}_N(z) = \frac{1}{\sqrt{\beta}} \quad \text{in probability.}$$

As a byproduct of our proof, we also obtain a control on a certain regularization of $\mathrm{Pot}_N$ at microscopic scale (see Corollary 3.9), which allows us to state a uniform control on the fluctuations of linear statistics for a certain class of $C^2$ test functions (see Proposition 3.10).
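For orientation, the background term of the potential can be computed in closed form inside the unit disk. The worked equation below is a sketch, assuming that $\mathrm{Pot}_N(z)$ is the sum of the $\log|z - x_i|$ minus $N$ times the logarithmic potential of the uniform measure on $\mathbb{D}$ (our reading of the definition above); the symbol $U$ and the radial variable $\rho$ are introduced here for illustration only.

```latex
% Logarithmic potential of the uniform probability measure m_0 on the unit disk.
% The average of log|z - x| over a circle |x| = rho equals log max(|z|, rho).
U(z) := \int_{\mathbb{D}} \log|z - x| \, \frac{dx}{\pi}
      = \int_0^1 \log\max(|z|, \rho) \, 2\rho \, d\rho
      = |z|^2 \log|z| + \Big[ \rho^2 \log\rho - \tfrac{\rho^2}{2} \Big]_{\rho = |z|}^{1}
      = \frac{|z|^2 - 1}{2}, \qquad |z| \leq 1.
% Under the assumed form of the potential, for z in the unit disk:
\mathrm{Pot}_N(z) = \sum_{i=1}^N \log|z - x_i| - \frac{N}{2} \big( |z|^2 - 1 \big).
```

As a consistency check, $\Delta U = 2 = 2\pi m_0$ on $\mathbb{D}$, in agreement with $\Delta \log|\cdot| = 2\pi \delta_0$.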
1.2.Connections with the literature.Our interest in Theorem 1 is motivated by connections with the theory of random matrices and the theory of logarithmically correlated fields.
The Ginibre ensemble and normal matrix models. For the specific value $\beta = 2$ of the inverse temperature in (1.2), $\mathbb{P}^\beta_N$ coincides with the joint distribution of eigenvalues for Ginibre matrices (i.e. matrices whose entries are i.i.d. complex Gaussians normalized by $1/\sqrt{N}$); see again [For10, Chap. 15] or the recent survey [BF22]. In particular, the Coulomb gas potential $z \mapsto \mathrm{Pot}_N(z)$ in (1.3) can then be interpreted as a log-characteristic polynomial, namely the logarithm of the absolute value of the characteristic polynomial of the associated complex Ginibre matrix. In this ($\beta = 2$) case, the statement of Theorem 1 was proved in [Lam20]. We also mention [WW19], where exponential moments of $\mathrm{Pot}_N(z)$ are computed when $\beta = 2$; from the results of [WW19], the upper bound in Theorem 1 can be readily obtained in that case, but even for $\beta = 2$, mixed moments at different points need to be computed in order to obtain the lower bound.
The Ginibre ensemble has the property of being determinantal (see e.g. [HKP+09, Chap. 6.4]), which is akin to being "integrable" and gives access to hard but explicit computations. Note that one can consider more general normal matrix models for which the inverse temperature is still $\beta = 2$ but the "confining" potential $\frac{|x|^2}{2}$ in (1.1) is replaced by other potentials, see e.g. [AHM11, AHM15, AKS21]. These models are also determinantal; in contrast, the present paper fits into a line of work exploring what can be done without the determinantal structure.
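At $\beta = 2$, the potential at the origin can be examined numerically without diagonalizing any matrix. The sketch below relies on Kostlan's theorem, an external fact not taken from this text: for a complex Ginibre matrix with entries of variance $1/N$, the rescaled squared moduli $N|x_i|^2$ have, as a set, the law of independent Gamma$(k, 1)$ variables, $k = 1, \dots, N$. It also assumes that $\mathrm{Pot}_N(0) = \sum_i \log|x_i| + N/2$, which is our reading of the definition of the potential. All function names are ours.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def sample_pot_at_origin(n, rng):
    """One sample of Pot_N(0) = sum_i log|x_i| + N/2 at beta = 2 (via Kostlan's theorem)."""
    total = 0.0
    for k in range(1, n + 1):
        # N * |x_i|^2 is distributed as a Gamma(k, 1) variable
        total += 0.5 * math.log(rng.gammavariate(k, 1.0) / n)
    return total + n / 2.0

def exact_variance_pot_at_origin(n):
    """Var Pot_N(0) = (1/4) * sum_{k=1}^n trigamma(k), as Var log Gamma(k, 1) = trigamma(k)."""
    trigamma = math.pi ** 2 / 6.0  # trigamma(1)
    total = 0.0
    for k in range(1, n + 1):
        total += trigamma
        trigamma -= 1.0 / k ** 2  # trigamma(k + 1) = trigamma(k) - 1/k^2
    return total / 4.0

if __name__ == "__main__":
    rng = random.Random(0)
    n = 1000
    samples = [sample_pot_at_origin(n, rng) for _ in range(200)]
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
    print("empirical variance:", var)
    print("exact variance:", exact_variance_pot_at_origin(n))
    print("(log N + gamma + 1) / 4:", (math.log(n) + EULER_GAMMA + 1.0) / 4.0)
```

Under these assumptions the variance of $\mathrm{Pot}_N(0)$ equals $\frac{1}{4}(\log N + \gamma + 1) + O(N^{-2})$, so it grows like $\frac{1}{4} \log N$, the kind of logarithmic growth one expects from a logarithmically correlated field.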
Logarithmically correlated structure. For the Ginibre ensemble, following a prediction of Forrester and using determinantal techniques, [RV07] proved that the log-characteristic polynomial $\mathrm{Pot}_N$ converges to some planar Gaussian Free Field (GFF). This follows from a central limit theorem (CLT) for fluctuations of linear statistics, i.e. quantities of the form $\sum_{i=1}^N f(x_i) - N \int f \, d\mathbf{m}_0$, with $f$ of class $C^1$. For normal matrix models, similar questions are treated (among other things) in [AHM11, AHM15, AKS21]. The CLT was extended to arbitrary temperatures $\beta > 0$ in [BBNY19, LS18] with a slightly stronger assumption on the regularity of $f$. It implies again that for all $\beta > 0$ the potential $\mathrm{Pot}_N$ converges, in some weak sense, to a two-dimensional GFF. The two-dimensional GFF is a well-known example of a Gaussian logarithmically correlated field, that is, a centered distribution-valued Gaussian field whose covariance kernel has a logarithmic singularity on the diagonal. Such fields can be seen as possessing essentially independent contributions from different dyadic scales, see the discussion in [Bis20]. A question of particular interest is the study of extreme values of such fields, which has been treated e.g. in [DRZ17] in some generality. In view of the convergence results mentioned above, it is thus natural to ask whether the maximum of $\mathrm{Pot}_N$ behaves like the maximum of a 2d logarithmically correlated field, and Theorem 1 shows that this is true for the leading order (see Section 1.5 for a discussion about lower order terms).
One-dimensional analogues. The density (1.2) restricted to the real line (resp. the unit circle) coincides with the eigenvalue distribution of the so-called Gaussian (resp. Circular) β-Ensemble (GβE, resp. CβE), which are Hermitian (resp. unitary) random matrix models, the cases β = 2 and (to a slightly lesser extent) β = 1, 4 being of particular interest.
Extremes of the (logarithm of the) characteristic polynomial for these matrix models have generated much interest, especially in the case of the CUE (corresponding to the CβE with β = 2), due to a celebrated conjecture of Fyodorov, Hiary and Keating [FHK12] that predicts the limiting form of the fluctuations of the maximum and links these to analogous fluctuations for the Riemann zeta-function.
This has stimulated much recent work, starting with [ABB17], [PZ18] (for the CUE), [CMN18] and [PZ22] (for the CβE); the last article indeed proves a version of the FHK conjecture.On a related subject, recall that a Gaussian multiplicative chaos (GMC) on the unit circle is the weak limit of measures whose densities with respect to the Lebesgue measure are the exponential of a smoothed version of a logarithmically correlated Gaussian field, properly normalized (see [RV14] for background).It has been established that powers of the CUE characteristic polynomial, viewed as the density of a measure with respect to the Lebesgue measure on the unit circle, converge to a GMC in the so-called L 1 phase [NSW20].Such measures are limits of exponentials of smoothed logarithmically correlated fields and they describe the fluctuations of thick points (extreme level sets) of the characteristic polynomial [JLW22].Weaker analogous results for the case of GUE are contained in [LP19] and [CFLW21], see also [BMP22] and [ABZ23] for the GβE.

Connection with the Quantum Hall Effect. The probability density $\mathbb{P}^\beta_N$ is connected to the study of the (fractional) Quantum Hall Effect (QHE) through the so-called Laughlin wave function (more precisely, its absolute square). There is a huge literature on the QHE; let us simply refer to the expository text [Rou19], which presents the connection with Coulomb gases using a terminology very close to ours. As we will explain next, our proof of Theorem 1 relies on computing the asymptotics of (joint) exponential moments of $\mathrm{Pot}_N$. These asymptotics are also related to the problem of determining the statistics of certain types of quantum quasi-particles arising in fractional QHE experiments, by considering modifications of the Laughlin function as trial states. This connection is explained in further detail in [LLR22, Section 1.3]; note however that the estimates required for that problem go beyond the precision achieved here.

1.3. Comments on the strategy.
Existing strategies for logarithmically correlated fields.As noted above, a general methodology to handle extremes of Gaussian logarithmically correlated fields has been developed in the last decades; due to space limitations, we do not review here the history, and refer instead to [DRZ17] and [Bis20] for details.When applied outside the Gaussian context, this methodology requires the decomposition of the field to (essentially) independent contributions from different dyadic scales, the evaluation of exponential moments of the field, together with the introduction of certain barriers, that is restrictions on the partial sums of these contributions; we refer again to [DRZ17] for precise definitions.These techniques seem crucial in obtaining sharp results (at the level of O(1) fluctuations for the extremes), and are often hard to implement outside the Gaussian setup.For example, the works mentioned above concerning the extremes of the electrostatic potential for log-gases on the unit circle (which, for finite N , are not Gaussian fields) use, at different levels of precision, variants of these methods, with much technical work going into obtaining decomposition of the fields, inserting appropriate barriers, and controlling comparisons with the Gaussian setup.
In this paper, we crucially rely on an observation made in [LOS18] and [CFLW21] using GMC theory in order to obtain the leading order of fluctuations; one can bypass the use of barriers and sharp results on asymptotic independence often obtained by computing characteristic functions, at the cost of obtaining sharp estimates on exponential moments of Pot N .

Related computations of exponential moments.
In the context of log-gases, evaluation of exponential moments is at the heart of many proofs of the central limit theorem for linear statistics. An early application of this method is due to Johansson [Joh98] for eigenvalues of random Hermitian matrices, which can be mapped to certain one-dimensional log-gases (at β = 2). He then proceeded to analyse the log-gas model for all β. This was later refined in e.g. [Shc14]. For random normal matrix models, a proof of the CLT for linear statistics going along the same lines is sketched in [AHM11, Sec. 7.2]. For the two-dimensional case at arbitrary temperature, the method inspired by Johansson was implemented in [LS18, BBNY19, Ser23], with additional analytic challenges compared to the 1d case. Our analysis is based on this approach, which we review below.
The electric energy approach. The analysis of the Coulomb gas used here (Section 2 and Section 3 as well as the Appendix) relies on the general "electric energy" approach to 2DOCP's as developed by Serfaty and co-authors, starting with [SS15]. In particular we use:
• The "splitting formula" of Sandier-Serfaty and the general idea of working with the equilibrium measure $\mathbf{m}_0$ instead of the confining potential, which involves a slight re-writing of the Gibbs measure $\mathbb{P}^\beta_N$, bringing us closer to the physics point of view on 2DOCP's, see Section 2.1.
• The electric energy and its local versions, together with local laws, i.e. good controls on the exponential moments of the local energy up to the microscopic scale (which is crucial for us) as in [AS21], see Section 2.2.
• The general spirit of controlling fluctuations through the electric energy, see Section 2.3.
• The "transportation" approach introduced in [LS18] and the fine energy expansion along a transport found in [Ser23].
• Computation of exponential moments of linear statistics, relative expansion of partition functions and Serfaty's "smallness of anisotropy" trick, for which up-to-date statements are found in [Ser23].
The last two items are the heaviest technically, and we postpone their detailed discussion to the Appendix.
1.4. Sketch of the proof. The proof of Theorem 1 is split into an upper bound on the typical maximal value of the electrostatic potential together with a matching lower bound. In both parts, the goal can be understood as making a comparison to an ideal "Gaussian case" where the values of $\mathrm{Pot}_N$ would be a Gaussian logarithmically correlated field. However, one can ignore the correlations for the upper bound.
Upper bound. The core of the proof for the upper bound (in Section 3) is a good control of exponential moments for the values of $\mathrm{Pot}_N$, i.e. for linear statistics of the form

$$\sum_{i=1}^N \log|z - x_i| - N \int \log|z - x| \, d\mathbf{m}_0(x).$$

In general, moments of linear statistics can be controlled either by purely energy-based considerations (see Lemma 2.4) or in a more precise fashion using the results of [Ser23]. The first option is fairly robust but it usually yields sub-optimal estimates, so we would like to use the second option, which is well-suited to linear statistics of smooth, compactly supported functions living at a certain "scale". However, when considering the test function $x \mapsto \log|z - x|$ (for $z \in \mathbb{D}_1$) several problems arise:
(1) It is inherently multi-scale.
(2) It is singular near $x = z$, whereas the results of [Ser23] require a few derivatives.
(3) It is not compactly supported. In fact, as an inspection of the proof of [Ser23] reveals, the real issue is not the lack of compact support but rather the fact that the total mass of its Laplacian is not 0.
In order to be able to deal with the first item, i.e. to treat test functions that live on different scales, we go back to the proof of [Ser23] and make the necessary adaptations. As a first step, we recast the result of [Ser23] in a slightly different way (Proposition 2.6), which we then use iteratively to treat the multi-scale setting (Proposition 2.9). This is carried out in Section 2, with proofs postponed to the appendices. To deal with the second and third items, we regularize the test function $x \mapsto \log|z - x|$ and we "center" it: namely, we subtract some well-chosen test function so that the Laplacian of the difference has total mass 0.
• The regularization procedure is standard. By a simple trick (Lemma 3.5) using sub-harmonicity, we see that bounding the maximal value of the regularized version of $\mathrm{Pot}_N$ is enough to control the maximal value of the "true" $\mathrm{Pot}_N$.
• The "centering" is constructed ad hoc in Section 3.1, and Proposition 3.2 (relying on a bound for exponential moments) guarantees that the presence of the centering does not matter when evaluating the maximal value of the field.
The work done so far allows us to estimate the exponential moments of linear statistics corresponding to an appropriately regularized version of $x \mapsto \log|z - x|$ for any fixed $z$ in the interior of the unit disk. This is only a point-wise bound (recall that we want to control the maximal value over all $z$'s) but the probabilistic tails obtained through Markov's inequality are good enough to go through a big union bound and to control the maximal value of the potential over all points of the disk living on a very fine lattice of stepsize $\approx N^{-1/2-\delta}$ with $\delta$ small, see Lemma 3.7. Finally, it remains to extend the result from that lattice to the whole disk. This requires a control of the difference between the potential felt at two points separated by $\approx N^{-1/2-\delta}$, which is presented in Proposition 3.8.
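The arithmetic behind the union bound can be sketched as follows. This is a heuristic illustration only: we assume, for a regularized potential denoted here $\widetilde{\mathrm{Pot}}_N$, an exponential moment bound of the form $\mathbb{E}[e^{\gamma \widetilde{\mathrm{Pot}}_N(z)}] \leq C N^{\gamma^2/(4\beta)}$ uniformly over the relevant points $z$. This form is our assumption, chosen to mimic a Gaussian variable of variance $\frac{1}{2\beta} \log N$; the precise statements are those of Section 3.

```latex
% Markov's inequality at a single point z, for a level \alpha \log N and any \gamma > 0:
\mathbb{P}\big( \widetilde{\mathrm{Pot}}_N(z) \geq \alpha \log N \big)
  \leq N^{-\gamma\alpha} \, \mathbb{E}\big[ e^{\gamma \widetilde{\mathrm{Pot}}_N(z)} \big]
  \leq C \, N^{\frac{\gamma^2}{4\beta} - \gamma\alpha}.
% Optimizing over \gamma gives \gamma = 2\beta\alpha, hence
\mathbb{P}\big( \widetilde{\mathrm{Pot}}_N(z) \geq \alpha \log N \big) \leq C \, N^{-\beta\alpha^2}.
% A lattice of step N^{-1/2-\delta} in the unit disk has O(N^{1+2\delta}) points,
% so a union bound over the lattice yields
\mathbb{P}\Big( \max_{z \in \text{lattice}} \widetilde{\mathrm{Pot}}_N(z) \geq \alpha \log N \Big)
  \leq C' \, N^{1 + 2\delta - \beta\alpha^2}.
```

The right-hand side tends to 0 as soon as $\alpha > \sqrt{(1 + 2\delta)/\beta}$, and letting $\delta \to 0$ recovers the threshold $1/\sqrt{\beta}$.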
Lower bound.The proof of the lower bound is provided in Section 4. As mentioned earlier, it follows the recipe of [CFLW21] and is based on the construction of a sequence of measures obtained from exponentiating regularized versions of Pot N similar to those introduced for the upper bound, see (4.3).
The heart of the proof consists then in showing (see Proposition 4.5) that this sequence of measures converges to a Gaussian Multiplicative Chaos (GMC), which is defined as the limit of similar objects constructed from a Gaussian process. The result of [CFLW21] guarantees that in order to prove the desired convergence, it is enough to obtain asymptotics of exponential moments of linear combinations of regularized logarithms centered at different points, see (4.6). The proof of the latter is done by induction and uses in a crucial way the two-scale statement of Proposition 2.9 (that was already used for proving the upper bound).

1.5. Open problems. The most obvious open question stems from the fact that Theorem 1 only gives the leading order of the maximum of $\mathrm{Pot}_N$. If one wants to actually see the influence of the underlying logarithmically correlated structure, one needs to evaluate (at least) the next order correction, which is expected to be $-\frac{3}{4\sqrt{\beta}} \log\log N \, (1 + o(1))$ due to the log-correlated structure.
The techniques for achieving that go beyond our methods.
In a different direction, it is natural to consider replacing |x| 2 in (1.1) by other growing (real) functions f .Applying our methods to that case requires three ingredients: first, one needs regularity of the background density m 0 , and to modify the electrostatic potential h 0 , see (A.15) accordingly.Second, one would need to modify the background function g of (3.2), which is easy to do in the radial case.And third, one should look for a replacement for Claim A.9, whose proof is based on a scaling argument.In the case of monomials f (x) = |x| q , it is simple to carry out the adaptations, but already in the case of a general (even) polynomial f (|x|) one needs to find a replacement for the scaling argument.
A particular case of interest concerns the real Ginibre ensemble, see [FN07]. There, one needs to deal with the symmetry of the point configuration $X_N$, as well as with the special role of the real axis. This would require significant changes in our derivation, and we leave this as an open problem.
Finally, we expect the result of Theorem 1 to remain true if one considers the maximum of $\mathrm{Pot}_N$ over the whole unit disk (or even the whole plane). We also expect that Proposition 4.2 holds without any regularization; see Remark 4.3.

1.6. Notation.
• If Ω is a measurable subset, X N (Ω) denotes the number of points of X N contained in Ω.
• We denote measures with a bold typeface (e.g. $\mathbf{m}$) and their densities with respect to the Lebesgue measure on $\mathbb{R}^2$ with a roman typeface (e.g. $m$).
With this notation, if $\varphi_\ell := \varphi(\cdot/\ell)$ is the "rescaled" version of $\varphi$ at scale $\ell > 0$ we have
• Integrals with respect to the Lebesgue measure are often written without explicitly mentioning the volume form, i.e. $\int \varphi = \int \varphi(x) \, dx$.
• If $A, B$ are two quantities (depending on various parameters) we write $A \lesssim B$ (or $A = O(B)$) when $|A|$ is bounded by some universal constant times $|B|$. We write $A = o(B)$ or $A \ll B$ if $A/B \to 0$ when a (sometimes implicit) parameter goes to infinity. When no parameter is mentioned, it is understood that the parameter is $N$. We write $A \asymp B$ when $A = O(B)$ and $B = O(A)$.
• When $\mathbf{m}$ is a probability measure on $\mathbb{R}^2$ with continuous density $m$, we denote its relative entropy (with respect to Lebesgue) by $E(\mathbf{m}) := \int_{\mathbb{R}^2} m \log m$.
Acknowledgment.We thank Sylvia Serfaty for communicating early versions of [Ser23] to us and making some statements thereof more easily citable for our purposes.We thank the anonymous referees for comments that improved the presentation of our results, and for a careful reading.

2. Preliminaries
2.1. Re-writing the energy and the Gibbs measure. Recall that $\mathbf{m}_0$ denotes the uniform probability measure on the unit disk. The following result is classical (see e.g. the book [Ser15]):

Lemma 2.1. As $N \to \infty$, under $\mathbb{P}^\beta_N$, the empirical measure of the points $\frac{1}{N} \sum_{i=1}^N \delta_{x_i}$ converges weakly to the equilibrium measure $\mathbf{m}_0$ in probability (and almost surely when coupling all the $\mathbb{P}^\beta_N$'s in the trivial way), with large deviations at speed $N^2$.
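In particular, it makes sense to consider the difference $\frac{1}{N} \sum_{i=1}^N \delta_{x_i} - \mathbf{m}_0$ as encoding the "second-order" behavior (fluctuations) of the system. Starting with [SS15], studies of the 2DOCP have benefited from the following (seemingly simple) rephrasing. Let $\mathbf{X}_N := \sum_{i=1}^N \delta_{x_i}$ be the purely atomic measure of total mass $N$ on $\mathbb{R}^2$ associated to the $N$-tuple of positions $X_N = (x_1, \dots, x_N)$. We define the logarithmic interaction energy $F(\mathbf{X}_N, \mathbf{m}_0)$ as:

$$F(\mathbf{X}_N, \mathbf{m}_0) := \frac{1}{2} \iint_{(\mathbb{R}^2 \times \mathbb{R}^2) \setminus \triangle} -\log|x - y| \, d(\mathbf{X}_N - N\mathbf{m}_0)(x) \, d(\mathbf{X}_N - N\mathbf{m}_0)(y), \tag{2.1}$$

where $\triangle$ denotes the diagonal in $\mathbb{R}^2 \times \mathbb{R}^2$. Recalling that $-\log$ is (up to a multiplicative constant) the Coulomb kernel in $\mathbb{R}^2$, we can think of $F(\mathbf{X}_N, \mathbf{m}_0)$ as being the electrostatic interaction energy of a neutral system made of $N$ point charges placed at $(x_1, \dots, x_N)$ together with a continuous "neutralizing" background $N\mathbf{m}_0$ of opposite charges, and this is indeed the point of view used in the physics literature about 2DOCP's (see e.g. [AJ81]). We also introduce an auxiliary function $\zeta$ (the "effective confinement"), which vanishes on $\mathbb{D}$ and is set to:

$$\zeta(x) := \frac{|x|^2}{2} - \log|x| - \frac{1}{2} \quad \text{for } |x| \geq 1. \tag{2.2}$$

The splitting formula of Sandier-Serfaty ([SS15, Sec. 2]) consists simply in observing that:

$$H_N(X_N) = F(\mathbf{X}_N, \mathbf{m}_0) + N \sum_{i=1}^N \zeta(x_i) + \text{a constant term depending on } N \text{ but not on } X_N. \tag{2.3}$$

In particular, the Gibbs measure $\mathbb{P}^\beta_N$ of (1.2) can be written as:

$$d\mathbb{P}^\beta_N(X_N) = \frac{1}{K^\beta_N} \exp\Big( -\beta \Big( F(\mathbf{X}_N, \mathbf{m}_0) + N \sum_{i=1}^N \zeta(x_i) \Big) \Big) \, dX_N, \tag{2.4}$$

where $K^\beta_N$ is the corresponding partition function, namely:

$$K^\beta_N := \int_{(\mathbb{R}^2)^N} \exp\Big( -\beta \Big( F(\mathbf{X}_N, \mathbf{m}_0) + N \sum_{i=1}^N \zeta(x_i) \Big) \Big) \, dX_N. \tag{2.5}$$

In the sequel, we will work with the expression (2.4) instead of (1.2) for the Gibbs measure of the 2DOCP, and with $F(\mathbf{X}_N, \mathbf{m}_0)$ instead of $H_N(X_N)$ as the "energy" of the system.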
Remark 2.2.In the physics literature about the 2DOCP, the Gibbs measure is often written down as in (2.4) but with an "effective confinement" ζ set to +∞ outside D ("perfect confinement").Our analysis in the present paper applies to this model as well (the only differences would appear when studying properties close to the boundary, which is not our purpose).
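The convergence of the empirical measure to the uniform measure on the disk (Lemma 2.1) can be observed numerically in the special case $\beta = 2$. The sketch below again uses Kostlan's theorem, an external fact not stated in this text: for the complex Ginibre ensemble normalized by $1/\sqrt{N}$, the variables $N|x_i|^2$ are distributed, as a set, like independent Gamma$(k, 1)$ variables, $k = 1, \dots, N$. The function name is ours, and the check only probes radial statistics.

```python
import random

def fraction_in_disk(n, radius, rng):
    """Fraction of the N points with |x_i| <= radius, at beta = 2, via Kostlan's theorem."""
    threshold = n * radius ** 2  # |x_i| <= radius  <=>  N |x_i|^2 <= N radius^2
    count = sum(1 for k in range(1, n + 1) if rng.gammavariate(k, 1.0) <= threshold)
    return count / n

if __name__ == "__main__":
    rng = random.Random(0)
    for radius in (0.25, 0.5, 0.75, 1.0):
        # The uniform measure on the unit disk gives mass radius^2 to the disk of that radius.
        print(radius, fraction_in_disk(2000, radius, rng), radius ** 2)
```

For radii well inside the unit disk, the printed fractions should be close to `radius ** 2`, the mass given by the uniform measure to the corresponding disk.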

2.2. Energy at global and local scales.
Let m be a probability measure on D with a density that is continuous and bounded below by a positive constant on D.
Global energy.If X N is a N -tuple of points and X N is the associated atomic measure of mass N , we extend the definition (2.1) and define the "global energy" F(X N , m) as: (2.6) We introduce the associated Gibbs measure (where with the corresponding partition function: It is known (see e.g. the pioneering analysis of [SS15, Thm.1]) that for all β > 0: (2.9) One should thus think of F(X N , m) + 1 4 N log N as being the "interesting" global energy term, which is random and typically of order N .In fact, in the 1 4 N log N term, log N is related to the logarithm of the "microscopic scale" (here N −1/2 ) and the N factor is simply the total number of particles.This will be useful to keep in mind when encountering the local version below.
Length scales. The system is supported on the unit disk, hence the natural global scale is $\ell = 1$. For many interesting questions it is crucial to understand the system at local scales, i.e. in squares (or disks) of size $\ell \ll 1$. One distinguishes between mesoscopic scales $\ell$ such that $N^{-1/2} \ll \ell \ll 1$ and the microscopic scale $\ell \simeq N^{-1/2}$. A constant $\rho_\beta \geq 1$ depending only on $\beta$ was introduced in [AS21, (1.15)]; it corresponds to the "minimal lengthscale" above which good controls on the energy can be obtained. In this paper when considering a length scale $\ell$ we will always assume that

$$\ell \geq \rho_\beta N^{-1/2}. \tag{2.10}$$

(Since [AS21] work with a different scaling than ours, we need to rescale their $\rho_\beta$ by $N^{-1/2}$. Note that since $\rho_\beta \geq 1$, we always have $N\ell^2 \geq 1$.)

Let $\mathrm{Pot}^{\mathbf{X}_N, \mathbf{m}}_N$ (resp. $\nabla \mathrm{Pot}^{\mathbf{X}_N, \mathbf{m}}_N$) be the true electric potential (resp. true electric field) generated by the "charged system" $\mathbf{X}_N - \mathbf{m}$, namely the map (resp. vector field) defined on $\mathbb{R}^2$ by: We recall that $-\log$ satisfies $-\Delta(-\log) = 2\pi\delta_0$ on $\mathbb{R}^2$ in the sense of distributions. It is easy to check that the following identity is satisfied in the sense of distributions: (2.11) and that $\nabla \mathrm{Pot}^{\mathbf{X}_N, \mathbf{m}}_N$ is (almost surely) in $L^p_{\mathrm{loc}}$ for $p < 2$ yet fails to be in $L^2$ around each point charge.

1 The function $\zeta$ plays almost no role in our analysis, as we are focused on the bulk of the system. For simplicity we keep "the same $\zeta$" in all cases.

Truncations.
In order to handle the singularities, one often proceeds to a truncation of the fields near each point charge. For $\eta > 0$ we let $f_\eta$ be the function:

$$f_\eta(x) := \big( \log\eta - \log|x| \big)_+ .$$

For each $i = 1, \dots, N$ let $\eta_i$ be a positive real number. For a fixed choice of $\vec{\eta} := (\eta_1, \dots, \eta_N)$, we let $\nabla \mathrm{Pot}^{\mathbf{X}_N, \mathbf{m}, \vec{\eta}}_N$ be the truncated electric field given by:

$$\nabla \mathrm{Pot}^{\mathbf{X}_N, \mathbf{m}, \vec{\eta}}_N := \nabla \mathrm{Pot}^{\mathbf{X}_N, \mathbf{m}}_N - \sum_{i=1}^N \nabla f_{\eta_i}(\cdot - x_i).$$

For $i = 1, \dots, N$ define the "nearest-neighbor" distance $r_i$ as:

$$r_i := \frac{1}{4} \min\Big( \min_{j \neq i} |x_i - x_j|, \; N^{-1/2} \Big). \tag{2.12}$$

We let $\vec{r} := (r_1, \dots, r_N)$, which can serve as a convenient choice of truncation. Note that the $r_i$'s are never larger than $\frac{1}{4} N^{-1/2}$.

3. Electric formulation of the energy. The following identity (see e.g. [AS21, Lemma 2.2]) can be considered as the starting point of the "electric energy" approach: (2.13) To summarize where we stand: the interaction energy between the particles plus the effect of the potential, or equivalently (by (2.3)) the electrostatic interaction energy $F(\mathbf{X}_N, \mathbf{m})$ of the charged system "point charges minus background", can be (by (2.13)) rephrased as a certain "electric energy", made of:
(1) The square of the $L^2$ norm of the corresponding electric field $\int_{\mathbb{R}^2} |\nabla \mathrm{Pot}^{\mathbf{X}_N, \mathbf{m}, \vec{r}}_N|^2$ after a suitable truncation (this is the role of the $r_i$'s). This term is typically of order $N$.
(2) A "renormalization" term $\sum_{i=1}^N \log r_i$. This term is equal to the constant $-\frac{1}{2} N \log N$ plus something (typically) of order $N$.
(3) A correction term $\sum_{i=1}^N \int_{D(x_i, r_i)} f_{r_i}(t - x_i) \, d\mathbf{m}(t)$ which is deterministic and bounded uniformly in $N$ (with constant depending on $\mathbf{m}$).
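The truncation radii and the renormalization term of item (2) are easy to compute for a given configuration. The sketch below assumes that the nearest-neighbor distance takes the form $r_i = \frac{1}{4} \min(\min_{j \neq i} |x_i - x_j|, N^{-1/2})$, which is consistent with the remark that the $r_i$'s never exceed $\frac{1}{4} N^{-1/2}$ but is our reading rather than a quoted definition. Function names are hypothetical, and the points are assumed to be distinct.

```python
import math

def truncation_radii(points):
    """r_i = (1/4) * min(nearest-neighbor distance of x_i, N^{-1/2}) for distinct points."""
    n = len(points)
    micro = n ** -0.5  # microscopic scale N^{-1/2}
    radii = []
    for i, (xi, yi) in enumerate(points):
        nearest = min(
            (math.hypot(xi - xj, yi - yj) for j, (xj, yj) in enumerate(points) if j != i),
            default=math.inf,
        )
        radii.append(0.25 * min(nearest, micro))
    return radii

def renormalization_term(points):
    """Sum of log r_i, the term compared with -(1/2) N log N in the text."""
    return sum(math.log(r) for r in truncation_radii(points))

if __name__ == "__main__":
    # Four points on a square lattice of spacing N^{-1/2} = 0.5: every r_i equals 0.125.
    lattice = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
    print(truncation_radii(lattice))
    print(renormalization_term(lattice), -0.5 * 4 * math.log(4) - 4 * math.log(4))
```

For a perfectly regular configuration with spacing $N^{-1/2}$, every $r_i$ equals $\frac{1}{4} N^{-1/2}$, so $\sum_i \log r_i = -\frac{1}{2} N \log N - N \log 4$: the constant $-\frac{1}{2} N \log N$ plus a term of order $N$, as described in item (2).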

Local energy.
Going back to the initial definition (1.1) of the energy H N (X N ), it seems that the natural notion of "the energy of X N within a subset Ω" could be to look at: However, due to the long-range character of the logarithmic interaction, this does not "work" and the electric field introduced above turns out to be a more convenient object to deal with.In short, it is better to localize the L 2 norm of the electric field (the field itself remains a global object depending on the entire point configuration) than to localize the interaction.We now go through some definitions found in the beginning of [AS21, Sec.2.3], which will lead us to introduce a proper notion of "local energy".Let Ω be a disk or a square in R 2 and let us take U = R 2 when reading [AS21].In our case, the function h defined in [AS21, (2.20)] is identically 0 and we can ignore it.The potential u defined in [AS21, (2.22)] is nothing but Pot N (as the equation that needs to be solved is exactly (2.11)) up to some irrelevant additive constant.The distance ri introduced in [AS21, (2.23)] (recall that we have U = R 2 so ∂U = ∅) becomes here simply: With a slight abuse of notation we let r be the vector (r 1 , . . .,r N ).The "local energy in Ω" as defined in [AS21, (2.24)] reads: compare with the right-hand side of (2.13).One should have in mind (see below for a precise result) that: (1 i,xi∈Ω log ri is equal to − 1 2 n log N (with n the number of points in Ω) plus a term which is typically of order N × |Ω|.
(3) The last term is again a deterministic, bounded correction.

The local "energy-points" density. It turns out that many relevant error terms can be controlled by the sum of:
(1) The non-negative part of the local energy (2.14) corresponding to the square of L 2 norm of the (truncated) electric field in a domain Ω, (2) and the number of points in Ω.So if z is a point in D and ℓ a length scale we will denote by EnerPts(z, ℓ) the quantity: (2.15) Local laws.Recall the condition (2.10).Due to a weaker understanding of the system near the boundary ∂D, one needs to introduce the following additional condition on (z, ℓ) (where z is a point of D and ℓ is a length scale): (2.17) log with an implicit constant depending only on β.
Thus in view of the definition (2.15), it means that in D(z, ℓ) both the local number of points n and the positive part of the local energy (up to the constant n log N term) are of order N ℓ 2 in exponential moments, and so down to the microscopic scale ℓ ≃ N −1/2 (a crucial improvement over the local laws of [Leb17] which covered all mesoscopic scales).
How to read (2.17) from the literature.The statement of [AS21, Thm.1] treats the electric energy and the number of points separately, is formulated in terms of the local energy (2.14) and not of its positive part only (see (2.15)), moreover the authors work with a different scaling than ours.We now explain how to get the statement that we want to use later, namely the estimate (2.17), from the literature.1. Number of points.Recall that we write here n for the number of points X N (D(z, ℓ)).In [AS21, Theorem 1] there are two statements about n, here we can use [AS21, (1.18)] which controls exponential moments of the discrepancy (which corresponds to n − πN ℓ 2 ).It is not hard to see that it implies (and is in fact a lot stronger than) a bound on the number of points of the form: ( (1) Control on the local energy with an additive constant term due to rescaling: which is [Ser23, Prop.3.5, (1)].
(2) Control on the positive part of the local energy in terms of the full local energy: which is [Ser23, (3.25)].Combining these two statements with (2.18) allows us (in view of the very definition of EnerPts in (2.15)) to derive (2.17) as desired.

2.3. Fluctuations.
Recall that m is a probability measure on D. If ϕ is a m-integrable function, we define "the fluctuation L m N (ϕ) of the linear statistics associated to ϕ for the configuration X N ", as: We write L N (ϕ) for L m0 N (ϕ).If ϕ is Lipschitz, one can always use the following non-optimal but uniform control on L N (ϕ).
Lemma 2.4.Let z be a point of D and ℓ be a length scale such that (z, ℓ) satisfies (2.16).Denote by L z,ℓ the set of all functions that are ℓ −1 -Lipschitz and compactly supported on D(z, ℓ).Then for all t such that |t| is smaller than some constant depending on β, we have: with an implicit constant depending only on β.A similar estimate holds for L, the set of all 1-Lipschitz function with compact support in D(0, 2).
The basic idea is that we have a configuration-wise bound of the form: for ϕ ∈ L z,ℓ , , (see e.g.[AS21, Lemma B.5] for a precise statement), and the result follows from an application of the "local laws" as presented in the previous section, see e.g.(2.17).
If ϕ is assumed to have more regularity, the results of [LS18, BBNY19,Ser23] give a much better estimate on the exponential moments of L N (ϕ), but they are only stated function-wise.The method of this paper, which relies on those results, yields a uniform control for the fluctuations of a certain class of C 2 linear statistics; see Corollary 3.9 and Proposition 3.10 below.
2.4.Re-writing the Laplace transform.When computing exponential moments of fluctuations, our first step will always be the following decomposition into a simple term akin to a variance, and a certain "ratio of partition functions" associated to two closely related 2DOCP's.
Lemma 2.5 (Laplace transforms as ratio of partition functions). Let $\varphi$ be a $C^2$ function whose Laplacian is supported in $\mathbb{D}$ and satisfies $\int \Delta\varphi = 0$. Let $t, s$ be such that: Let $\mathbf{m}_s$ be the probability measure on $\mathbb{D}$ with density $m_s := m_0 + s\Delta\varphi$. The following identity holds: .

Of course, if ∇ϕ is compactly supported then one can integrate by parts and write the "variance
Proof.This follows from elementary manipulations that can be found e.g. in [LS18, Section 2.6].
It is, however, a much older idea from [Joh98] or [AHM11, Sec 7.2] and the references therein.
2.5. Comparison of partition functions. Note that in Lemma 2.5 the density of the "perturbed measure" $\mathbf{m}_s$ is given by $m_s := m_0 + s\Delta\varphi$, and that if $\varphi$ is compactly supported then clearly $\int \Delta\varphi = 0$ (so in particular $\mathbf{m}_s$ again has total mass 1). Most of the analysis of [LS18, BBNY19] is devoted to obtaining good asymptotics for the ratio of the partition functions associated to $\mathbf{m}_s$ and $\mathbf{m}$. One purpose of the next proposition is to extend this analysis to the slightly more general case where $\mathbf{m}_s$ is obtained from $\mathbf{m}_0$ by adding some perturbation $f$ such that $\int f = 0$, without $f$ necessarily "coming" as the Laplacian of a compactly supported test function. This is of course mostly a re-writing exercise and not a major modification. Another purpose is to package together results that are written separately in the literature; in particular we incorporate Serfaty's "smallness of the anisotropy" trick (see [Ser23]) in order to have one single "ready-to-use" statement.
Proposition 2.6 (Main comparison result).Let ℓ be a lengthscale and let z ∈ D such that (z, ℓ) satisfies (2.16).Let C be some positive constant.Let m be a probability measure on D, such that its density m is of class C 3 , with: Moreover, let f be a function of class C 2 supported in D(z, ℓ), such that: (2.23) Then there exists a constant C ′ depending only on C, β and r such that the following holds.For all s ∈ R such that: let m s be the probability measure with density m s := m + sf .We have: We postpone the proof of Proposition 2.6 to Section A.1.The proof relies almost entirely on the analysis of [Ser23] but we could not find there a statement reasonably close to our needs.
Remark 2.7.In the statement of Proposition 2.6 we assume that the parameter s is O ℓ 3/2 N −1/4 , but we eventually apply this result with s of order N −1 as in (2.21), which is always a valid choice for ℓ ≥ ρ β N −1/2 .On the other hand, the well-definiteness of m s as m + sf only requires s to be O(ℓ −2 ).However, for values of s between ℓ 3/2 N −1/4 and ℓ 2 we do not get an interesting estimate.
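The admissibility of s of order N^{−1} in Remark 2.7 is a one-line computation: ℓ^{3/2} N^{−1/4} divided by N^{−1} equals (ℓ N^{1/2})^{3/2}, which is at least ρ_β^{3/2} as soon as ℓ ≥ ρ_β N^{−1/2}. The following minimal sketch checks this numerically; the value `rho_beta = 1.0` and the function name are hypothetical placeholders, not quantities taken from the paper.

```python
import math

def admissible_ratio(ell: float, n: int) -> float:
    """Return (ell^{3/2} * N^{-1/4}) / N^{-1}: the allowed size of s relative to 1/N."""
    return ell ** 1.5 * n ** (-0.25) * n

rho_beta = 1.0  # hypothetical value of the minimal-scale constant

for n in (10**3, 10**6, 10**9):
    for k in (1.0, 10.0, 100.0):
        ell = k * rho_beta / math.sqrt(n)  # a length scale above the microscopic one
        ratio = admissible_ratio(ell, n)
        # The ratio depends only on ell * sqrt(N) ...
        assert math.isclose(ratio, (ell * math.sqrt(n)) ** 1.5, rel_tol=1e-9)
        # ... and is bounded below by rho_beta^{3/2}, uniformly in N.
        assert ratio >= rho_beta ** 1.5 * (1 - 1e-9)
```

The ratio grows with ℓ N^{1/2}, so the constraint on s only becomes looser at mesoscopic scales.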
In Proposition 2.6, we assumed that (z, ℓ) satisfies (2.16) to avoid the case where the perturbation is near the boundary (because local laws are not known to hold near the boundary).The only case where we can handle a perturbation that intersects the boundary is the one of a "macroscopic perturbation" (ℓ = 1), which will be enough for our purposes.This is covered by the following lemma.
Lemma 2.8 (Main comparison, macroscopic case). Let f = 2π(χ − m_0), where m_0 is the uniform density on D and χ is a smooth, radially symmetric function which is compactly supported in D_r for some fixed r < 1, such that ∫ χ = 1 and |χ|_k ≤ C for k = 0, 1, 2. Then there exists a constant C′ depending only on C, β and r such that the following holds: for all s ∈ R with |s| ≤ 1/C′, let m′_s be the probability measure with density m′_s := m_0 + sf. We have: with an implicit multiplicative constant depending on C, β and r.
Lemma 2.8 is proven along the same lines as Proposition 2.6; the radial symmetry of both f and m_0 provides a simplification which allows us to efficiently treat the boundary case (note also that we are aiming at a less precise estimate, compare (2.26) with (2.25)). We give the proof in Section A.1.
The next result builds upon Proposition 2.6 and treats a situation where the perturbation f is made of two pieces living at different scales.
Proposition 2.9 (Comparison with mass transfer between scales).Let ℓ a < ℓ b be two length scales and let z a , z b ∈ D be two points such that both (z a , ℓ a ) and (z b , ℓ b ) satisfy (2.16).Let C be some positive constant and let m be a probability measure on D such that its density m is (on D) of class C 3 with: Assume that D(z a , ℓ a ) and D(z b , ℓ b ) are both contained in D r for some r < 1, so that for N large enough (depending on r), condition (2.16) is satisfied.Then there exists a constant C ′ depending only on C and β such that the following holds.For all s ∈ R such that: let m s be the probability measure with density m s := m + sf .We have: , with a multiplicative constant depending only on C, β and r.
We postpone the proof of Proposition 2.9 to Section A.3. It uses the same techniques as in [Ser23], but the statement is new.
Remark 2.10. It is important to observe that in the conclusions of Proposition 2.6 (resp. Proposition 2.9), if we work at the microscopic scale ℓ = ρ_β N^{−1/2} (resp. ℓ_a = ρ_β N^{−1/2}), then for s of order N^{−1} (which will be our choice later on) the error term (the last term in (2.25) and (2.30)) is O(1) as expected, whereas as soon as the length scale ℓ (resp. ℓ_a) is mesoscopic there is a gain and the error term becomes o(1).
As we record in the next remark, Proposition 2.6 yields the classical CLT for (arbitrary smooth mesoscopic) test functions supported inside D, see [LS18] and [BBNY19] for the original results. A CLT-like precision will be required to show that the exponential of a regularization of Pot_N converges to a certain GMC measure in Section 4, which in turn is instrumental in obtaining a lower bound for the maximum of Pot_N.
Remark 2.11. Let ϕ ∈ C^4(R^2 → R) be a function (possibly depending on N) and assume that ∆ϕ = f, where f is as in Proposition 2.6 or Proposition 2.9. Then by combining Lemma 2.5 and Propositions 2.6 or 2.9, we get: where m = m_0 − f/(2πNβ). Here we used that the uniform measure m_0 does satisfy the conditions (2.22) and (2.27). Moreover, since f is supported in D with ∫ f = 0 and |f|_0 ≤ Cℓ^{−2}, we have In particular, L_N(ϕ) converges in distribution to a Gaussian random variable with mean 0 and variance (1/(2πβ)) ∫_{R^2} |∇ϕ|^2 (this last expression follows by an integration by parts).
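The integration by parts behind the variance formula is the identity ∫ |∇ϕ|^2 = −∫ ϕ ∆ϕ. As a hedged illustration only, the sketch below checks it numerically for the radial Gaussian ϕ(x) = exp(−|x|^2/2). This test function is an assumption made for convenience: it is not compactly supported (so it does not satisfy the hypotheses of Remark 2.11), but both integrals are explicit and equal to π, and its Laplacian (|x|^2 − 2)ϕ has total mass zero.

```python
import math

def radial_integral(f, r_max=12.0, n=100_000):
    """Midpoint rule for 2*pi * int_0^{r_max} f(r) r dr (integral of a radial function on R^2)."""
    h = r_max / n
    return 2 * math.pi * h * sum(f((i + 0.5) * h) * (i + 0.5) * h for i in range(n))

def phi(r):
    """Hypothetical test function: a radial Gaussian."""
    return math.exp(-r * r / 2)

def grad_sq(r):
    """|grad phi|^2 = r^2 exp(-r^2)."""
    return r * r * math.exp(-r * r)

def laplacian(r):
    """In dimension 2: phi'' + phi'/r = (r^2 - 2) phi."""
    return (r * r - 2) * phi(r)

dirichlet = radial_integral(grad_sq)                            # int |grad phi|^2
by_parts = -radial_integral(lambda r: phi(r) * laplacian(r))    # -int phi * Laplacian(phi)

def limiting_variance(beta):
    """Variance (1 / (2 pi beta)) * int |grad phi|^2 from Remark 2.11."""
    return dirichlet / (2 * math.pi * beta)
```

For this choice of ϕ the limiting variance would be 1/(2β), which makes the dependence on the inverse temperature explicit.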

3. Law of large numbers: upper bound
The goal of this section is to prove the upper bound part of Theorem 1, namely:
Proposition 3.1. Recall the definition (1.3) of Pot_N. For all fixed r ∈ (0, 1) and all α > 1, we have: (3.1) lim
In the rest of this section we fix some r < 1.
An important technical observation is that ∫ ∆ log_z = 2π so that (even after a mesoscopic regularization), one cannot directly apply Proposition 2.6 to control the exponential moments of L_N(log_z). To fix this issue, we can consider instead the fluctuation of the test function log_z − g where g is some nice function with ∫ ∆g = 2π. Of course, one should be able to say something about the fluctuations L_N(g). It is particularly convenient to make the following choice: let χ be a radially symmetric smooth function (independent of N) supported in D_r with ∫ χ = 1, and let (3.2) g be a solution of Poisson's equation ∆g = 2πχ.
Then, the following estimate shows that the fluctuations of L N (g) are negligible compared to the maximum of Pot N .
Proposition 3.2. One has (for N large enough depending on β, r): P[ |L_N(g)| ≥ (log N)^{0.8} ] ≤ exp(−(log N)^{1.5}). The value 0.8 is arbitrary, the point being that 0.8 < 1 while 1.5 > 1 and thus the probabilistic tail is better than algebraic in N. The proof of Proposition 3.2 will be given in Section A.4 and the argument actually shows that L_N(g) is typically of order 1 as expected, see Corollary A.7.
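The reason for choosing χ radial in (3.2) can be seen from the radial form of Poisson's equation: if ∆g = 2πχ with g radial, then (r g′(r))′ = 2π r χ(r), so r g′(r) equals the mass of χ in D(0, r). Outside the support of χ this mass is 1, hence g coincides with log|x| up to an additive constant there. The sketch below illustrates this numerically; the support radius `R0 = 0.5` and the specific bump profile are hypothetical choices, not the χ of the paper.

```python
import math

R0 = 0.5  # hypothetical support radius of chi (any fixed value < 1 would do)

def bump(t):
    """Unnormalized smooth radial profile supported in the disk of radius R0."""
    u = 1.0 - (t / R0) ** 2
    return math.exp(-1.0 / u) if u > 0.0 else 0.0

def disk_mass(f, r, n=20_000):
    """Midpoint rule for 2*pi * int_0^r f(t) t dt, the mass of the radial density f in D(0, r)."""
    h = r / n
    return 2 * math.pi * h * sum(f((i + 0.5) * h) * (i + 0.5) * h for i in range(n))

Z = disk_mass(bump, R0)  # normalizing constant so that chi has total mass 1

def chi(t):
    """Normalized radial density with integral 1 over R^2."""
    return bump(t) / Z

def r_times_g_prime(r):
    """For radial g with Laplacian 2*pi*chi: r * g'(r) = mass of chi in D(0, r)."""
    return disk_mass(chi, r)
```

Since r ∂_r log|x| = 1 everywhere, the difference log_0 − g is constant outside D(0, R0), which is the radial instance of the statement that the Laplacian of log_z − g has total mass zero.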

3.2. A regularized version of the potential.
Let ρ be a radial C^∞ mollifier supported on the unit disk and, for ε ∈ (0, 1), let ρ_ε := ε^{−2} ρ(·/ε) and let ϕ_{z,ε} be defined as: ϕ_{z,ε} := ρ_ε ⋆ log_z − g, where g satisfying (3.2) is independent of N, z and ε. We think of ϕ_{z,ε} as being an alternative to log_z which is regularized in two ways: the singularity near z is removed by convolving with a mollifier, and the Laplacian of ϕ_{z,ε} has total mass 0. Then, in view of Proposition 3.2, in order to prove Proposition 3.1, we first focus on controlling the exponential moments of L_N(ϕ_{z,ε}).
Proposition 3.3. For some constant C ≥ 1 depending only on β, for all fixed r ∈ (0, 1), for all N large enough (depending on r, β), for all ε such that CN^{−1/2} ≤ ε ≤ (1 − r)/2 and for all t such that |t| ≤ C^{−1} N ε^2, for any z ∈ D_r, we have: .
By O ε (1) we mean a term that is bounded as ε → 0 independently of N (this term depends only on β, which is fixed here).By O N (1) we mean a term that is bounded as N → ∞ independently of ε (this term depends on β and r, which here are both fixed).
Proof of Proposition 3.3. First, we impose that C is large enough (depending only on β) so that the length scale CN^{−1/2} is larger than the minimal length scale ρ_β mentioned in Section 2.2. Next, observe that ϕ_{z,ε} is a C^∞ function whose Laplacian is supported in D_{r+ε} and has mean zero. Moreover, |ϕ_{z,ε}|_k is of order ε^{−k} for any k ∈ N. Hence, for |t| ≤ C^{−1} N ε^2 (with C large enough depending only on β) we may apply Lemma 2.5 and write: , with s = −t/(2πNβ) and m_s := m_0 + s∆ϕ_{z,ε}. We have: and by scaling we can see that: R 2 (log , so we may write: . The assumptions of Proposition 2.9 are fulfilled with m = m_0, f_a = 2πρ_ε, f_b = −2πχ, z_a = z, z_b = 0, ℓ_a = ε and ℓ_b fixed (independent of ε and N). We obtain: , where the implicit multiplicative constant depends only on β and r. A direct computation shows that E(m_s) = E(m_0) + O(s) (cf. Remark 2.11), so, as Ns = O(t) and ε ≥ CN^{−1/2}, we conclude that . This completes the proof, the dominant error term being the first one in the right-hand side.

3.3. Lattice approximation and proof of the LLN upper bound.
We have obtained in Corollary 3.4 a control on the exponential moments of L N (ϕ z,ε ) for a fixed z ∈ D r .In the first step below, we show that it yields an upper bound on the typical values of Pot N (z) for a fixed z.
Then we get a bound controlling the (typical) values of Pot N at all points z on a sub-microscopic lattice contained in D r .Finally, we turn it into a uniform control of Pot N over D r .
1. Regularization and one-point tail estimate. First, we observe in Lemma 3.5 below that when considering the maximal value of Pot_N, we can regularize the log at the microscopic scale with a small cost. For ε > 0, let Pot_{N,ε} be the map: with ρ_ε as in Section 3.2. The test function log z := ρ ε ⋆ log z is smooth, and for z, x ∈ R^2, we have:
Lemma 3.5. For ε > 0 and z ∈ D we have, with a universal implicit constant:
Proof. Recall that we are interested in the value of Pot_N(z) for some fixed z and that by definition: By subharmonicity of log, we have log_z ≤ ρ_ε ⋆ log_z pointwise, which implies that On the other hand a direct computation (using e.g. Newton's theorem) yields: (3.9) Combining (3.8) and (3.9) gives: using the definition (3.6) of Pot_{N,ε}, which proves the claim.
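Both ingredients of this proof, subharmonicity and Newton's theorem, come from the mean value property of the logarithm in the plane: the average of log|x − y| over a circle of radius r centered at z equals log max(|x − z|, r). The following sketch checks this numerically and then applies it to a radial average at scale ε. As a simplifying assumption, the weight used is the uniform density on the disk of radius ε rather than a smooth mollifier ρ_ε; the function names are illustrative only.

```python
import math

def circle_average_log(dist, radius, n=4096):
    """Average of log|x - y| over the circle |y - z| = radius, where |x - z| = dist != radius."""
    total = 0.0
    for k in range(n):
        theta = 2 * math.pi * (k + 0.5) / n
        dx = dist - radius * math.cos(theta)
        dy = radius * math.sin(theta)
        total += 0.5 * math.log(dx * dx + dy * dy)
    return total / n

def mollified_log(dist, eps, n=2000):
    """Radial average of log max(dist, r) against the uniform density on the disk of radius eps.

    By Newton's theorem this equals the convolution of log_z with that density,
    evaluated at a point x with |x - z| = dist.
    """
    h = eps / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        weight = 2.0 * r / eps**2  # radial density of the uniform law on the disk
        total += weight * math.log(max(dist, r)) * h
    return total
```

Outside the disk of radius ε around z the regularized function agrees with log_z, and inside it is larger, which is the pointwise inequality used in the proof.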
For the rest of the section, we fix: where the constant C is as in Corollary 3.4 (depending only on β).By Lemma 3.5 we have, with an implicit constant depending only on β: Lemma 3.6.Fix a point z ∈ D r and let α ∈ (1, 2).We have: (3.12) with an implicit constant depending on β, r.
We note that the upper bound α < 2 is imposed for convenience only; one could obtain similar estimates for larger α. For our needs, any α > 1 will do.
with C as in Corollary 3.4. We may thus use (3.5) and write: with an implicit constant depending only on r, β. So, Markov's inequality yields: (3.13) Note that since (3.4) is valid for t ∈ R, the same bound holds for −L_N(ϕ_{z,ε}) as well. On the other hand, by Proposition 3.2, we have |L_N(g)| ≤ (log N)^{0.8} with probability at least 1 − exp(−(log N)^{1.5}). Since by definition L_N(ϕ_{z,ε}) = Pot_{N,ε}(z) − L_N(g), we may convert the control (3.13) on L_N(ϕ_{z,ε}) into a similar estimate for Pot_{N,ε}(z), as stated.
2. Tail estimate on a lattice. Let α ∈ (1, 2) and fix δ (depending on α) such that: Lemma 3.7 (Tail estimate on the lattice). Let α be in (1, 2). We have: with an implicit constant depending on β, r. In particular we have: as N → ∞, the decay being in fact algebraic in N.
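The step from exponential moments to a tail bound is the usual Chernoff optimization of Markov's inequality, P(X ≥ a) ≤ e^{−ta} E e^{tX}. The sketch below assumes, hypothetically, a bound of the generic form log E e^{tX} ≤ v t^2/2 + c valid for 0 ≤ t ≤ t_max; the constants v, c and t_max are placeholders and are not the ones appearing in (3.5).

```python
import math

def chernoff_log_tail(a, v, c, t_max=math.inf):
    """Upper bound on log P(X >= a), assuming log E exp(tX) <= v*t^2/2 + c for 0 <= t <= t_max.

    The exponent c + v*t^2/2 - t*a is convex in t with minimizer t = a/v;
    if that value exceeds t_max, the best admissible choice is t = t_max.
    """
    t = min(a / v, t_max)
    return c + 0.5 * v * t * t - t * a

# Unconstrained case: the bound is c - a^2 / (2v).
print(chernoff_log_tail(a=2.0, v=1.0, c=0.0))
# Constrained case: the restriction on t weakens the bound.
print(chernoff_log_tail(a=2.0, v=1.0, c=0.0, t_max=1.0))
```

The restriction |t| ≤ C^{−1} N ε^2 in Proposition 3.3 plays the role of t_max here, which is why the admissible range of t has to be checked before applying Markov's inequality.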
Proof.The number of points of Λ δ that fall into D r is bounded above and below by positive constants (depending only on r) times N 1+2δ .Thus (3.15) follows from a simple union bound, using the one-point estimate (3.12).From our choice (3.14) for δ we easily deduce (3.16).
3. Extension to the whole disk. It remains to prove a result comparing the maximum of the Coulomb gas potential over the lattice points with its maximum over the whole unit disk. Recall that ε corresponds to some "large enough microscopic" scale, see (3.10).
Proposition 3.8 (The lattice vs. the whole disk). One has
Proof of Proposition 3.8. Let z ∈ D_r, and let z′ be a point in Λ_δ ∩ D_r such that |z − z′| ≤ 4ℓ_δ (such a point always exists because the lattice has been chosen narrow enough). We want to show that, with high probability, the values of the regularized potential at z and z′ are close. Recall the definition (3.6) of Pot_{N,ε}. We can write the difference Pot N,ε (z In view of the bounds (3.7) on the derivatives of log, and since |z − z′| ≤ 4ℓ_δ, we get: In order to control the difference between the values of Pot_N at the lattice points and on the whole disk, we would like to control the size of L_N(G_{z,z′}) uniformly for all (z, z′) as above. Lemma 2.4 does provide such a uniform control, but it applies to functions with a given Lipschitz constant, which is not the case of G_{z,z′}, which is intrinsically multi-scale, see (3.18). We thus need to extend this result from a "single-scale" statement to a "multi-scale" one, the same way that Proposition 2.9 extends Proposition 2.6.

1. Dyadic scale decomposition.
We introduce a sequence of intermediate length scales ℓ_0 < ··· < ℓ_n by taking c, n such that: and by setting ℓ_k := c^k ε for k = 0, . . ., n. Note that c ∈ [1, 2], that n = O(log N), that ℓ_0 = ε and that ℓ_n = (1 − r)/4. We also set ℓ_{n+1} = M, with M large (depending on β but not on N) to be chosen later. Next, we take a family (χ_i)_{0≤i≤n+1} of functions such that: each χ_i living at scale ℓ_i around z′, namely: and we let

Controlling every scale
Using the bounds on G z,z ′ , ∇G z,z ′ and the ones on ℓi -Lipschitz and thus satisfies the assumptions of Lemma 2.4.We may write, using a convexity inequality for exp: where we used that n is of order log N .Since N δ/2 ℓ δ log N ℓi ≪ 1, we may use the control in exponential moments given by Lemma 2.4 and write: We obtain a similar control for the (macroscopic) scale ℓ n+1 , see the last comment in Lemma 2.4.Then Markov's inequality yields: We emphasize that since we are using Lemma 2.4, which provides uniform control for fluctuations of all Lipschitz functions living at scale ℓ around a given point, then for fixed z ′ , the control (3.19) is in fact uniform in z i.e. (3.20) 3. Handling the outliers.On the other hand, since by construction comes from hypothetical outliers living far away from the unit disk, namely outside D(0, M ).A simple large deviation estimate (see e.g.[SS15, (1.48)]) shows that the probability of any point being outside D(0, M ) decays as exp −βN M 2 as N → ∞ for M large enough depending only on β (indeed the non-negative confinement term 2ζ(x) appearing in the Boltzmann factor grows as |x| 2 for large x as can be seen in (2.2)).

Conclusion of the proof of Proposition 3.8. Finally, we obtain (by symmetry) for any fixed z
A union bound over z ′ ∈ Λ δ ∩ D concludes the proof.
Proposition 3.10. For any k ∈ N, there is a constant C_k = C_k(β) so that if N is sufficiently large (depending on β, k and r),
Proof. We use again ε = λ/√N. Let (□_j = □(z_j, ε))_{j=1}^M be a collection of squares centered at points z_j ∈ εZ^2 such that D(0, r) ⊂ ⋃_{j=1}^M □_j ⊂ D(0, 1 − δ) for a small δ > 0. The collection can be chosen so that M ≤ C_β N for some constant C_β depending on β. Using the local laws (2.18) and a union bound, we deduce that for all k ∈ N, there is a constant Let f ∈ F_N, ℓ be the associated length scale, and let f_ε = f ⋆ ρ_ε. Since ρ is a radial mollifier (with compact support in D) and |f|_2 ≤ ℓ^{−2}, one has and both f, f_ε are compactly supported in D(z, 2ℓ) since ℓ ≥ ε. One can tile D(z, 2ℓ) with at most 16(ℓ/ε)^2 squares of sidelength ε, thus on the event introduced in (3.22) we have: Moreover, by an integration by parts we can write ∆f Pot N,ε = L N (f ε ), and using again that |f|_2 ≤ ℓ^{−2} and that ∆f is supported in D(0, r), we deduce: The RHS is controlled by Corollary 3.9 and, by (3.23), on the event introduced in (3.22) we can use this to control L_N(f) uniformly for all f ∈ F_N. Adjusting the constants C_k, this proves the claim.

4. Regularized multiplicative chaos: lower bound
This section is devoted to the proof of the lower bound in Theorem 1.
4.1. Reduction to a regularized version of the potential. Recall that ϕ_{z,ε} := ρ_ε ⋆ log_z − g, see (3.2), is a regularization of log_z at scale ε > 0 and let U ⊂ D be a (non-empty) open ball. By Proposition 3.2, we know that for any (small) δ > 0: lim inf In addition, recall that Pot_N(z) = L_N(log_z) so that The right-hand side is the convolution of the function Pot_N with a non-negative function of total mass 1 and as such (since the maximum of the convolution is less than or equal to the maximum of Pot_N), max where K is a closed ε-neighborhood of U.
The proof of Proposition 4.1 relies on the theory of Gaussian multiplicative chaos and in particular on [CFLW21] and [LOS18]. We review these results in the next section and present the main steps of the proof.
4.2. Multiplicative chaos. We introduce new notation. Let ℓ(k) = e^{−k} for k ∈ N and set For γ > 0, define a sequence of random measures (µ^γ_k)_{k∈N} with density function This density obviously depends on N and β even though it is not emphasized in the notation.
According to Proposition 3.3, if ℓ(k) ≥ CN^{−1/2}, it holds uniformly for x ∈ K, These asymptotics correspond to [CFLW21, Assumptions 3.1]. In fact, using the method developed in [CFLW21, Section 3] and [LOS18, Section 2], we will obtain the following convergence result (with respect to the vague topology for positive measures on D).
Proposition 4.2. Let n(N) be a sequence such that n(N) → ∞ and ℓ(n as N → ∞. Then for any 0 < γ < 2, µ^γ_{n(N)} → GMC_γ in distribution as N → ∞, where GMC_γ is a Gaussian multiplicative chaos measure which is defined shortly.
for N ∈ N instead of (4.3), we expect that the result of Proposition 4.2 remains true, that is, for any 0 < γ < 2, μ̃^γ_N → GMC_γ in distribution as N → ∞ for a slightly different GMC (associated to the GFF with free boundary condition on D). Note that the regime 0 < γ < 2 corresponds to the whole GMC L^1-phase, as γ = 2 is the critical value with our normalization.
We now turn to the definitions of the random measures (GMC γ ) 0<γ<2 .We consider the following log-correlated field.Let C c (D) denote the space of smooth, compactly supported functions on D. Definition 4.4.Let Ψ be a (distribution-valued) Gaussian process on D with mean zero and the following correlation structure: for any with χ a radially symmetric smooth function (independent of N ) supported in D r with χ = 1, as used in (3.2).It is well-known that the RHS defines the covariance of a Gaussian process (this also follows from the CLT of Remark 2.11 which implies that with ϕ , so that with g as in (3.2), the field Ψ has correlation kernel where c ∈ R is a constant, that is the right hand side of (4.5) can be written as f (x)h(z)Σ(x, z)dzdx.
For any k ∈ N, we define Ψ_k := ρ_{ℓ(k)} ⋆ Ψ. This is a (smooth) approximation of Ψ as k → ∞ and, for γ > 0, we also let ν^γ_k be a random measure on D with density function Then, the following convergence result follows from the general theory of multiplicative chaos; see e.g. [Ber17].
Proposition 4.5. For any γ < 2 (subcritical phase), the random measure ν^γ_k → GMC_γ in probability as k → ∞. Moreover, for any (non-empty) open set A ⊂ D, GMC_γ(A) > 0 almost surely.
Hence, in the subcritical phase (γ < 2), as a consequence of Proposition 4.2 and [CFLW21, Theorem 3.4], we obtain Proposition 4.1. Finally, the proof of Proposition 4.2 will be a direct application of [CFLW21, Theorem 2.4] (reproduced there from [LOS18, Theorem 1.7]). Namely, it suffices to show that for any j The next section is devoted to the proof of (4.6).
4.3.Exponential moments asymptotics.We deduce (4.6) from the estimates of Proposition 2.9 by a simple induction.First observe that since Φ k (x) = √ β L N (ϕ x,ℓ(k) ) with ∆ϕ x,ℓ = ρ x,ℓ − g, as a consequence of Remark 2.11 (with f a = ρ x,ℓ = ρ((• − x)ℓ −1 )ℓ −2 and f b = g so that the conditions (2.28) hold), we have uniformly for x ∈ D r , ℓ(k) ≫ N −1/2 and locally uniformly for γ ∈ R. Moreover (by definition of the 2d Green's function for −∆ and by Fubini's theorem), according to Definition 4.4, we have for k, n ∈ N and x, z ∈ D, Thus, since Ψ is a (mean-zero) Gaussian process, we obtain for any with the required uniformity.This establishes that (4.6) holds when j = 1.We now proceed by induction to extend these asymptotics for any j ∈ N with j ≥ 2. Without loss of generality, we assume that k 1 ≤ • • • ≤ k j ≤ n(N ).Then, according to Lemma 2.5, we have 2πN .Moreover, using (4.7), we obtain .
We now apply Proposition 2.9 with f a = γ j ρ xj,ℓ(kj ) and f b = γ j g as above.We emphasize that the conditions (2.28) hold while the reference measure m = m 0 + s∆ϕ j−1 also satisfies with the required uniformity.Now, repeating the argument from Remark 2.11, the entropy E(m + s∆ϕ j ) = o(N −1 ) for any j ∈ N which implies that By induction, this concludes the proof of the asymptotics (4.6).
applying our assumptions (2.22) and (2.23), we can find a vector field ψ, supported on (z, ℓ), such that where C is as in the statement and C is a universal constant.
For s ∈ R, we define a map Φ_s and another probability measure m̃_s by: (A.2) Φ_s := Id + sψ, m̃_s := Φ_s # m.
Here and below, we write # to denote the push-forward of a measure. In view of (A.1) we can guarantee that |Φ_s − Id|_1 ≤ 1/2 as long as the parameter s is chosen smaller than (1/C) ℓ^2 for some C large enough, a condition which is implied by the stronger constraint (2.24) (recall that Nℓ^2 is always larger than 1). We think of Φ_s as an approximate transport map, pushing m forward not quite onto m_s (whose density is m + sf) but rather onto m̃_s. The following lemma quantifies the error in terms of partition functions.
Lemma A.1. The partition functions associated to m_s and m̃_s are close: , and so are their relative entropies: with implicit constants depending only on (A.1) and β.

Comparison of partition functions along a transport.
The so-called "anisotropy term" was introduced in [LS18], cf. also the "angle term" in [BBNY19, Section 8]. We refer to [Ser23, Section 4] for a careful study of its properties. It corresponds to the first-order correction to the energy when one pushes both the configuration and the background measure by a small perturbation of the identity map. Here we will not go into the details and we treat the anisotropy as a black box. The key estimates that we need to import are contained in the following lemma. Let s, ψ, Φ_s, m̃_s be as above (in particular ψ is supported on □(z, ℓ)) and recall that EnerPts(z, ℓ) (defined in (2.15)) controls both the energy and the number of points at scale ℓ near z, and is typically of order Nℓ^2.
Lemma A.2. There exists a term A_1[ψ, m, X_N] (independent of s) satisfying: and such that provided |s| is smaller than ℓ^2: with a "second order" error term ErrorF bounded by: (A.5) ErrorF = O(ℓ^{−4} (1 + log(ℓN^{1/2})) EnerPts(z, ℓ)).
Proof of Lemma A.2.We apply [Ser23,Prop. 4.2] to the vector field ψ chosen above.The main task is to check that [Ser23, (4.7)], which bounds the second derivative of the energy along a transport, can itself be controlled by our ErrorF.
4. Serfaty's trick. It is hard to prove a bound on Ani that is better than (A.3) and holds configuration-wise. However one can improve the control on Ani in exponential moments, using the following trick. Recall the assumption (2.16).
Proof of Lemma A.5. We first take s = s_⋆ := (1/C′) ℓ^{3/2} N^{−1/4} for some large enough constant C′ (if ℓ is of order N^{−1/2} then ℓ^2 and ℓ^{3/2} N^{−1/4} are comparable, thus we need to divide by C′ large enough in order to match our previous assumptions on s; for mesoscopic length scales, this is irrelevant). By comparing the two expressions (A.7) and (A.8) and discarding negligible terms one gets: Using the expression (A.5) for ErrorF and the local law (2.17) we know that: (A.11) log E_{P^β_{N,m}}[exp(s_⋆^2 ErrorF)] ≤ O(ℓ^{−1} N^{−1/2} log(ℓN^{1/2}) Nℓ^2) = O(ℓN^{1/2} log(ℓN^{1/2})).
Combining (A.10) and (A.11) and using the Cauchy-Schwarz inequality we deduce that: Thus for values of s smaller than s_⋆/2, we apply Hölder's inequality and obtain (A.9).
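The power counting in (A.11) can be checked mechanically by tracking exponents of ℓ and N. The sketch below represents a monomial ℓ^a N^b as the pair (a, b), ignoring constants and logarithmic factors; the helper `mul` and the variable names are illustrative, not notation from the paper.

```python
from fractions import Fraction as F

def mul(*monomials):
    """Multiply monomials ell^a * N^b, each encoded as a pair (a, b) of exponents."""
    return (sum(m[0] for m in monomials), sum(m[1] for m in monomials))

s_star_sq = (F(3), F(-1, 2))      # (ell^{3/2} N^{-1/4})^2 = ell^3 N^{-1/2}
error_prefactor = (F(-4), F(0))   # the factor ell^{-4} in the bound (A.5) on ErrorF
ener_pts = (F(2), F(1))           # EnerPts(z, ell) is typically of order N ell^2

intermediate = mul(s_star_sq, error_prefactor)   # ell^{-1} N^{-1/2}
exponent = mul(intermediate, ener_pts)           # ell^{1} N^{1/2}

print(intermediate, exponent)
```

The resulting monomial is ℓ N^{1/2}, in agreement with the right-hand side of (A.11) up to the logarithmic factor.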

Conclusion of the proof of Proposition 2.6. We compare K^β_N(m_s) and K^β_N(m̃_s) using Lemma A.1 and then apply Lemma A.3 to compare K^β_N(m̃_s) and K^β_N(m), using Lemma A.5 to control the anisotropy term. Lemma A.1 also allows us to replace E(m̃_s) by E(m_s) up to some error. Finally one can check that for |s| ≤ s_⋆ the dominant error term is the one coming from (A.9), which yields (2.25).
A.1.2.Proof of Lemma 2.8.Compared to the previous proof, we dispense with Step 1 as one can easily find an exact transport, as well as Steps 3 and 4 because we are not aiming for precise estimates on the anisotropy.Since the reference measure is m 0 and since f has radial symmetry, it is easy to construct a bijective, "radial rearrangement" map Φ s : D → D that pushes m 0 onto m 0 + sf and can be written as Φ s = Id + α s (x)x on D, for some radial function α s whose derivative is bounded on D by sC ′′ for some constant C ′′ depending only on C and r.We can always extend α s into a C 1 , compactly supported function on D 2 with |α s | ≤ 1 2 , |α s | 1 ≤ sC ′′ .Applying [Ser23, Prop 4.2] (note that now, compared to the proof of Proposition 2.6, we are only able to use -and in fact only need -results where the vector field ψ is simply assumed to be C 1 , see in particular [Ser23, (4.6)]) to Φ s and changing variables as in the previous proof, we obtain (see also [Ser23,(4 but the number of points in D 2 is always ≤ N and the (fluctuations of the) global energy have exponential moments of order N (this "global law" follows from (2.9)).We thus get (2.26).
Claim A.8. For |t| ≤ βN, we have:
Proof. This follows from the analysis of [LS18]. We return to the notation of (2.5), (2.8) for partition functions and make them more explicit by writing: The claim follows from Hölder's inequality.
Claim A.9. For |t| ≤ N β 2 , we have: (A.17)
Proof. It follows by a scaling argument using the equivalent expression (1.2) for the joint law of the particles.
On the other hand, a Taylor expansion of (A.17 . Using the Cauchy-Schwarz inequality combined with (A.16) and (A.18) we prove Proposition A.6.

3.1. An auxiliary linear statistic. For z ∈ D, let log_z : x → log |z − x|. The value Pot_N(z) of the Coulomb gas potential at z corresponds to the fluctuations L_N(log_z) of the linear statistic associated to log_z, see (2.19).

Remark 4.3. If one considers the sequence of random measures with densities μ̃^γ_N = e^{γ√β Pot_N} / E e^{γ√β Pot_N}

Proof. The first point follows from combining [Ser23, Lemma 5.1], which bounds |m̃_s − m_s|_k for k = 1, 2 in terms of the norms of ψ (controlled in (A.1)) and m (controlled by assumption), and [Ser23, Lemma 4.9], which states a direct comparison of the partition functions in terms of m̃_s − m_s. To prove the second point, we use again [Ser23, Lemma 5.1], which bounds |m̃_s − m_s|_0 by O(s^2 ℓ^{−4}), and plug that estimate into the definition of E.

8 = 2 but rather |x| 2 −1 2 −
) gives (for |t| βN small): E e O(t+t 2 ) .Since the expression of h 0 (x) is not always |x| 2 −1 ζ(x) in general (compare (A.15) with (2.2)), we can write: E e tLN (h0) = E e (under the condition (2.16) and for t less than a constant depending only on β). 2. Local energy.The statement [AS21, (1.17)] involves the local energy, whose definition was recalled in (2.14), whereas in EnerPts we only consider the positive part of it.Moreover there is a difference in the scaling convention.So (with our apologies to the reader) it might be easier to read the corresponding statements in [Ser23, Sec.3] (which uses the same scaling convention as we do) namely:

Uniform control of fluctuations for smooth linear statistics.
As a byproduct, we obtain uniform bounds for the fluctuations of linear statistics for a class of C 2 test functions with, say, | • | 2 ≤ 1.This can be thought of as an analogue of Lemma 2.4 for test functions that are smoother.The basic idea is to use integration by parts to translate the question of bounding L N [f ] into a bound on Pot N times |f | 2 .Recall that r < 1 is fixed and let