Mesoscopic central limit theorem for the circular β-ensembles and applications

We give a simple proof of a central limit theorem for linear statistics of the circular β-ensembles which is valid at almost microscopic scales for test functions of class $C^{3+\alpha}$. Using a coupling introduced by Valkó and Virág [48], we deduce a central limit theorem for the Sineβ processes. We also discuss connections between our result and the theory of Gaussian multiplicative chaos (GMC). Based on the results of [37], we show that the exponential of the real (and imaginary) part of the logarithm of the characteristic polynomial of the circular β-ensembles, regularized at a small mesoscopic scale and suitably renormalized, converges to a GMC measure in the subcritical regime. This establishes that the leading order behavior of the extreme values of the logarithm of the characteristic polynomial is consistent with the predictions coming from the theory of log-correlated Gaussian fields.


Circular β-ensembles
The circular β-ensemble, or CβE_N, for N ∈ N is a point process 0 < θ_1 < · · · < θ_N < 2π with joint density
$$\frac{1}{(2\pi)^N}\,\frac{\Gamma(1+\beta/2)^N}{\Gamma(1+\beta N/2)}\prod_{1\le j<k\le N}\big|e^{i\theta_j}-e^{i\theta_k}\big|^{\beta},\qquad(1.1)$$
where Γ denotes the Gamma function. When β = 2, this ensemble corresponds to the eigenvalues of a random matrix sampled according to the Haar measure on the unitary group U(N). For general β > 0, these ensembles were introduced by Dyson [18] as a toy model for scattering matrices or evolution operators coming from quantum mechanics.
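For the β = 2 case just mentioned, the ensemble can be simulated directly: the eigenvalues of a Haar-distributed unitary matrix are obtained from the QR decomposition of a complex Ginibre matrix, after fixing the phase ambiguity of the QR factorization. This is an illustrative sketch, not part of the paper's arguments, and the function name is ours.

```python
import numpy as np

def cue_eigenangles(n, rng=None):
    """Sample the eigenvalue angles of a Haar-distributed unitary matrix
    (the beta = 2 circular ensemble) via QR of a complex Ginibre matrix."""
    rng = np.random.default_rng() if rng is None else rng
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(g)
    # Multiply column j of q by r_jj/|r_jj| so that q is Haar distributed.
    q = q * (np.diag(r) / np.abs(np.diag(r)))
    angles = np.angle(np.linalg.eigvals(q))      # angles in (-pi, pi]
    return np.sort(np.mod(angles, 2 * np.pi))    # sorted in [0, 2*pi)

angles = cue_eigenangles(50, rng=np.random.default_rng(0))
```

For general β > 0 one would instead use the CMV matrix model of [31], which is not sketched here.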
They also correspond to the Gibbs measure for N charged particles confined to the circle at temperature β^{−1} and interacting via the two-dimensional Coulomb law. For this reason, β-ensembles are also called log-gases. It is well known that (1.1) can also be realized by the eigenvalues of certain CMV random matrices [31], so we will refer to the random points (θ_j)_{j=1}^N as eigenvalues. We refer to Forrester [19, Chapter 2] for an in-depth introduction to circular β-ensembles. We define the empirical measure by $\mu_N=\sum_{j=1}^N\delta_{\theta_j}$ (in contrast to the usual convention, µ_N is not normalized to a probability measure) and its centered version by $\tilde\mu_N=\mu_N-N\frac{d\theta}{2\pi}$. In the following, a linear statistic is a random variable of the form
$$\int f\,d\tilde\mu_N=\sum_{j=1}^N f(\theta_j)-N\hat f_0,\qquad(1.2)$$
where f is a continuous function on T = R/2πZ and $\hat f_k$ for k ∈ Z denote the Fourier coefficients of f, see (1.6). By a mesoscopic linear statistic, we refer to the case where the test function in (1.2) depends on the dimension N in such a way that f(θ) = w(Lθ) for w ∈ C_c(R) and a sequence L = L(N) → +∞ with L(N)/N → 0 as N → ∞. In this regime, it is usual to consider test functions with compact support so that the random variable (1.2) depends on a vanishing fraction of the eigenvalues.
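To make the definitions concrete, the following small numerical sketch (names and the bump function are ours) evaluates the centered linear statistic (1.2) for a mesoscopic test function f(θ) = w(Lθ):

```python
import numpy as np

def centered_linear_statistic(angles, w, L=1.0):
    """sum_j w(L*theta_j) - N * (1/2pi) int_0^{2pi} w(L*theta) dtheta,
    i.e. the linear statistic against the centered empirical measure."""
    n = len(angles)
    theta = np.linspace(0.0, 2 * np.pi, 4096, endpoint=False)
    mean = np.mean(w(L * theta))  # Riemann-sum approximation of the mean
    return np.sum(w(L * angles)) - n * mean

def bump(x):
    """Smooth compactly supported bump, playing the role of w in C_c(R)."""
    out = np.zeros_like(x, dtype=float)
    m = np.abs(x) < 1.0
    out[m] = np.exp(-1.0 / (1.0 - x[m] ** 2))
    return out

# For i.i.d. uniform points (no repulsion) the fluctuations of this statistic
# grow with N/L, whereas for the CbetaE they remain of order one.
rng = np.random.default_rng(1)
poisson_angles = np.sort(rng.uniform(0, 2 * np.pi, 200))
s = centered_linear_statistic(poisson_angles, bump, L=10.0)
```

Comparing the empirical variance of `s` over many samples for uniform points versus CβE eigenvalues illustrates the variance reduction caused by the Coulomb repulsion.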

Central limit theorems
The main goal of this article is to study the fluctuations of linear statistics of the CβE_N for large N at small mesoscopic scales. Circular ensembles are technically easier to analyse than β-ensembles on R, so this is also an opportunity to give a comprehensive presentation of the method of loop equation introduced in [29]. We then discuss applications of our result to characteristic polynomials in section 1.4 and obtain a central limit theorem for the Sine_β processes in section 1.5.

Theorem 1.1. Let $w\in C_c^{3+\alpha}(\mathbb{R})$ for some α > 0. Let L(N) > 0 be a sequence such that L(N) → +∞ in such a way that $N^{-1}L(N)(\log N)^3\to0$ as N → ∞, and let $w_L(\cdot)=w(\cdot L)$. Then we have, for any β > 0,
$$\log\mathbb{E}^\beta_N\exp\Big(\int w_L\,d\tilde\mu_N\Big)\;\longrightarrow\;\frac{2}{\beta}\,\sigma^2(w)\qquad(1.3)$$
as N → ∞. The probabilistic interpretation of Theorem 1.1 is that, as N → ∞, $\int w_L\,d\tilde\mu_N$ converges to a centered Gaussian random variable, in law as well as in the sense of the Laplace transform.¹ The variance of the limiting Gaussian law is given in terms of the Sobolev seminorm
$$\sigma^2(w)=\frac12\int_{\mathbb{R}}|\xi|\,|\hat w(\xi)|^2\,d\xi,\qquad(1.4)$$
where $\hat w$ is the Fourier transform of the test function w, given by $\hat w(\xi)=\int_{\mathbb{R}}w(x)e^{-ix\xi}\,\frac{dx}{2\pi}$ for ξ ∈ R.
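Assuming the Szegő-type normalization $\sigma^2(f)=\sum_{k\ge1}k|\hat f_k|^2$ with $\hat f_k=\frac{1}{2\pi}\int_{\mathbb{T}} f(\theta)e^{-ik\theta}\,d\theta$ (our reading of (1.6); the helper name is ours), the variance functional can be evaluated numerically by FFT:

```python
import numpy as np

def sigma2(f_values):
    """Sobolev seminorm sum_{k>=1} k |f_k|^2 from samples of f on a uniform
    grid over [0, 2pi); f_k = (1/2pi) int f(theta) e^{-ik theta} dtheta."""
    n = len(f_values)
    fk = np.fft.fft(f_values) / n        # fk[k] approximates the k-th coefficient
    k = np.arange(1, n // 2)             # positive frequencies only
    return float(np.sum(k * np.abs(fk[1:n // 2]) ** 2))

theta = np.linspace(0.0, 2 * np.pi, 1024, endpoint=False)
v = sigma2(np.cos(theta))   # cos has Fourier coefficients 1/2 at k = +-1
```

For f(θ) = cos θ this gives σ²(f) = 1·(1/2)² = 1/4, which is a convenient sanity check of the normalization.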
The proof of Theorem 1.1 is given in section 2. Let us point out that we actually obtain a precise control of the error in the asymptotics (1.3) and that a straightforward modification of our arguments yields another proof of the CLT for global linear statistics.

Theorem 1.2. Let $w\in C^{3+\alpha}(\mathbb{T})$ for some α > 0 be a test function, possibly depending on N ∈ N, such that $\|w\|_{L^1(\mathbb{T})}$ is uniformly bounded. There exist $N_\beta\in\mathbb{N}$ and $C_{\beta,w}>0$ (which is given e.g. by (2.12)) such that for all N ≥ N_β,
$$\Big|\log\mathbb{E}^\beta_N\exp\Big(\int w\,d\tilde\mu_N\Big)-\frac{2}{\beta}\,\sigma^2(w)\Big|\le C_{\beta,w}\,\frac{(\log N)^2}{N},\qquad(1.5)$$
where for any f : T → R which is sufficiently smooth,
$$\sigma^2(f)=\sum_{k\ge1}k\,|\hat f_k|^2,\qquad \hat f_k=\frac{1}{2\pi}\int_{\mathbb{T}}f(\theta)e^{-ik\theta}\,d\theta.\qquad(1.6)$$

This CLT for linear statistics of the CβE_N was first obtained by Johansson [28] for general β > 0 using a clever change of variables. When β = 2, he also discovered a connection between (1.5) and the Strong Szegő Theorem, see [44, Chapter 6]. In fact, because of the rich structure of the circular unitary ensemble (CUE), there exist many other proofs of the CLT when β = 2; we refer e.g. to the survey [15]. Coming back to general β > 0, a CLT for trigonometric polynomials was obtained by Jiang-Matsumoto [27] using Jack functions. Then, Webb [53] generalized this result using Stein's method and obtained a rate of convergence. Our proof provides precise asymptotics for the Laplace transform of a linear statistic and relies on the method of loop equation, which originates in the work of Johansson [29] on the fluctuations of β-ensembles on R. Let us also point out that Johansson's method has been refined in [6,45,3] and applied to two-dimensional Coulomb gases in [2].

Remark 1.3. Let us comment on the optimality in Theorem 1.2. It is known that when β = 2, for any fixed test function w : T → R such that σ²(w) < +∞, we have as N → +∞,
$$\mathbb{E}^\beta_N\exp\Big(\int w\,d\tilde\mu_N\Big)\;\to\;\exp\Big(\frac{2}{\beta}\,\sigma^2(w)\Big).$$
This follows from the Strong Szegő Theorem, e.g. [44, Chapter 6]. We conjecture that this CLT holds under the optimal regularity condition σ²(w) < +∞ if and only if β ≤ 2. When β = 4, using the precise variance estimates from Jiang-Matsumoto [27], we give an example of a bounded function f : T → R with mean 0 such that σ²(f) < +∞, but the variance of $\sum_{j=1}^N f(\theta_j)$ under $\mathbb{E}^{\beta=4}_N$ diverges as N → +∞. Consider a sufficiently sparse sequence of Fourier coefficients $(\kappa_k)_{k\in\mathbb{Z}}$ and define the function f : T → R by $f(\theta)=\sum_{k\in\mathbb{Z}}\kappa_k e^{ik\theta}$. Observe that this function f ∈ L^∞(T) and σ²(f) < +∞.

By [27, Proposition 2 (iii)], there exists c > 0 such that, along the subsequence $N_n=\lceil\exp(\exp n)\rceil$, the variance of the linear statistic $\sum_{j=1}^N f(\theta_j)$ remains bounded away from the value predicted by the CLT. Hence, as claimed, the variance of the linear statistic $\sum_{j=1}^N f(\theta_j)$ does not have a finite limit as N → +∞.
For mesoscopic linear statistics, to our knowledge, Theorem 1.1 has only appeared for β = 2, in a paper of Soshnikov [46]. Soshnikov's method is very different from ours: it relies on the method of moments and does not yield the convergence of the Laplace transform of a linear statistic as in Theorem 1.1. For β-ensembles on R, the mesoscopic CLT was first obtained in [10,35] when β = 2. The idea of combining the loop equation with rigidity estimates to prove a mesoscopic CLT for Gaussian ensembles on R valid for general β > 0 originates in the work of Bourgade et al. [9, Section 5]. This CLT has been generalized to other potentials by Bekerman and Lodhia [3] using a method of moments based on higher order loop equations. This means that they obtain the asymptotics of $\mathbb{E}^\beta_N\big[\big(\int w_L\,d\tilde\mu_N\big)^n\big]$ with L(N) = N^δ and 0 < δ < 1 by an induction procedure on n ≥ 1 which relies on the loop equation and eigenvalue rigidity. Note that the rigidity estimates from [8] which are used in [9,3] are weaker than those of Section 1.3 and would not allow one to control directly the Laplace transform of a linear statistic down to arbitrary mesoscopic scales. We need such control for the applications to the characteristic polynomial of the circular β-ensembles and Gaussian multiplicative chaos discussed in Section 1.4. We manage to obtain (optimal) rigidity estimates by studying moderate deviations for the maximum of the eigenvalue counting function in Section 4. In the next section, we present consequences for the concentration of general eigenvalue statistics and eigenvalue rigidity which we believe are of general interest.

Concentration and eigenvalues' rigidity
For any function w ∈ C(T), we define a new biased probability measure:
$$d\mathbb{P}^\beta_{N,w}\;\propto\;\exp\Big(\int w\,d\mu_N\Big)\,d\mathbb{P}^\beta_N.\qquad(1.7)$$

Proposition 1.4. Let w ∈ C¹(T) and suppose that $\|w\|_{L^1(\mathbb{T})}\le\eta/\sqrt2$, where η is allowed to depend on N ∈ N. There exists N_β ∈ N such that the stated concentration bound holds for all fixed n ∈ N, all N ≥ N_β and any R > 0 (possibly depending on N ∈ N as well).

The proof of Proposition 1.4 will be given in section 4. Moreover, we immediately deduce from Lemma 4.2 below the following moderate deviation estimates for any N ≥ N_β and any R > 0 (possibly depending on N). This provides a precise control of how evenly spaced the eigenvalues of the CβE_N are; this property is usually called eigenvalue rigidity. In the next section, we explain how to recover optimal rigidity estimates, in the sense that we find the leading order of the maximal fluctuations of θ_k with the correct constant; see Corollary 1.10. These optimal estimates are obtained through a connection between the characteristic polynomial and the theory of Gaussian multiplicative chaos that we recall below.

Subcritical Gaussian multiplicative chaos
Let us discuss applications of Theorem 1.1 within GMC theory. Let D = {z ∈ C : |z| < 1} and define, for any N ∈ N,
$$P_N(z)=\prod_{j=1}^N\big(1-ze^{-i\theta_j}\big).$$
Up to a phase, P_N corresponds to the characteristic polynomial of the CβE_N. Our goal is to investigate the asymptotic behavior of P_N(z) for |z| = 1 as a random function. First, let us observe that for any 0 < r < 1 and ϑ ∈ T,
$$\log|P_N(re^{i\vartheta})|=-\int\phi_r(\cdot-\vartheta)\,d\tilde\mu_N,\qquad \phi_r(\theta)=\log|1-re^{i\theta}|^{-1}.$$
Hence, log|P_N(re^{iϑ})| is a linear statistic, and it follows from Theorem 1.2 that, in the sense of finite dimensional distributions as N → ∞,
$$\log|P_N(re^{i\vartheta})|\;\Rightarrow\;\sqrt{\tfrac{2}{\beta}}\,\mathrm{G}(re^{i\vartheta}),$$
where G is a centered Gaussian process defined on D with covariance structure
$$\mathbb{E}\big[\mathrm{G}(z)\,\mathrm{G}(z')\big]=\tfrac12\log\big|1-z\bar z'\big|^{-1}.\qquad(1.10)$$
Indeed, according to formula (1.6), the covariance can be explicitly computed for the test functions $f_{re^{i\vartheta}}=\phi_r(\cdot-\vartheta)$ using the Fourier series $\phi_r(\theta)=\sum_{k=1}^{\infty}\frac{r^k}{k}\cos(k\theta)$, which converges absolutely for r < 1; see (3.4) below. We can define the boundary values of the Gaussian process G as a random generalized function on T. Then, according to formula (1.10), this random field, which is still denoted by G, is a log-correlated Gaussian process. Actually, this process has the same law as π/2 times the restriction of the two-dimensional Gaussian free field to T, see [17, Proposition 1.4], so we call it the GFF on T. Moreover, one can show that the function ϑ ∈ T → log|P_N(e^{iϑ})| converges in law to the random generalized function $\sqrt{2/\beta}\,\mathrm{G}$ in the Sobolev space H^{−δ}(T) for any δ > 0, see [25].
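The representation of log|P_N| as a sum over eigenvalues can be checked numerically. The sketch below (function name ours) assumes the normalization P_N(z) = ∏_j(1 − ze^{−iθ_j}) used above; for equispaced angles one has the exact identity P_N(z) = 1 − z^N, which gives a deterministic sanity check.

```python
import numpy as np

def log_abs_char_poly(angles, r, thetas):
    """log |P_N(r e^{i theta})| = sum_j log |1 - r e^{i theta} e^{-i theta_j}|,
    evaluated at radius r < 1 where no zeros are hit."""
    z = r * np.exp(1j * np.asarray(thetas))[:, None]
    return np.sum(np.log(np.abs(1.0 - z * np.exp(-1j * np.asarray(angles))[None, :])), axis=1)

# Sanity check: for equispaced angles, P_N(z) = 1 - z^N.
angles = 2 * np.pi * np.arange(8) / 8
val = log_abs_char_poly(angles, 0.5, [0.0])[0]
```

Plotting θ → log|P_N(re^{iθ})| for r close to 1 and CβE-sampled angles illustrates the rough, log-correlated profile discussed in this section.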
Log-correlated fields form a class of stochastic processes which describe the fluctuations of key observables in many different models related to two-dimensional random geometry, turbulence, finance, etc. One of the key universal features of log-correlated fields is their so-called multi-fractal spectrum, which can be encoded by a family of random measures called GMC measures. Within GMC theory, these measures correspond to the exponential of a log-correlated field, which is defined by a suitable renormalization procedure. For instance, using the results of [42] or [4], it is possible to define²
$$\mu^\gamma_{\mathrm{G}}(d\vartheta)=\lim_{r\to1}\frac{e^{\gamma\mathrm{G}(re^{i\vartheta})}}{\mathbb{E}\,e^{\gamma\mathrm{G}(re^{i\vartheta})}}\,d\vartheta.\qquad(1.11)$$
The measure $\mu^\gamma_{\mathrm{G}}$ exists for all γ ≥ 0, it is continuous in the parameter γ, and it is non-zero if and only if γ < 2; this is called the subcritical regime.³ The random measure $\mu^\gamma_{\mathrm{G}}$ lives on the set of γ-thick points of G, and this set is known to have fractal dimension $(1-\gamma^2/4)_+$. In particular, if γ* = 2 is the critical value, the fact that the measure $\mu^\gamma_{\mathrm{G}}$ is non-zero if and only if γ < γ* implies that, in probability,
$$\max_{\vartheta\in\mathbb{T}}\mathrm{G}(re^{i\vartheta})=(1+o(1))\log\frac{1}{1-r}\qquad\text{as }r\to1.$$
For a non-Gaussian log-correlated field, it is also possible to construct its GMC measures in the subcritical regime. This has been used to describe asymptotics of powers of the absolute value of the characteristic polynomials of certain ensembles of random matrices, see e.g. Webb and co-authors [52,5] for an application to the circular unitary ensemble (β = 2) and to a class of Hermitian random matrices, in the so-called L²-regime. Based on the approach from Berestycki [4], a general construction scheme which covers the whole subcritical regime was given in [37] and then refined in our recent work [13]. This method has been applied to (unitary invariant) Hermitian random matrices [13], as well as to the characteristic polynomial of the Ginibre ensemble [36].
EJP 26 (2021), paper 7.
A similar approach has also been applied to study the Riemann ζ function [43] and cover times of planar Brownian motion [26]. Using the method from [37] and relying on the determinantal structure of the circular ensemble when β = 2 to obtain the necessary asymptotics, Nikula-Saksman-Webb proved in [40, Theorem 1.1] that for any 0 ≤ γ < 2,
$$\frac{|P_N(e^{i\vartheta})|^{\gamma}}{\mathbb{E}^{\beta=2}_N\big[|P_N(e^{i\vartheta})|^{\gamma}\big]}\,d\vartheta\;\longrightarrow\;\mu^\gamma_{\mathrm{G}}(d\vartheta)\qquad(1.14)$$
in distribution as N → +∞. It is a very interesting and challenging problem to generalize (1.14) to all β > 0. In the following, we provide a first step in this direction, which consists in constructing the GMC measures associated with a small mesoscopic regularization of |P_N|. Namely, by adapting the proof of Theorem 1.1, we are able to obtain the following result.

Theorem 1.5. Let $(r_N)_{N\in\mathbb{N}}$ be a sequence of radii converging to 1 at a mesoscopic rate and, by analogy with (1.11), define for any γ ∈ R the random measure
$$\mu^\gamma_N(d\vartheta)=\frac{|P_N(r_Ne^{i\vartheta})|^{\gamma}}{\mathbb{E}^\beta_N\big[|P_N(r_Ne^{i\vartheta})|^{\gamma}\big]}\,d\vartheta.\qquad(1.15)$$
For any $|\gamma|<\sqrt{2\beta}$ (i.e. in the subcritical regime), $\mu^\gamma_N$ converges in law as N → ∞ to the GMC measure $\mu^{\tilde\gamma}_{\mathrm{G}}$ associated with the GFF on T, with parameter $\tilde\gamma=\gamma\sqrt{2/\beta}$.

² There exist other equivalent ways to define the GMC measures $\mu^\gamma_{\mathrm{G}}$ which we do not discuss here. We refer to [41] for a comprehensive survey of GMC theory.

It is known that the random measure $\mu^\gamma_N$ also lives on the thick points of the characteristic polynomial P_N, see e.g. [13, section 3] for a more in-depth discussion. By analogy with (1.12), these thick points are the atypical points where |P_N| takes extremely large values. Concretely, for any γ > 0, we say that θ ∈ T is γ-thick if the value of log|P_N(e^{iθ})| is at least $\gamma\,\mathbb{E}^\beta_N\big[(\log|P_N(e^{i\theta})|)^2\big]=\frac{\gamma}{\beta}\log N+O(1)$. In section 3.3, we explain how to deduce from Theorem 1.5 that the masses of the sets of thick points are given according to the predictions for log-correlated Gaussian fields. Namely, since the convergence of Theorem 1.5 holds at arbitrarily small mesoscopic scales, this leads to accurate lower bounds for the Lebesgue measure of the set of γ-thick points.
Hence, this new approach based on GMC theory allows one to bypass the usual second moment method, which originates in the study of random energy models (see e.g. [33]). We obtain the following results.
Proposition 1.6. For any γ > 0, let $\mathcal{T}^\gamma_N=\{\theta\in\mathbb{T}:|P_N(e^{i\theta})|\ge N^{\gamma/\beta}\}$ and let $|\mathcal{T}^\gamma_N|$ be the Lebesgue measure of the set $\mathcal{T}^\gamma_N$. Then, for any 0 < γ < γ* = √(2β), we have in probability as N → +∞,
$$\frac{\log|\mathcal{T}^\gamma_N|}{\log N}\;\longrightarrow\;-\frac{\gamma^2}{2\beta}.\qquad(1.17)$$
The interpretation of Proposition 1.6 is that the multi-fractal spectrum of the sets of γ-thick points of the CβE_N characteristic polynomial is given by the function $\gamma\mapsto(1-\gamma^2/2\beta)_+$ for γ ≥ 0. This is in accordance with the behavior of Gaussian log-correlated fields. We also obtain the limit of the so-called free energy, which exhibits an interesting transition at the critical value γ* = √(2β). For log-correlated fields, the fact that the free energy becomes linear in the super-critical regime (γ > γ*) is usually called freezing. In particular, this freezing phenomenon for the CβE_N characteristic polynomial was conjectured by Fyodorov-Keating [20] and it plays a crucial role in predicting the precise asymptotic behavior of |P_N|.
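The displayed limit of the free energy was lost in this version of the text. Under the normalization $Z_N(\gamma)=\frac{N}{2\pi}\int_{\mathbb{T}}|P_N(e^{i\theta})|^{\gamma}\,d\theta$ (one common convention; an assumption on our part), the freezing statement described above reads:

```latex
\frac{\log Z_N(\gamma)}{\log N}\;\xrightarrow[N\to\infty]{}\;
\begin{cases}
1+\dfrac{\gamma^{2}}{2\beta}, & 0\le\gamma\le\gamma_*=\sqrt{2\beta},\\[2ex]
\gamma\sqrt{\dfrac{2}{\beta}}, & \gamma\ge\gamma_*.
\end{cases}
```

Both branches agree at γ = γ* (where they equal 2), and the linear growth beyond γ* is precisely the freezing phenomenon.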
We can also obtain analogous results for the imaginary part of the logarithm of the characteristic polynomial of the CβE_N. Let
$$\operatorname{Im}\log P_N(z)=\sum_{j=1}^N\operatorname{Im}\log\big(1-ze^{-i\theta_j}\big),$$
where log(·) denotes the principal branch of the logarithm, so that the function log(1−z) is analytic for z ∈ D. We also let
$$\Psi_N(\vartheta)=\sum_{j=1}^N\psi(\vartheta-\theta_j),\qquad \psi(\theta)=\operatorname{Im}\log(1-e^{i\theta})=\frac{\theta-\pi}{2}\ \text{ for all }\theta\in(0,2\pi).$$
Hence, Ψ_N is directly related to the eigenvalue counting function; see (3.35) below. This connection is a crucial motivation to study the imaginary part of the logarithm of the characteristic polynomial, especially its extreme values. It turns out that the GMC measures associated with Ψ_N and |P_N| have the same law.
These results hold for any γ < γ* and in turn imply optimal rigidity estimates for the CβE_N eigenvalues.

CLT for the Sine β point processes
The Sine_β processes describe the bulk scaling limits of the eigenvalues of β-ensembles. This family of translation invariant point processes on R was first introduced independently by Killip-Stoiciu [32] as the scaling limit of the CβE_N and by Valkó-Virág [49] as that of the Gaussian β-ensembles. For general β > 0, universality of the Sine_β processes in the bulk of β-ensembles on R was obtained by Bourgade-Erdős-Yau [8] for a general class of one-cut regular potentials by coupling two different ensembles using the Dyson Brownian motion. Our proof of Theorem 1.12 also relies on a coupling, from Valkó-Virág [48], between the Sine_β and CβE_N point processes. The Sine_β process is usually defined through its counting function, which satisfies a system of stochastic differential equations [32,49]. Recently, Valkó-Virág [51] introduced an alternative characterization as the eigenvalues of a stochastic differential operator. It turns out that the CβE_N also corresponds to the eigenvalues of an operator of the same kind as that of Sine_β and that it is possible to couple these two operators in such a way that their eigenvalues are close. This coupling was studied in detail by Valkó-Virág [48], and they obtained the following result.

Theorem 1.11. Fix β > 0 and recall that 0 < θ_1 < · · · < θ_N < 2π denote the eigenvalues of the CβE_N. Let us extend this configuration periodically by setting $\theta_{k+\ell N}=\theta_k+2\pi\ell$ for all k ∈ [N] and ℓ ∈ Z. By [48, Corollary 2], there exists a coupling P of the CβE_N with the Sine_β process $(\lambda_k)_{k\in\mathbb{Z}}$ such that for any ε > 0, there exists a random integer $N_\varepsilon$ such that for all $N\ge N_\varepsilon$, the rescaled eigenvalues $\frac{N}{2\pi}\theta_k$ and the points λ_k are uniformly close in the sense of [48, Corollary 2].

As a consequence of this coupling and Theorem 1.1, we easily obtain the following result. The details of the proof will be given in section 5.

Theorem 1.12. Let $(\lambda_k)_{k\in\mathbb{Z}}$ be a configuration of the Sine_β process and let $w\in C^{3+\alpha}_c(\mathbb{R})$ for some α > 0. Then, as L → +∞,
$$\sum_{k\in\mathbb{Z}}w(\lambda_k/L)-\frac{L}{2\pi}\int_{\mathbb{R}}w(x)\,dx$$
converges to a centered Gaussian random variable. The convergence holds in distribution and the limiting variance (1.4) is the same as in Theorem 1.1.
Let us mention that, for β = 2, another coupling between the CUE and Sine₂ existed prior to [51,48], based on virtual isometries [7]. Moreover, it is possible to obtain Theorem 1.12 directly by using the determinantal structure of the Sine₂ process, see Kac [30] and Soshnikov [46]. Finally, it should be mentioned that there have recently been several developments in the study of Sine_β for general β > 0. Using the SDE representation, large deviation estimates for the number of eigenvalues in boxes were obtained in [50,23,24]. The rigidity property for Sine_β in the sense of Ghosh-Peres was proved by Chhaibi-Najnudel [12], and Holcomb-Paquette [22] computed the leading order of the maximum of the eigenvalue counting function. Finally, Leblé [38] recently gave an alternative proof of Theorem 1.12 for test functions of class $C^4_c(\mathbb{R})$ which relies on the DLR equations for the Sine_β process established by Dereudre-Hardy-Leblé-Maïda [16].

Organization of the paper
This paper also aims at giving an exposition of some basic concepts in the study of β-ensembles in a simple setting: loop equations and the connections between characteristic polynomials and GMC theory. We expect that the arguments presented here can be applied more generally.
In section 2, we prove our main results, Theorems 1.1 and 1.2, using the method of loop equation which we review in section 2.1. In section 3, we discuss applications from the perspective of Gaussian multiplicative chaos. Specifically, in sections 3.1 and 3.2, we explain how to modify the proof of Theorem 1.1 in order to obtain Theorem 1.5 and Theorem 1.7 respectively. Then, we give the proofs of Propositions 1.6 and 1.8 in section 3.3. In section 4, we obtain rigidity results for the circular β-ensemble by studying moderate deviations of the eigenvalue counting function. In particular, we prove Proposition 1.4, which is a key input in our proof of Theorem 1.1. Finally, in section 5, we give the short proof of Theorem 1.12.
2 Proof of Theorem 1.1

Loop equation
The method of loop equation relies on the following formula, see [29, (2.16)], to compute the Laplace transform of the (centered) linear statistic associated with a bounded test function w : T → R. According to (1.7), it holds for any t > 0,
$$\frac{d}{dt}\log\mathbb{E}^\beta_N\exp\Big(t\int w\,d\tilde\mu_N\Big)=\mathbb{E}^\beta_{N,tw}\Big[\int w\,d\tilde\mu_N\Big].\qquad(2.1)$$
The idea from [29] is to compute the RHS of (2.1) by an integration by parts using the explicit density (1.1). Let us record this lemma for an arbitrary test function g; we refer to [29, (2.18)] for the analogous formula for β-ensembles on R.

Lemma 2.1 (Loop equation).
Let w ∈ C¹(T) and let $\mathbb{P}^\beta_{N,w}$ be as in (1.7). For our application to Theorem 1.1, it turns out that one takes g = Uw, the Hilbert transform of w. This choice is motivated by the proof of Lemma 2.3. Namely, the loop equation expresses $\mathbb{E}^\beta_{N,tw}\big[\int(-\mathcal{U}g)\,d\tilde\mu_N\big]$ up to a random variable W_N which is expected to be of order 1. Thus, to relate this formula to (2.1), we are led to choose g so that w = −Ug. The Hilbert transform U on L²(T) is the bounded operator defined in such a way that for any k ∈ Z,
$$\widehat{(\mathcal{U}f)}_k=-i\,\mathrm{sgn}(k)\,\hat f_k,\qquad(2.2)$$
where sgn(k) = ±1 if k ∈ Z_± and sgn(0) = 0. This operator is invertible on $L^2_0(\mathbb{T})$ with U^{−1} = −U. Further properties of the Hilbert transform that we shall use in the proofs are recorded in the next proposition.
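The Fourier multiplier definition of U translates directly into an FFT computation. The sketch below is ours; it uses the sign convention $\widehat{(\mathcal{U}f)}_k=-i\,\mathrm{sgn}(k)\hat f_k$, which satisfies U² = −Id on mean-zero functions, as required by U^{−1} = −U.

```python
import numpy as np

def hilbert_transform_T(f_values):
    """Circular Hilbert transform: multiply the k-th Fourier coefficient
    by -i*sgn(k) (with sgn(0) = 0), implemented with the FFT."""
    n = len(f_values)
    fk = np.fft.fft(f_values)
    k = np.fft.fftfreq(n, d=1.0 / n)   # integer frequencies 0, 1, ..., -2, -1
    return np.real(np.fft.ifft(-1j * np.sign(k) * fk))

theta = np.linspace(0.0, 2 * np.pi, 256, endpoint=False)
u = hilbert_transform_T(np.cos(theta))
```

Under this convention U maps cos to sin, and applying U twice returns −cos, consistent with U^{−1} = −U.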

Proposition 2.2.
The Hilbert transform has the integral representation
$$\mathcal{U}f(x)=\frac{1}{2\pi}\,\mathrm{p.v.}\!\int_{\mathbb{T}}f(x-t)\cot(t/2)\,dt.\qquad(2.13)$$
In particular, this implies that if f ∈ C¹(T), the function Uf is absolutely continuous on T and $\|\mathcal{U}f\|_\infty\le\sqrt{2\pi}\,\|f'\|_{L^2(\mathbb{T})}$.
These basic properties are easy to verify, so we skip the proof of Proposition 2.2. Our CLT follows from the next lemma and from technical estimates on the random variable (2.4) which we discuss in sections 2.2 and 2.3.

Lemma 2.3. Let w ∈ C²(T) be a function which may depend on N ∈ N, let g = Uw and define the error term δ_N for any t > 0 as in (2.4).

Proof. The result of Lemma 2.3 is classical; we give a quick proof for completeness. By Parseval's theorem and (2.2), observe that, according to formula (1.6), we have (2.6). Since Ug = −w by the definition of the Hilbert transform, by (2.5)-(2.6) we conclude that (2.7) holds. By (2.1), integrating the LHS of (2.7) with respect to t ∈ (0, 1] completes the proof.
Hence, in order to prove Theorems 1.1 and 1.2, we have to estimate the error term δ N from Lemma 2.3 in the mesoscopic, respectively global, regimes. This will be done carefully in the next two sections.

Estimates in the global regime: Proof of Theorem 1.2
In this section, we use our rigidity estimates from Proposition 1.4 to estimate the error term in Lemma 2.3.

Proposition 2.4.
Let w ∈ C^{3+α}(T) for some α > 0 be a function which may depend on N ∈ N in such a way that $\|w\|_{L^1(\mathbb{T})}\le c$ for some fixed c ≥ 1, and let g = Uw. Let N_β ∈ N be as in Proposition 1.4 and let δ_N(w) be as in Lemma 2.3. There exists a constant C_β > 0, which only depends on β > 0 and c > 0, such that the bound (2.8) holds for all N ≥ N_β and t ∈ [0, 1], with R_0 given by (2.10).

Proof. By an explicit computation, we verify that R_0 is given by (2.10). According to (2.4), using the triangle inequality and collecting all the terms, we obtain that there exists a universal constant C > 0 such that the estimate (2.11) holds for all N ≥ N_β and t ∈ [0, 1]. Combining (2.11) with Proposition 1.4, we obtain (2.8).
We are now ready to give the proof of Theorem 1.2.
Proof of Theorem 1.2. Since we assume that w ∈ C^{3+α}(T), by Proposition 2.2 we have g ∈ C³(T) and the terms (2.9) satisfy the required bounds for some universal constant C > 0. In order to estimate R_0, observe that by Taylor's theorem, the integrand in (2.10) is uniformly bounded by $\|g''\|_\infty$, so that $R_0(w)\le C\|g''\|_\infty$. Combining these estimates with Lemma 2.3, we obtain (1.5) with $C_\beta=C\beta^{3/2}(1+\beta^{-3/2})$ for some universal constant C > 0. This completes the proof.

Estimates in the mesoscopic regime: Proof of Theorem 1.1
In comparison to the argument given in the previous section, to obtain a CLT at small mesoscopic scales, we need more precise estimates for the error δ N (w L ), see (2.8), especially for R 0 .
Proof. First of all, observe that by a change of variables, for any x ∈ [−π, π], formula (2.14) holds for k = 0; the other cases follow in a similar way by observing that, according to Proposition 2.2, the function g_L ∈ C³(T) with $g_L^{(k)}=\mathcal{U}\big(w_L^{(k)}\big)$ for k = 1, 2, 3. In order to obtain the estimate (2.15), we use that for any 0 < α ≤ 1, there exist universal constants c, c_α > 0 such that the bounds (2.16) hold. For the first estimate, an explicit computation gives the leading behavior with a uniform error; on the other hand, this yields the pointwise bound for any x ∈ [−π, π]. Now, using formula (2.14) and the estimate (2.16), we obtain the bound (2.15) by splitting the last integral in two parts.
In order to identify the asymptotic variance in Theorem 1.1, we also need the following easy consequence of Proposition 2.2.
Proof. By (2.13), it is immediate to verify that for any x ∈ R, $\mathcal{U}_Lf(x)\to Hf(x)$ as L → ∞, where H denotes the Hilbert transform on R. Now, by formula (2.6) and Proposition 2.2, since w ∈ L¹(R) and, by (2.16), the functions $\mathcal{U}_Lw$ are uniformly bounded in L and x ∈ R, we conclude by the dominated convergence theorem that (2.17) holds as L → ∞. It is well known that if $w\in C^1_c(\mathbb{R})$, then the RHS of (2.17) equals $\|w\|^2_{H^{1/2}(\mathbb{R})}$, which is also given by (1.4).
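The convergence σ²(w_L) → ‖w‖²_{H^{1/2}(R)} can be illustrated numerically. The sketch below (ours) computes σ²(w_L) by FFT for a Gaussian test function, for which, with the Fourier normalization $\hat w(\xi)=\int w(x)e^{-ix\xi}\frac{dx}{2\pi}$ assumed above, the limit $\frac12\int_{\mathbb{R}}|\xi||\hat w(\xi)|^2\,d\xi$ equals $\frac{1}{4\pi}$.

```python
import numpy as np

def sigma2_meso(w, L, n=1 << 14):
    """sigma^2(w_L) = sum_{k>=1} k |c_k|^2 for w_L(theta) = w(L*theta) on the
    torus, with c_k the k-th Fourier coefficient of w_L, computed by FFT."""
    theta = np.linspace(-np.pi, np.pi, n, endpoint=False)
    ck = np.fft.fft(w(L * theta)) / n       # |ck[k]| = |k-th Fourier coefficient|
    k = np.arange(1, n // 2)
    return float(np.sum(k * np.abs(ck[1:n // 2]) ** 2))

w = lambda x: np.exp(-x ** 2)      # hat w(xi) = e^{-xi^2/4} / (2 sqrt(pi))
limit = 1.0 / (4.0 * np.pi)        # (1/2) int_R |xi| |hat w(xi)|^2 dxi
vals = [sigma2_meso(w, L) for L in (8.0, 16.0, 32.0)]
```

As L grows, the Riemann sum over the frequencies k/L converges to the continuum integral, so `vals` approaches `limit`.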
As in section 2.2, our proof of Theorem 1.1 relies on Lemma 2.3, Proposition 2.4 and the following proposition, which gives a precise control of the term R_0 in (2.8).

⁷ Recall that $\frac{d}{dt}\log|\sin t|=\cot t$ for t ∈ T, t ≠ 0, and that supp(f) ⊆ [−π/2, π/2].
Proposition 2.7. Let w be any function such that g = Uw ∈ C³(T) and let R_0(w) be given by (2.10). Then the bound (2.18) holds for any 0 < ε ≤ 1.

Proof. This is just a computation. By (2.10) and the triangle inequality, R_0 ≤ R_3 + R_4 + R_5, so it suffices to estimate each integral individually. Since $|\sin\vartheta|\ge\frac{2}{\pi}|\vartheta|$ for all |ϑ| ≤ π/2, we obtain (2.19). Similarly, we have $R_4(g)\le\frac{\pi^2}{2}\|g\|_{L^1(\mathbb{T})}$ (2.20). In order to estimate R_5, since we assume that g ∈ C³(T), Taylor's theorem implies a third-order expansion for any x_1, x_2 ∈ T with |x_1 − x_2| ≤ ε. Since g ∈ C³(T), the function M_g is also continuous on T and the previous bound yields (2.21). Collecting the estimates (2.19)-(2.21), since R_0 ≤ R_3 + R_4 + R_5, we obtain (2.18).
We are now ready to give the proof of our main result.
Proof of Theorem 1.1. Let the sequence L = L(N ) be as in the statement of the Theorem.
Estimate for R_0. By Proposition 2.7 applied with ε = 1/L, since the relevant norms of g_L are controlled by those of w_L, we obtain the corresponding bound. Observe that by (2.14) and a change of variables, the main term can be rewritten as an integral over R. Thus, since $\|w_L''\|_\infty=L^2\|w''\|_\infty$, using the estimate (2.15), this shows that the remaining contribution is controlled by an integral which, by the estimate (2.16), is O(log L). Therefore, there is a constant C_w > 0 so that
$$R_0(w_L)\le C_w\,L\log(\pi L).\qquad(2.23)$$
Estimate for R_1. By Proposition 2.2, it is easy to check that if w ∈ C²(T) and g = Uw, the Cauchy-Schwarz inequality yields the required bound, and we obtain the estimate for R_1.

Estimate for R_2. Similarly, by Proposition 2.2, we check that if w ∈ C²(T) and g = Uw, the analogous bound holds. Since we assume that L ≤ N, this shows that for some universal constant C > 0, the estimate (2.26) holds. Hence, in the regime where $N^{-1}L(N)(\log N)^3\to0$ as N → +∞, the RHS of (2.26) converges to 0. Moreover, since $\sigma^2(w_L)\to\|w\|^2_{H^{1/2}(\mathbb{R})}$ by Corollary 2.6, this completes the proof.

GMC applications

Proof of Theorem 1.5
Recall that we let $\phi_r(\vartheta)=\log|1-re^{i\vartheta}|^{-1}$ for any 0 ≤ r < 1, so that for any ϑ ∈ T, $\log|P_N(re^{i\vartheta})|$ is a smooth linear statistic with a test function which depends on N ∈ N. The proof of Theorem 1.5 relies directly on [37, Theorem 1.7]. Recall that G denotes the GFF on T and let $P_r(\theta)=1+2\sum_{k=1}^{+\infty}r^k\cos(k\theta)$ be the Poisson kernel for T. G is a Gaussian log-correlated field on T whose covariance is given by (1.10), and for any 0 ≤ r < 1 and all ϑ ∈ T, the interior values of G are recovered by harmonic extension:
$$\mathrm{G}(re^{i\vartheta})=\frac{1}{2\pi}\int_{\mathbb{T}}P_r(\vartheta-\theta)\,\mathrm{G}(e^{i\theta})\,d\theta.$$
Let $\mu^\gamma_N$ be as in (1.15). In order to apply [37, Theorem 1.7], we need to establish the following mod-Gaussian asymptotics: for any β > 0 and any n ∈ N, the expansion (3.2) holds uniformly for all ϑ ∈ T^n, 0 < r_1, ..., r_n ≤ r_N and γ in compact subsets of R^n. This implies the corresponding convergence in distribution as N → +∞ for any $|\gamma|<\sqrt{2\beta}$ and any function f ∈ L¹(T). From this result, one can infer that for any $|\gamma|<\sqrt{2\beta}$, the random measure $\mu^\gamma_N$ converges in law, with respect to the topology of weak convergence, to the GMC measure $\mu^{\tilde\gamma}_{\mathrm{G}}$, see e.g. [4, Section 6].
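The two Fourier expansions used here, $\phi_r(\theta)=\sum_{k\ge1}\frac{r^k}{k}\cos(k\theta)$ (the "Cs" appearing in this version of the text is a garbled cosine) and the closed form of the Poisson kernel, are easy to confirm numerically; this check is ours:

```python
import numpy as np

theta = np.linspace(0.0, 2 * np.pi, 512, endpoint=False)
r = 0.9
k = np.arange(1, 2000)[:, None]   # truncation; r^2000 is negligible

# phi_r(theta) = -log|1 - r e^{i theta}| = sum_{k>=1} (r^k / k) cos(k theta)
phi_exact = -np.log(np.abs(1.0 - r * np.exp(1j * theta)))
phi_series = np.sum((r ** k / k) * np.cos(k * theta), axis=0)

# Poisson kernel: 1 + 2 sum_{k>=1} r^k cos(k theta) = (1-r^2)/(1+r^2-2r cos theta)
pois_closed = (1 - r ** 2) / (1 + r ** 2 - 2 * r * np.cos(theta))
pois_series = 1 + 2 * np.sum((r ** k) * np.cos(k * theta), axis=0)
```

Both series converge absolutely for r < 1, so the truncation error is exponentially small.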
In order to obtain the mod-Gaussian asymptotics (3.2) and prove Theorem 1.5, let us observe that the test functions $\phi_r(\cdot-\vartheta_\ell)$ behave for 0 < r < r_N like smooth mesoscopic linear statistics, and we can therefore adapt our proof of Theorem 1.2. In order to control the error terms in Proposition 2.4, we need the following lemmas, whose proofs follow from routine computations; for completeness, the details are provided in appendix B.

Lemma 3.1. There exists a universal constant C > 1 such that the following estimates hold for any 0 ≤ r < 1:
$$\|\psi_r\|_\infty,\ \|\psi_r\|_{L^1(\mathbb{T})}\le C,\qquad(3.6)$$
$$\|\phi_r\|_\infty,\ \|\phi_r\|_{L^1(\mathbb{T})}\le-2\log(1-r)+C,\qquad(3.7)$$
$$\|\psi_r'\|_\infty,\ \|\phi_r'\|_\infty\le\frac{1}{1-r},\qquad(3.8)$$
$$\|\phi_r''\|_{L^1(\mathbb{T})},\ \|\psi_r''\|_{L^1(\mathbb{T})}\le\frac{C}{1-r}.\qquad(3.9)$$
Similarly, $\|w'\|_\infty\le|\boldsymbol\gamma|(1-r_N)^{-1}$ by (3.8) and, with a possibly different constant C_γ, we deduce the required bounds from (3.6) and (3.9). Since $\eta=|\boldsymbol\gamma|(3\log N+C)$, this shows that the first term on the RHS of (3.11) is negligible compared to the third term, and we obtain the improved bound for all N ≥ N_β. According to Lemma 2.3, since $(\hat\phi_r)_0=0$ for any r ∈ [0, 1), this proves that, uniformly for all ϑ ∈ T^n, 0 < r_1, ..., r_n ≤ r_N and γ ∈ R^n, the estimate (3.13) holds for all N ≥ N_β. Since $w=\sum_{\ell=1}^n\gamma_\ell\phi_{r_\ell}(\cdot-\vartheta_\ell)$ and the RHS of (3.13) converges to 0 as N → +∞, by formula (3.4) we obtain the asymptotics (3.2). Whence, we deduce from [37, Theorem 1.7] that for any $|\gamma|<\sqrt{2\beta}$, the random measure $\mu^\gamma_N$ converges in law, with respect to the topology of weak convergence, to the GMC measure $\mu^{\tilde\gamma}_{\mathrm{G}}$.

Proof of Theorem 1.7
The proof of Theorem 1.7 is almost identical to that of Theorem 1.5 in the previous section, so we only go through the argument quickly. According to (1.18) and (3.5), we have the analogous representation for any 0 ≤ r < 1 and all ϑ ∈ T, and we claim that for any β > 0 and any n ∈ N, the asymptotics (3.14) hold uniformly for all ϑ ∈ T^n, 0 < r_1, ..., r_n ≤ r_N and γ in compact subsets of R^n. Hence, by applying [37, Theorem 1.7], we obtain that for any $|\gamma|<\sqrt{2\beta}$, the random measure $\mu^\gamma_N$ given by (1.20) converges in law, with respect to the topology of weak convergence, to the GMC measure $\mu^{\tilde\gamma}_{\mathrm{G}}$ associated with the GFF on T. The proof of the asymptotics (3.14) is analogous to that of (3.2). Namely, we have $\sum_{\ell=1}^n\gamma_\ell\Psi_{N,r_\ell}(\vartheta_\ell)=\int g\,d\mu_N$ where the function g ∈ C^∞(T) is given by (3.5). By (3.6), we have $\|g\|_{L^1(\mathbb{T})}\le C_\gamma$, so that by directly applying Lemma 2.3 and Proposition 2.4, we obtain the estimate (3.15) for all N ≥ N_β. Going through the estimates of section 3.1, since the Hilbert transform of g satisfies Ug = −w, the LHS of (3.15) converges to 0 as N → +∞. By definition of the Hilbert transform, σ²(g) = σ²(w) is given by (3.4). Hence, since $\sum_{\ell=1}^n\gamma_\ell\Psi_{N,r_\ell}(\vartheta_\ell)=\int g\,d\mu_N$, we obtain the asymptotics (3.14), and this completes the proof of Theorem 1.7.

Thick points: Proofs of Proposition 1.6 and Proposition 1.8
The goal of this section is to deduce from Theorem 1.5 some important estimates for the thick points of the characteristic polynomial of the CβE_N. Recall that for any γ > 0, the set of γ-thick points of the characteristic polynomial is
$$\mathcal{T}^\gamma_N=\big\{\theta\in\mathbb{T}:|P_N(e^{i\theta})|\ge N^{\gamma/\beta}\big\}.$$
The connection between Theorem 1.5 and Proposition 1.6 comes from the fact that the random measure $\mu^\gamma_N$ is essentially supported on $\mathcal{T}^\gamma_N$ for large N ∈ N, see e.g. [13, section 3]. In the following, we rely on this heuristic to obtain a lower bound for the Lebesgue measure $|\mathcal{T}^\gamma_N|$ when γ is less than the critical value γ* = √(2β). By a result of Su [47, Theorem 1.2] (see Lemma 3.4), we obtain the complementary upper bound to prove Proposition 1.6. Since the proof of Proposition 1.8 is almost identical to that of Proposition 1.6, we skip it and only comment on the main differences.
We let for any N ∈ N, 0 < r < 1 and θ ∈ T, Observe that it follows immediately from the asymptotics (3.2) and formula (1.10) that there exists a constant R_β > 1 such that for all |γ| ≤ 2γ* and θ ∈ T, The following result follows essentially from [13, Proposition 3.8]. Since our context is slightly different, we provide the main steps of the proof for completeness.

Lemma 3.4 (Upper-bounds).
For any γ > 0 and any small δ > 0, we have  Proof. These estimates follow by standard arguments using the so-called first moment method and the explicit formula for the γ-moments of |P_N|. By [47, Theorem 1.2], case (1), we have for any γ > −1 and ϑ ∈ T, By using e.g. the asymptotics of [14, Theorem 5.1], this formula implies that there exists a constant C_β > 0 such that for all γ ∈ [0, γ*], Observe that by definition of the set T^γ_N, this estimate implies that for any γ ∈ [0, γ*], By Markov's inequality, this immediately implies (3.18). In order to prove the second claim, we use that by [11, Lemma 4.3], since P_N is a polynomial of degree N, we have the deterministic bound: max_T |P_N| ≤ 14 max_{k=1,…,2N} |P_N(e^{i2πk/2N})|. This implies that for any δ > 0, if N is sufficiently large, By a union bound, Markov's inequality and the estimate (3.20) with γ = γ* = √(2β), we obtain, if N is sufficiently large, This yields (3.19).
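For concreteness, the union-bound step can be sketched as follows, using the deterministic bound max_T |P_N| ≤ 14 max_k |P_N(e^{i2πk/2N})| and a moment bound of the form E^β_N[|P_N|^{γ*}] ≤ C_β N (the shape of (3.20) at γ = γ*, assumed here):

```latex
\mathbb{P}^{\beta}_{N}\Big[\max_{\mathbb{T}} |P_N| \ge N^{(\gamma^{*}+\delta)/\beta}\Big]
 \le \sum_{k=1}^{2N} \mathbb{P}^{\beta}_{N}\Big[\,14\,|P_N(e^{i2\pi k/2N})| \ge N^{(\gamma^{*}+\delta)/\beta}\Big]
 \le 2N \cdot \frac{14^{\gamma^{*}}\, C_{\beta}\, N}{N^{\gamma^{*}(\gamma^{*}+\delta)/\beta}}
 = 2\cdot 14^{\gamma^{*}} C_{\beta}\, N^{-\gamma^{*}\delta/\beta} ,
```

since γ*(γ* + δ)/β = 2 + γ*δ/β. The right-hand side vanishes as N → +∞, which gives (3.19).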
Recall that for 0 < r < 1, P_r(θ) = 1 + 2∑_{k=1}^∞ r^k cos(kθ) = (1−r²)/(1+r²−2r cos θ) denotes the Poisson kernel. Since the function Υ_N = log |P_N| is harmonic in D, according to (3.16), we have for any 0 < r < 1 and x ∈ T, Proposition 3.5 (Lower-bounds). Fix γ, β > 0 and let T^γ_N = {θ ∈ T : |P_N(e^{iθ})| ≥ N^{γ/β}}. For any small 0 < δ < γ such that γ + δ < γ*, we have as N → +∞, In particular, we have for any small δ > 0, Proof. Let us fix 0 < δ < γ such that γ + δ < γ* and define the event By Lemma 3.4, we have P^β_N[A_N] → 1 as N → +∞. Let us choose L > 0, which only depends on the parameters δ, β > 0, such that By (3.22), since P_r is a smooth probability density function, conditionally on A_N, we have for any 0 < r < 1 and x ∈ T, where we used that max_T Υ_N ≤ (γ* + δ/2) log N conditionally on A_N and (3.25) at the last step. Since P_r(θ) ≤ 2/(1−r) for all θ ∈ T, this implies that for a constant c_β > 0 depending only on β, Choosing L possibly larger, let us assume that M = πLN ∈ N and for k = 1, …, M, Then we obviously have By (3.27), this implies that, conditionally on A_N, Using the bounds (3.28) and (3.29), we obtain that, conditionally on A_N, This shows that By Lemma 3.3, the first term on the RHS of (3.30) converges to 0 and, by Lemma 3.4, we also have P^β_N[A^c_N] → 0 as N → +∞. This completes the proof of (3.23) (since δ > 0 is arbitrarily small, we may replace 2δ by δ in the end). In particular, this shows that the sets T^γ_N are non-empty for all 0 ≤ γ < γ* = √(2β), since they have positive Lebesgue measure.
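As a sanity check on the two expressions for the Poisson kernel used above (the Fourier series and the closed form), and on the pointwise bound P_r ≤ 2/(1−r), here is a short numerical verification; the function names are ours:

```python
import numpy as np

def poisson_kernel_series(r, theta, K=400):
    # Truncated Fourier series: 1 + 2 * sum_{k=1}^K r^k cos(k * theta)
    k = np.arange(1, K + 1)
    return 1.0 + 2.0 * (r**k * np.cos(np.outer(theta, k))).sum(axis=1)

def poisson_kernel_closed(r, theta):
    # Closed form: (1 - r^2) / (1 + r^2 - 2 r cos(theta))
    return (1.0 - r**2) / (1.0 + r**2 - 2.0 * r * np.cos(theta))

theta = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)
r = 0.7
series_vals = poisson_kernel_series(r, theta)
closed_vals = poisson_kernel_closed(r, theta)

max_gap = np.max(np.abs(series_vals - closed_vals))  # truncation error, of order r^K
mass = np.mean(closed_vals)      # (1/2pi) * integral of P_r over T, should be ~ 1
peak = np.max(closed_vals)       # attained at theta = 0, equals (1+r)/(1-r) <= 2/(1-r)
```

The check that `mass` is close to 1 reflects that P_r is a probability density on T, which is exactly the property used in the smoothing step of the proof above.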
It is now straightforward to complete our proof of Proposition 1.6.
Proof of Proposition 1.6. As (3.18) and (3.23) hold for arbitrarily small δ > 0, this shows in probability as N → +∞. Then, combining the estimate (3.19) with the lower-bound of Proposition 3.5, we obtain the claim (1.17) for the maximum of log |P_N|.
Remark 3.6. The proof of Proposition 1.8 follows from similar arguments. In particular, by Theorem 1.7, we obtain the counterpart of Lemma 3.3 for the thick points of the field Ψ_{N,r_N}, see (1.18). Since we have for any 0 ≤ r < 1 and x ∈ T, by going through the proof of Proposition 3.5, we obtain that for any small 0 < δ < γ such that γ + δ < γ*, lim The complementary upper-bound for |S^γ_N| is obtained by the first moment method, as in Lemma 3.4, using the asymptotics from [47, Theorem 1.2], case (2): for any |γ| < 2 and θ ∈ T, We then use a union bound in order to deduce the upper-bound for max_T Ψ_N.

Optimal rigidity: Proof of Corollary 1.10.
This is a direct consequence of Proposition 1.8; we give the details for completeness. Let us define the (centered) eigenvalue counting function  From formula (3.31), we can deduce that there is a constant C_β so that for any δ ∈ [0, 1], the proof of this estimate is given in Appendix A. This estimate implies a Gaussian tail-bound: for any δ ∈ [0, 1], Then, we deduce from Proposition 1.8 and Remark 1.9 that it holds in probability as N → +∞, (3.37) By formula (3.35), combining (3.37) and the estimate (3.36), we obtain as N → +∞, In terms of the eigenvalues, this implies that for any β > 0 and δ > 0,
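The passage from the exponential-moment estimate to the Gaussian tail-bound is the standard Chernoff argument. Schematically, if the estimate (3.36) has the form E^β_N[e^{±λ h_N(θ)}] ≤ C_β e^{cλ² log N} for some constant c > 0 (this exact form is an assumption here), then:

```latex
\mathbb{P}^{\beta}_{N}\big[\,|h_N(\theta)| \ge t\,\big]
 \le 2\,C_{\beta}\, \inf_{\lambda > 0}\, e^{\,c\lambda^{2}\log N - \lambda t}
 = 2\,C_{\beta}\, \exp\Big(-\frac{t^{2}}{4c\log N}\Big) ,
```

where the infimum is attained at λ = t/(2c log N). This is the Gaussian tail-bound used to control the fluctuations of h_N at every fixed θ ∈ T.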

Moderate deviations for the eigenvalue counting function
Recall that we denote by h_N the (centered) eigenvalue counting function (3.34). Note that, almost surely, h_N is a càdlàg function on T such that ‖h_N‖_∞ ≤ N and h_N′ = µ̃_N, in the sense that for any function f ∈ C¹(T), we have In this section, by using the connection between the eigenvalue counting function and the logarithm of the characteristic polynomial, see formula (3.35) above, we investigate the probability that h_N takes extreme values.
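The identity h_N′ = µ̃_N is a deterministic integration-by-parts fact, valid for any configuration of N points on T (the boundary terms cancel since h_N(0) = h_N(2π⁻) = 0 and f is 2π-periodic). A minimal numerical check, with uniform random points standing in for the CβE eigenvalues and f(θ) = cos θ:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50
# Uniform points as a stand-in for CbetaE eigenvalues; the identity is deterministic.
points = np.sort(rng.uniform(0.0, 2.0 * np.pi, N))

f = np.cos                        # test function f in C^1(T); its mean over T is 0
fprime = lambda t: -np.sin(t)

# Linear statistic against the centered measure: sum_j f(theta_j) - (N/2pi) int_T f
lhs = np.sum(f(points))

# Centered counting function h_N(t) = #{j : theta_j <= t} - N t / (2 pi)
grid = np.linspace(0.0, 2.0 * np.pi, 1_000_000, endpoint=False)
h = np.searchsorted(points, grid, side="right") - N * grid / (2.0 * np.pi)

# Integration by parts on T: int f d(mu_N - N dtheta/2pi) = - int_0^{2pi} f'(t) h_N(t) dt
rhs = -np.mean(fprime(grid) * h) * 2.0 * np.pi

gap = abs(lhs - rhs)              # only discretization error remains
```

The two sides agree up to the Riemann-sum error coming from the N jumps of h_N, which is of order N times the grid spacing.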
For the proof of Proposition 4.1, we need the following lemma, which is an easy consequence of a result of Su [47]. For completeness, the proof of Lemma 4.2 is given in Appendix A.

Lemma 4.2.
There exists N_β ∈ N such that for all N ≥ N_β and any t > 0 (possibly depending on N), By Lemma 4.2, this implies that if N ≥ N_β, then for any t > 0, Using this Gaussian tail-bound, we can estimate the Laplace transform of the linear statistic ∫ w dµ̃_N: we obtain By completing the square, we obtain So, if N is sufficiently large (depending on η > 0 and β > 0), this shows that E^β_N[e^{∫ w dµ̃_N}] ≤ 1 + 3η N^{1+η²/(4β)} √(π log N/β). On the other hand, by Jensen's inequality and the fact that µ̃_N is centered, we have Therefore, by the Cauchy–Schwarz inequality, we have for any t > 0, For the last step, we used Lemma 4.2 and the estimate (4.3) with w replaced by 2w (this only changes η into 2η). From our last estimate, we obtain (4.2) by taking t = √(2/β) η log N.
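The complete-the-square step is elementary. Writing the Gaussian tail exponent as −t²/(4σ²), with σ² of order log N/β (the precise constants come from Lemma 4.2 and are treated as an assumption here), one has:

```latex
t - \frac{t^{2}}{4\sigma^{2}} = \sigma^{2} - \frac{(t - 2\sigma^{2})^{2}}{4\sigma^{2}},
\qquad\text{so}\qquad
\int_{0}^{\infty} e^{\,t - t^{2}/(4\sigma^{2})}\, dt
 \le e^{\sigma^{2}} \int_{\mathbb{R}} e^{-(t-2\sigma^{2})^{2}/(4\sigma^{2})}\, dt
 = 2\sigma\sqrt{\pi}\; e^{\sigma^{2}} .
```

With σ² proportional to log N/β, this produces factors of the form N^{c/β} √(π log N/β), matching the shape of the bound for the Laplace transform above.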
Finally, it remains to give a proof of Proposition 1.4.
Proof of Proposition 1.4. Fix n ∈ N and R > 0. As in (4.1), since h_N′ = µ̃_N, we have for any f ∈ C^n(T^n), This implies that for any f ∈ F_{n,R}, Hence, the claim follows directly from Proposition 4.1.
Hence, we conclude that there exists N_β ∈ N such that for all N ≥ N_β and any t > 0 (possibly depending on N),
By exactly the same argument, we obtain a similar bound for R_0(ψ_r), which completes the proof.