Central Limit Theorems for global and local empirical measures of diffusions on Erd\H{o}s-R\'enyi graphs

We address the Central Limit Theorem for the (both local and global) empirical measures of diffusions interacting on a possibly diluted Erd\H{o}s-R\'enyi graph. Special attention is given to the influence of the initial condition (not necessarily i.i.d.) on the nature of the limiting fluctuations. We prove in particular that the fluctuations remain the same as in the mean-field framework when the initial condition is chosen independently of the graph. We also give an example of non-universal fluctuations for carefully chosen initial data that depend on the graph. A crucial tool for the proofs is the use of extensions of the Grothendieck inequality.


Introduction
Fix n = 2, 3, . . . and let ξ^(n) = (ξ^(n)_ij)_{i,j=1,...,n} be a collection of independent and identically distributed Bernoulli random variables with parameter p_n ∈ (0, 1] (i.e. ξ^(n) defines an asymmetric Erdős-Rényi random graph of parameter p_n). Let T > 0 be a finite time horizon. We are interested in the empirical measure of a weakly interacting particle system where each particle is represented by a function on the one-dimensional torus T := R/2πZ. The population dynamics is defined by the following system of stochastic differential equations

dθ^{i,n}_t = (1/(np_n)) ∑_{j=1}^n ξ^(n)_ij Γ(θ^{i,n}_t, θ^{j,n}_t) dt + dB^i_t, 0 < t ≤ T, i = 1, . . . , n, (1.1)

endowed with some initial condition (θ^{i,n}_0)_{i=1,...,n}, where (B^i)_{i≥1} are independent and identically distributed standard Brownian motions. The dynamics of θ^{i,n} is influenced by the θ^{j,n} with i and j neighbours in the graph, through some regular function Γ : T × T → R. The interaction in (1.1) is renormalised by the (uniform) expected degree np_n of each vertex, so that the interaction remains of order 1 as n → ∞. We denote by P the joint law of the graph and the initial condition, and by P the law of the Brownian motions, so we will be working with P ⊗ P. Moreover, we denote by P_g and P_0 the marginals of P on the graph and on the initial condition respectively. One could also add to (1.1) an intrinsic drift term F(θ^{i,n}_t) dt, where F is a (bounded) smooth function on T modelling the intrinsic dynamics of each particle. The results of the present paper obviously remain valid in this case, up to the notational cost of adding e.g. some drift term −∂_θ[µ_t(θ)F(θ)] in the nonlinear Fokker-Planck equation (1.4). We chose to restrict to F ≡ 0 for simplicity of exposition.
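For readers who want to experiment, the system (1.1) can be simulated directly. The sketch below uses an Euler-Maruyama discretisation; the Kuramoto-type kernel Γ(x, y) = sin(y − x) and the i.i.d. uniform initial condition are illustrative assumptions, not choices made by the paper (which only assumes Γ regular).

```python
import numpy as np

# Hedged sketch (not from the paper): Euler-Maruyama discretisation of the
# system (1.1) on an asymmetric Erdos-Renyi graph. Gamma(x, y) = sin(y - x)
# and the uniform initial condition are illustrative assumptions.
def simulate(n=200, p=0.5, T=1.0, dt=0.01, seed=0):
    rng = np.random.default_rng(seed)
    xi = (rng.random((n, n)) < p).astype(float)    # xi_ij ~ Bernoulli(p), asymmetric
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)  # initial condition, independent of the graph
    for _ in range(int(T / dt)):
        diff = theta[None, :] - theta[:, None]     # diff[i, j] = theta_j - theta_i
        # interaction renormalised by the expected degree n * p, as in (1.1)
        drift = (xi * np.sin(diff)).sum(axis=1) / (n * p)
        theta = theta + drift * dt + np.sqrt(dt) * rng.normal(size=n)
    return np.mod(theta, 2.0 * np.pi)

theta_T = simulate()
```

The empirical measure µ^n_T is then the uniform measure on the entries of `theta_T`.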
An easy instance of (1.1) corresponds to the mean-field case, i.e. when p_n ≡ 1, so that ξ^(n)_ij ≡ 1 for all i and j (hence the graph of interaction is the complete graph). In such a case, the interaction in (1.1) is a functional of the empirical measure of (θ^{i,n})_{i=1,...,n}, defined as

µ^n_t := (1/n) ∑_{i=1}^n δ_{θ^{i,n}_t}, t ∈ [0, T].

The empirical measure is a (random) probability measure on T. The behavior of µ^n as n → ∞ in the mean-field case is standard and particularly well covered in the literature (see e.g. [64, 29]): provided that µ^n_0 converges to µ_0, one can show that µ^n converges to the unique solution of the following nonlinear Fokker-Planck equation

∂_t µ_t(θ) = (1/2) ∂²_θ µ_t(θ) − ∂_θ[µ_t(θ)(Γ ∗ µ_t)(θ)]. (1.4)

In (1.4), ∗ denotes integration with respect to the second variable, i.e., (Γ ∗ µ)(⋅) = ∫_T Γ(⋅, θ) µ(dθ) for µ a probability measure on T. The convergence to (1.4) can for instance be considered in the space of probability measures on continuous functions on the torus, i.e., in P(C([0, T], T)). In general, it depends on the topology in which the sequence (µ^n)_{n≥1} is studied.
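The limit equation (1.4) can be solved numerically. Below is a minimal explicit finite-difference sketch on a periodic grid, again with the assumed illustrative kernel Γ(x, y) = sin(y − x); the scheme and step sizes are my choices, not the paper's.

```python
import numpy as np

# Explicit finite-difference scheme for the nonlinear Fokker-Planck
# equation (1.4) on the torus, d_t mu = (1/2) d2_theta mu - d_theta[mu (Gamma * mu)],
# with the illustrative kernel Gamma(x, y) = sin(y - x).
def fokker_planck(mu0, T=0.2, dt=1e-4):
    m = mu0.size
    grid = np.linspace(0.0, 2.0 * np.pi, m, endpoint=False)
    h = 2.0 * np.pi / m
    K = np.sin(grid[None, :] - grid[:, None])      # K[i, j] = sin(theta_j - theta_i)
    mu = mu0.copy()
    for _ in range(int(T / dt)):
        conv = (K @ mu) * h                        # (Gamma * mu)(theta_i), quadrature
        flux = mu * conv
        dflux = (np.roll(flux, -1) - np.roll(flux, 1)) / (2.0 * h)  # periodic d/dtheta
        lap = (np.roll(mu, -1) - 2.0 * mu + np.roll(mu, 1)) / h**2  # periodic Laplacian
        mu = mu + dt * (0.5 * lap - dflux)
    return mu

m = 64
grid = np.linspace(0.0, 2.0 * np.pi, m, endpoint=False)
mu0 = (1.0 + 0.5 * np.cos(grid)) / (2.0 * np.pi)   # probability density on T
muT = fokker_planck(mu0)
```

The periodic central differences conserve the total mass exactly, so the discrete integral of `muT` stays equal to 1.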
A recent interest has been shown in the literature in extensions of (1.1) to generic graphs of interaction. When the graph is no longer complete, a natural companion to µ^n is the local empirical measure around a fixed vertex l,

µ^{n,l}_t := (1/(np_n)) ∑_{j=1}^n ξ^(n)_lj δ_{θ^{j,n}_t}. (1.5)

Contrary to the mean-field case, where one easily obtains by Ito's formula a closed semimartingale decomposition of µ^n (e.g., [53, 64]), applying the same calculation in the general case (see, e.g., Lemma 4.3) shows that µ^{n,l} itself depends on empirical measures involving higher-order expansions within the graph structure: a whole hierarchy of empirical measures (indexed by local patterns in the graph) appears, and the difficulty is to find a way to properly close this decomposition. However, if the graph (ξ^(n)_ij) is sufficiently close to the complete graph (in a sense to be made precise), the behavior of the system (1.1) as n → ∞ should be described by the same macroscopic limit (1.4). A rather informal question is to understand how universal the mean-field framework is: how much can we perturb the complete graph of interaction and still conserve the same macroscopic properties as n → ∞? This question has been addressed in detail at the level of the law of large numbers: a series of recent papers [21, 17, 43, 3, 8] have shown that µ^n converges to µ for a large class of graphs that includes the Erdős-Rényi case. We refer to Section 2.7 for a more detailed discussion of these results.
Note that one may be interested in two distinct graph regimes: the dense case, when p_n → p ∈ (0, 1] as n → ∞, i.e., the mean degree of each vertex remains proportional to the size of the population, and the diluted case (or vanishing degree density), when p_n → 0 as n → ∞, down to the sparse threshold np_n → c > 0, where other phenomena are known to be at play [57, 40].
1.1. Aim of this work. The purpose of the present work is to address the universality of the mean-field framework at the level of fluctuations. We are interested in the limits of two objects. First, the global fluctuation process η^n, given by

η^n_t := √n (µ^n_t − µ_t), for 0 ≤ t ≤ T, (1.6)

i.e., the standard fluctuations of µ^n around the limit µ. Note that η^n depends not only on the randomness coming from the noise in (1.1), but also on the graph ξ^(n). We will consider the behavior of η^n under the law P only, i.e., as a quenched object with respect to the realisation of the graph sequence. In addition, we will include the challenging case where the initial condition depends on the underlying graph. To the authors' knowledge, this issue has never been tackled before. We refer to Section 2.7 for an overview of the existing literature.
The second aim of this paper concerns the fluctuations of the local empirical measures around their limit. Recall the definition of the local empirical measure µ^{n,l} in (1.5). This process is the empirical measure of the particles at distance 1 from vertex l within the graph. Note that µ^{n,l} does not necessarily have mass 1 (one only has ⟨µ^{n,l}, 1⟩ → 1 as n → ∞, P_g-a.s.), as (1.5) is not renormalised by the degree d_{n,l} := ∑_{j=1}^n ξ^(n)_lj of vertex l, but rather by its expectation E[d_{n,l}] = np_n. This choice of renormalisation turns out to be convenient for our fluctuation results, as it permits to focus only on the fluctuations that come from the dynamics, without having to deal with the intrinsic fluctuations of the vertex degrees in the graph. The influence of this choice of renormalising constant on our results is discussed in detail in Appendix E. It turns out that the natural limit of the local empirical measures (1.5) is also given by the solution µ of (1.4). Hence, the second aim of the paper is to address the behavior of the joint fluctuation process

ζ^n_t := (ζ^{n,1}_t, ζ^{n,2}_t) := (√(np_n)(µ^{n,1}_t − µ_t), √(np_n)(µ^{n,2}_t − µ_t)), (1.7)

that is, the joint fluctuation process of the local empirical measures around vertices 1 and 2. One point will be to understand how the limits of both η^n and ζ^n may or may not depend, first, on a specific realisation of the graph and, secondly, on the graph structure itself (in particular on the fact that the graph may be dense or diluted).
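As a quick numerical illustration of this choice of renormalising constant (my experiment, not the paper's): the degree d_{n,l} is Binomial(n, p_n), so the ratio d_{n,l}/(np_n) concentrates around 1 at rate √((1 − p_n)/(np_n)), noticeably more slowly in the diluted regime.

```python
import numpy as np

# Fluctuations of the degree ratio d_{n,l}/(n p_n) in a dense and a
# diluted regime; d_{n,l} ~ Binomial(n, p_n) for an Erdos-Renyi graph.
rng = np.random.default_rng(6)
n = 10_000
spreads = {}
for p in (0.5, 0.01):
    deg = rng.binomial(n, p, size=2000)        # i.i.d. copies of the degree d_{n,l}
    spreads[p] = float((deg / (n * p)).std())  # empirical std of the ratio
# predicted scale sqrt((1 - p) / (n p)): about 0.01 (dense) vs 0.1 (diluted)
```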

1.2. Strategy of proof.
To prove the convergence of fluctuations, we follow the classical trilogy of arguments: tightness of η^n, identification of the limit points of η^n as solutions of a suitable partial differential equation, and uniqueness of the limit solution. The main tool for the first two steps is a semimartingale decomposition in a suitable Hilbert space of distributions (see, e.g., [51, 33, 25]). From this point of view, we closely follow the strategy proposed in [25]. As already said, there is no way to obtain a closed semimartingale decomposition for the empirical processes µ^n and µ^{n,l} (and thus for η^n): supplementary terms appear that depend on higher-order expansions within the graph. Our strategy, and the notable difference compared to [25], is to pursue the expansion and write the semimartingale decomposition of these higher-order terms, until the remaining errors in the expansion of µ^n become smaller than 1/√n. Although written separately for clarity of exposition, the treatment of the global and local fluctuations (note that the local fluctuations are considered under a stricter set of hypotheses on the initial condition) follows the same main lines. The main argument to close this expansion (and the main contribution of the paper) is the use of generalised Grothendieck inequalities, which enable us to decorrelate the dynamics from the proximity estimates between the graph (ξ^(n)_ij) and the complete graph. More details on the use of Grothendieck inequalities are given in Section 3.
1.3. Notation. The space of probability measures on a metric space X is denoted by P(X). As usual, we denote by C([0, T], X) the space of continuous functions from [0, T] into X endowed with the supremum norm, denoted by ‖⋅‖_∞. For two probability measures µ and ν on a given metric space X, we denote by d_BL(µ, ν) their bounded-Lipschitz distance, i.e.,

d_BL(µ, ν) := sup_{f ∈ BL(X)} |∫ f dµ − ∫ f dν|,

where BL(X) := {f : X → R : ‖f‖_∞ ≤ 1, ‖f‖_Lip ≤ 1} and ‖f‖_Lip is the Lipschitz constant of f. In order to study the limits of both (1.6) and (1.7), we will need to introduce several other weighted empirical processes, all depending on the graph sequence (ξ^(n)_ij). The first one is a weighted empirical measure ν̂^n_t on T², associated to the fluctuations arising from the graph sequence, i.e.,

ν̂^n_t := (1/n²) ∑_{i,j=1}^n ξ̂^(n)_ij δ_{(θ^{i,n}_t, θ^{j,n}_t)}, (1.9)

as well as its rescaled counterpart η̂^n := √n ν̂^n. (1.10)
In (1.9) and (1.10), we used the notation

ξ̂^(n)_ij := ξ^(n)_ij / p_n − 1, i, j = 1, . . . , n. (1.11)

A measure that will arise naturally in the study of the local fluctuations is the following:

µ^{n,1,2}_t := (1/(np²_n)) ∑_{i=1}^n ξ^(n)_i1 ξ^(n)_i2 δ_{θ^{i,n}_t}, (1.12)

whose role is to account for the presence/absence of the pattern 1 ← i → 2 in the graph, capturing a notion of connectivity between vertices 1 and 2. Let ⟨⋅, ⋅⟩ denote the duality bracket between function spaces and their duals; observe that the mass ⟨µ^{n,1,2}_t, 1⟩ also tends almost surely to 1 as n → ∞.
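In coordinates these objects are cheap to evaluate. The sketch below computes the centred weights (1.11) and the masses of the local measures; the 1/n² normalisation of ν̂^n and the exact form of µ^{n,1,2} follow my reading of the surrounding text (their masses behave as stated), so treat them as assumptions.

```python
import numpy as np

# Centred graph weights (1.11) and masses of the local empirical measures.
# The normalisations below are assumptions consistent with the text:
# <mu^{n,1}, 1> and <mu^{n,1,2}, 1> should both tend to 1 as n grows.
rng = np.random.default_rng(1)
n, p = 500, 0.3
xi = (rng.random((n, n)) < p).astype(float)
xi_hat = xi / p - 1.0                              # hat(xi)_ij := xi_ij / p_n - 1
theta = rng.uniform(0.0, 2.0 * np.pi, n)

def f(x, y):                                       # smooth test function on T^2
    return np.cos(x) * np.sin(y)

# action of the weighted measure nu_hat^n on f (assumed 1/n^2 normalisation)
nu_hat_f = float((xi_hat * f(theta[:, None], theta[None, :])).sum()) / n**2
mass_local = float(xi[0, :].sum()) / (n * p)       # <mu^{n,1}, 1>, close to 1
mass_12 = float((xi[:, 0] * xi[:, 1]).sum()) / (n * p**2)   # <mu^{n,1,2}, 1>, close to 1
```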
For studying the sequence (η^n)_{n≥1} defined in (1.6), we need a suitable space of distributions. Indeed, quantities such as η^n or ν̂^n are not probability measures (their total mass is close to 0) and need to be studied in a larger space. As already done in the literature [25, 33, 6, 4], we choose to work in a class of Hilbert spaces that contain (as a continuous embedding) the probability measures. The canonical choice is given by the usual Sobolev Hilbert spaces H^{−r}(T^d) := W^{−r,2}(T^d) (with r > 0), dual of the space H^r(T^d) of test functions with square-integrable derivatives up to order r [1]. We define ‖h‖_{H^{−r}(T)} (resp. ‖h‖_{H^{−r}(T²)}) as the norm on H^{−r}(T) (resp. H^{−r}(T²)). To keep the notation concise, we will often write ‖h‖_{−r} for both norms, whenever the context leaves no ambiguity as to whether the functional acts on test functions on T or on T². Sobolev embeddings, e.g., [1], state that for α ∈ (0, 1] and r = d/2 + α, the space H^r(T^d) is continuously embedded into C^{0,α}(T^d), the space of continuous functions with α-Hölder regularity. By duality, P(T^d) is continuously embedded into H^{−r}(T^d) for any r > d/2. This implies that any probability measure on T belongs to H^{−r}(T) for any r > 1/2. For a given t ∈ [0, T], the distribution ν̂^n_t is an element of the Hilbert space H^{−r}(T²) for r > 1. Recall also that one has Hilbert-Schmidt embeddings between these spaces (see [25] and [1, §6]). On the Hilbert space H^r(T^d), one can define the semigroup S = (S_t)_{t≥0} associated to the Laplacian operator, the latter being denoted by ∆ (or ∂²_θ in the one-dimensional case). It is well known, e.g., [34], that S is an analytic semigroup. For a given operator L on some Hilbert space, we denote by L* its dual operator on the corresponding dual Hilbert space.
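Concretely, on T these dual norms can be computed from Fourier coefficients. The sketch below uses the weighted-ℓ² convention ‖h‖²_{−r} = Σ_k (1 + k²)^{−r} |ĥ(k)|² (a standard convention; constants may differ from the paper's) and illustrates that a probability measure has finite H^{−r}(T)-norm for r > 1/2.

```python
import numpy as np

# Truncated H^{-r}(T) norm of an atomic measure h = sum_i w_i delta_{x_i},
# via h_hat(k) = sum_i w_i exp(-i k x_i). The (1 + k^2)^{-r} weighting is
# an assumed (standard) convention for the dual Sobolev norm.
def h_minus_r_norm(atoms, weights, r, kmax=2000):
    k = np.arange(-kmax, kmax + 1)
    h_hat = (weights[None, :] * np.exp(-1j * np.outer(k, atoms))).sum(axis=1)
    return float(np.sqrt(((1.0 + k**2.0) ** (-r) * np.abs(h_hat) ** 2).sum()))

rng = np.random.default_rng(2)
theta = rng.uniform(0.0, 2.0 * np.pi, 100)
w = np.full(100, 1.0 / 100)             # empirical probability measure mu^n
nrm = h_minus_r_norm(theta, w, r=0.75)  # finite for any r > 1/2
```

The k = 0 term alone contributes |ĥ(0)| = 1, so the norm of a probability measure is at least 1, while the tail of the sum converges precisely because 2r > 1.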

2.1. General hypotheses.
Assumption 2.1 (Regularity of Γ). We assume that Γ is infinitely differentiable on T² (and hence bounded with bounded derivatives). A careful reading of the proofs below shows that Γ being C^k for a sufficiently large k would actually suffice.
Assumption 2.2 (Initial condition). We suppose that the initial condition, that is, the random variables θ^{1,n}_0, . . . , θ^{n,n}_0, is chosen independently of the Brownian motions (B^1, . . . , B^n) (but not necessarily i.i.d., and possibly depending on the graph), in such a way that the empirical measure µ^n_0 converges weakly to some µ_0 in the sense of the estimate (2.2), for any q ≥ 1. It is possible to assume (2.2) only for some q ≥ 1; in such a case (2.4) below remains true up to a supplementary integration w.r.t. E_0, and the results of the paper remain valid. (2) Convergence of local empirical measures: if one supposes further that the initial condition is independent of the graph, then (a) if np³_n → ∞ as n → ∞, for any fixed l ≥ 1 the local empirical measure µ^{n,l} defined in (1.5) verifies the analogous convergence for all q ≥ 1, and (b) under a further strengthening of this dilution condition, so does the local empirical measure µ^{n,1,2} defined in (1.12). The proof of Theorem 2.4 can be found in Appendix C. In comparison with the existing literature, the convergence result (2.4) generalises the previous results in two ways: we obtain the optimal dilution condition (2.3) in the quenched set-up and, more importantly, we allow for initial conditions that possibly depend on the graph (not to mention that they need not be i.i.d.). We refer to Section 2.7 for a more detailed discussion of this matter. It is however likely that the dilution conditions required for the convergence of the local empirical measures are not optimal.
2.3. Global fluctuations. We now proceed with the main results concerning fluctuations. We address two issues: first, global fluctuations (Section 2.3), that is, the behavior as n → ∞ of the global fluctuation process η^n given in (1.6); second, local fluctuations (Section 2.4), that is, the joint convergence of the local fluctuation processes ζ^{n,1}, ζ^{n,2} given in (1.7). In the rest of the paper, the following indices are fixed. Recall the definitions of η^n in (1.6) and η̂^n in (1.10). We first state our main hypotheses concerning the initial condition. We suppose in the following that either Assumption 2.5 or Assumption 2.6 holds.
Assumption 2.5 (Quenched initial fluctuations). We suppose that there exists α ∈ (0, 1) such that the estimates (2.8) and (2.9) hold. Note that under (2.8) and (2.9), P_g-a.s., (η^n_0) and (η̂^n_0) are tight in H^{−r₁}(T) and H^{−r₁}(T²) respectively. In addition, we require the P_g-a.s. joint convergence in law of (η^n_0, η̂^n_0). Assumption 2.6 (Annealed initial fluctuations). We suppose the same hypotheses as in Assumption 2.5, with (2.8) and (2.9) replaced by their annealed counterparts, and the joint convergence of (η^n_0, η̂^n_0) in law w.r.t. the joint law P of the initial condition and the graph.
Let us now state the main result of this paper on global fluctuations, concerning the weak limit of the fluctuation processes η^n and η̂^n. Define first the linear differential operators appearing in the limit: for all test functions f and g, s ∈ [0, T] and ν ∈ H^{−r} with r > 1/2, they are given by (2.13)-(2.17). In case Assumption 2.5 holds, the above convergence is almost sure w.r.t. the randomness of the graph (quenched convergence), whereas in case of Assumption 2.6, the convergence is understood under the annealed law P ⊗ P.
A particular case of Theorem 2.7 concerns the case where the initial condition of the second-order fluctuation process, η̂^n_0, goes to 0 as n → ∞. Theorem 2.8 (Universal mean-field fluctuations). Suppose that Assumptions 2.1, 2.2 and either Assumption 2.5 or 2.6 hold. Suppose in addition that the limit of η̂^n_0 given in Assumption 2.5 or 2.6 is η̂_0 ≡ 0. Then the process (η^n) converges in law as n → ∞ to the solution η of the limiting SPDE (2.18), with η_0 independent of W. In case of Assumption 2.5, the above convergence is almost sure w.r.t. the randomness of the graph (quenched convergence) and in case of Assumption 2.6, the convergence holds w.r.t. the annealed law P ⊗ P.
Proof of Theorem 2.8. It suffices to note that the limiting dynamics of η̂ in (2.16) is deterministic and linear, so that if η̂_0 ≡ 0 initially, one obtains by uniqueness that η̂_t ≡ 0 uniformly in t ∈ [0, T]. Hence, Theorem 2.8 follows immediately from Theorem 2.7.
Observe that (2.18) is nothing else than the limiting SPDE of the fluctuation process in the pure mean-field case p_n ≡ 1, obtained in [25] under i.i.d. initial conditions. Theorem 2.8 is of course compatible with the result of [25], as when p_n ≡ 1, η̂^n_0 is identically 0 for all n and η^n_0 converges to a Gaussian process, so that Assumption 2.5 is trivially true. One can see Theorem 2.8 as a universality result, valid beyond the mean-field case, under the dilution condition (2.15): the system (1.1) conserves the same fluctuations as in the mean-field case, as long as one can verify Assumption 2.5 or 2.6 and η̂_0 ≡ 0. It is likely that the dilution condition (2.15) is not optimal: the critical point in this matter is the concentration estimates on the quantities S^T_n given in Definition 3.4. Any improvement on the rates of convergence found in Proposition 3.5 would lead to corresponding improvements in (2.15). The second direction in which Theorem 2.8 generalises the Central Limit Theorem of [25] is the following: a crucial observation is that the fluctuation result in [25] was proven in the case where the initial datum θ^{1,n}_0, . . . , θ^{n,n}_0 consists of i.i.d. random variables. This hypothesis, perfectly reasonable in the pure mean-field context as a natural means to preserve exchangeability between particles, is not really relevant in the context of (1.1), where exchangeability is lost due to the presence of the graph. Anticipating Section 2.5 (where sufficient conditions for the result are given), we indeed show that these universal fluctuations go well beyond the i.i.d. case, as they remain valid as long as the initial condition is chosen independently of the graph.
Conversely, we also describe in Proposition 2.14 an example of initial condition, depending on the graph structure, such that η̂^n_0 has a nonzero limit, and for which the limiting fluctuations are thus completely described by (2.16) and no longer by the mean-field fluctuations (2.18).

2.4. Local fluctuations.
We now give our result concerning the local fluctuations (recall the definitions of the local fluctuation processes ζ^{n,i}, i = 1, 2, in (1.7)). As global fluctuations compete with local fluctuations, the main result concerns the convergence of the joint fluctuation process (ζ^{n,1}, ζ^{n,2}, η^n). We place ourselves in the case of an i.i.d. initial condition, independent of the graph. Anticipating Proposition 2.11 and Example 2.12, we see that Theorem 2.8 applies: the global fluctuations of η^n are completely described in terms of (2.18). Theorem 2.9. Suppose Assumption 2.1 and that (θ^{1,n}_0, . . . , θ^{n,n}_0) are i.i.d. random variables with law µ_0, independent of the graph. Suppose that lim inf_n np⁵_n = ∞ and denote p := lim_{n→∞} p_n ∈ [0, 1]. Then, P_g-almost surely, the joint fluctuation process (ζ^{n,1}, ζ^{n,2}, η^n) converges as n → ∞ in C([0, T], (H^{−r₁}(T))³) to (ζ¹, ζ², η), solution in C([0, T], (H^{−r₂}(T))³) of a coupled system of SPDEs.
A closer look at the covariance structure of both the initial condition in (5.2) and the noise in (5.18) shows that in the diluted case p = lim_{n→∞} p_n = 0, the processes ζ¹, ζ², η are mutually independent and each ζ^l (l = 1, 2) satisfies an autonomous limiting equation. In the dense case p > 0, ζ¹ and ζ² are correlated in several ways: through a nontrivial correlation of their noises and initial conditions, and through the coupling with the global fluctuation process η. Theorem 2.9 is proven in Section 5.

2.5. Examples.
We give in this section examples of initial conditions verifying Assumption 2.5 or 2.6, as well as sufficient conditions for the universality hypothesis η̂_0 ≡ 0 of Theorem 2.8.
Universal mean-field fluctuations. The important point of this paragraph is that the universality condition η̂_0 ≡ 0 of Theorem 2.8 holds as soon as the initial condition is chosen independently of the graph (but not necessarily i.i.d.!).
Proof of Proposition 2.11 is given in Section 4.4.
Quenched mean-field fluctuations. We now give illustrative examples of initial conditions, independent of the graph and hence particular cases of Assumption 2.10, for which the initial fluctuation process η^n_0 verifies Assumption 2.5. Example 2.12 (The case of i.i.d. initial condition). Suppose that the (θ^{i,n}_0) are i.i.d. with law µ_0 (and independent of the graph and of the Brownian motions (B^i)). Then Assumption 2.5 is valid and the limit η_0 is the Gaussian process with covariance (2.23). The following example shows that Assumption 2.5 is weak enough to include not necessarily i.i.d. initial conditions. We do not try to give any sharp condition here; we refer to the references mentioned in Example 2.13 for details. Under a suitable mixing condition (2.24) (where α(A, B) is the Rosenblatt α-mixing coefficient between A and B), η^n_0 converges as n → ∞ in H^{−1}(T) to a Gaussian process with explicit covariance. Moreover, under the same condition, applying [19, Th. 1] to ϕ(η) := ‖η‖²_{−1}, we obtain the uniform bound (2.8) for α = 1, so that Assumption 2.5 is satisfied. In the context of Markov chains, condition (2.24) holds as soon as the chain is ergodic of degree 2 [54], and includes geometrically ergodic Markov chains [15].
An example of non-universal fluctuations. We construct in Section 4.5 an example of initial condition (depending on the graph sequence) for which the limiting fluctuations are non-universal: Proposition 2.14 shows that there exists a choice of initial condition (θ^{1,n}_0, . . . , θ^{n,n}_0) such that (η^n_0, η̂^n_0) satisfies Assumption 2.6 and converges in law (w.r.t. the annealed law P ⊗ P) to a limit (η_0, η̂_0) with η̂_0 ≢ 0. In particular, the limiting process η_t is governed by (2.16) and not by the universal mean-field SPDE (2.18).
Proof of Proposition 2.14 is given in Section 4.5.
[Figure omitted.] Caption: for the blue histogram, the interaction is of mean-field type with i.i.d. initial condition of distribution (1/2)δ_0 + (1/2)δ_{π/2}, while for the brown one it is of symmetric Erdős-Rényi type with p = 0.5 and initial condition as described in Section 4.5. We observe a dephasing at time t = 1 at the level of fluctuations, induced by the graph-dependent initial condition.

2.6. On possible generalisations to inhomogeneous graphs.
Even though we have focused on Erdős-Rényi random graphs, the same formalism should easily adapt to inhomogeneous situations, i.e., to graphons (see e.g. [3, 7, 43]). Notably, the concentration results used in this paper (i.e. Bernstein inequalities and the concentration result for the operator norm of a random matrix [67] used in the proof of Proposition 3.5) only require the ξ^(n)_ij to be independent, not necessarily identically distributed. For the sake of conciseness, we give one illustrative example generalising the present homogeneous case and leave further generalisations to the reader. This example is an elementary instance of the stochastic block model with only two communities: let n be an even number, divide the population into two clusters C^n_1 = {1, . . . , n/2} and C^n_2 = {n/2 + 1, . . . , n}, and suppose that the ξ^(n)_ij are independent Bernoulli random variables with parameter p_{i,j} = p if i, j belong to the same cluster and p_{i,j} = q if i, j belong to different clusters. Then the mean degree of each node is nr with r = (p + q)/2, so that the dynamics becomes

dθ^{i,n}_t = (1/(nr)) ∑_{j=1}^n ξ^(n)_ij Γ(θ^{i,n}_t, θ^{j,n}_t) dt + dB^i_t.

There are now two global empirical measures, one per cluster: µ^{n,C_l} = (2/n) ∑_{i∈C^n_l} δ_{θ^{i,n}}. Suppose for simplicity that the initial condition is chosen independently of the graph, with the appropriate convergence hypotheses on the empirical measures as n → ∞. Ito's calculations similar to those of Lemma 4.3 (note that here (1.11) has to be replaced by ξ̂^(n)_ij := ξ^(n)_ij / p_{i,j} − 1) lead to (2.25). The mean-field limit is given by the system of coupled PDEs (2.26). Setting now the fluctuation processes η^n := (η^{n,C_1}, η^{n,C_2}), with η^{n,C_l} := √(n/2)(µ^{n,C_l} − µ^{C_l}), from (2.25) and (2.26) and using the same techniques as in the present paper, it is not difficult to show that the proper limit of (2.27) is given by the coupled system (2.28). Here W¹, W² are independent Gaussian processes with covariance E[W^l_s(f) W^l_t(g)] = ∫_0^{t∧s} ⟨µ^{C_l}_u, ∂_θ f ∂_θ g⟩ du.
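A hedged simulation sketch of this two-community example (Γ(x, y) = sin(y − x) remains an illustrative choice, not the paper's): intra-cluster edges have probability p, inter-cluster edges probability q, and the interaction is renormalised by nr with r = (p + q)/2 as above.

```python
import numpy as np

# Two-community stochastic block model version of (1.1), with the
# interaction renormalised by n * r, r = (p + q) / 2, as in the text.
def simulate_sbm(n=200, p=0.8, q=0.2, T=0.5, dt=0.01, seed=3):
    rng = np.random.default_rng(seed)
    half = n // 2
    probs = np.full((n, n), q)
    probs[:half, :half] = p                # cluster C_1 internal edges
    probs[half:, half:] = p                # cluster C_2 internal edges
    xi = (rng.random((n, n)) < probs).astype(float)
    r = (p + q) / 2.0
    theta = rng.uniform(0.0, 2.0 * np.pi, n)
    for _ in range(int(T / dt)):
        diff = theta[None, :] - theta[:, None]
        drift = (xi * np.sin(diff)).sum(axis=1) / (n * r)
        theta = theta + drift * dt + np.sqrt(dt) * rng.normal(size=n)
    return np.mod(theta[:half], 2.0 * np.pi), np.mod(theta[half:], 2.0 * np.pi)

c1, c2 = simulate_sbm()   # supports of the two cluster empirical measures mu^{n,C_l}
```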
We retrieve from these calculations several easy particular cases: (1) when q = 0, that is α = 1, we see from (2.28) that (η^{C_1}, η^{C_2}) are independent copies of the same process solving (2.18) (this is of course natural, as the two clusters C_1 and C_2 are then disjoint).

2.7. A look at the literature.
Interacting particle systems of mean-field type have been repeatedly addressed in the literature over the last fifty years, see [49, 55, 29, 64], this list being in no way exhaustive. The first results focused on the Law of Large Numbers (LLN) for the empirical measure, together with existence and uniqueness for the limit Fokker-Planck equation, see, e.g., [28, 41, 29]. Shortly after, the Central Limit Theorem (CLT) was established in several scenarios [13, 66, 63, 65, 33]. Two main methods have been proposed in the literature to study the fluctuations around the LLN limit. One method [66, 62, 63, 61, 14] consists in focusing on the fluctuation field evaluated on centred test functions f (i.e. with E_µ(f) = 0), and in proving that, as n → ∞, it converges to some Gaussian field with prescribed covariance. The other method, the one followed here and first proposed in [51, 33, 25], directly addresses the fluctuation process (1.6), i.e., η^n = √n(µ^n − µ), and aims at showing that η^n converges to the solution of a stochastic partial differential equation. See [66, Chapter 3] for an interesting relation between these two approaches when the particle interaction is linear.
We must stress that the first method has three main drawbacks: (i) the convergence only concerns finite-dimensional marginals; (ii) its proof relies heavily on exchangeability properties of the system (which are in no way applicable in our quenched context); and (iii) the covariance of the limit Gaussian field is not explicit but involves Radon-Nikodym derivatives and integrals of the dynamics operators. The second method is more challenging, yet it translates the limit dynamics into a "classical" linear SPDE that can be studied further; see [39] for a general result on this kind of limit equation. Finally, observe that in some cases the CLT for finite-dimensional marginals can be derived from a Large Deviation Principle, see [18, Chapter 4] and [12].
In the case of interacting particle systems on graphs, most of the literature focuses on the LLN [21, 17, 3, 7], with some exceptions concerning large deviations [17, 47, 56, 23]. It is interesting to compare the LLN established in Theorem 2.4 with the previous ones in the literature. To the authors' knowledge, there exists no result under weaker assumptions on the graph and initial condition than our Theorem 2.4. Although the assumption np_n → ∞ is common to other results (it appears only in an annealed context, e.g., [3, 56], whereas the best condition so far in a quenched context was lim inf_{n→∞} np_n/log(n) > 0 [17]), the only work allowing a general initial condition that may depend on the graph is [16]. However, [16] focuses on a specific particle system, the Kuramoto model, and proves a result in H^{−1}-norm, whereas Theorem 2.4 is stated in terms of classical weak convergence. Given the results [57, 40] on the sparse regime np_n → c > 0, the condition (2.3) of Theorem 2.4 appears to be optimal.
To the authors' knowledge, there exists only one work addressing the CLT for diffusions on graphs, namely [8], which concerns interacting diffusions on R^d and dense inhomogeneous random graphs. Despite the generality of the particle systems and graphs under consideration, the CLT statement there is substantially weaker than the one presented here: it concerns finite-dimensional marginals, the underlying graph sequence is dense, and the result is proven in probability with respect to the graph, whereas we prove P_g-a.s. convergence and consider graph sequences in possibly diluted regimes. Finally, the initial conditions in [8] are taken i.i.d. and independent of the graph, whereas we only suppose the weak convergence of µ^n_0 towards some probability measure, see Assumption 2.2.
2.8. Organisation of the paper. The rest of the paper is organised as follows: Section 3 presents the main argument that we use for closing the hierarchy of empirical measures, namely extensions of Grothendieck inequalities. Section 4 contains the proof of the global fluctuation result (Theorem 2.7). The local fluctuations (Theorem 2.9) are treated in Section 5. We gather in Appendices A and B some technical estimates (namely Sobolev inequalities and further concentration estimates). Appendix C gives the proof of Theorem 2.4. Uniqueness results are gathered in Appendix D. We finally discuss the importance of the renormalisation of the interaction in Appendix E.

Grothendieck inequality and concentration estimates
Before giving the main result on the Grothendieck inequality, we give a characterisation of the processes η^n and η̂^n in terms of semimartingales.
3.1. Characterisation of the processes. For the proofs of Theorems 2.7 and 2.8, we rely on the following characterisation of the processes η^n and η̂^n, whose proof can be found at the end of Section 4. For the definition of the Doob-Meyer process of a Hilbert-valued martingale, we refer to [50]. Recall the definitions of L^(1) and L^(2), see (2.13) and (2.14) respectively. Proposition 3.1. For any r > 3/2 and r′ > 5/2, the joint process (η^n, η̂^n) belongs P ⊗ P-a.s. to the corresponding spaces of continuous paths. Moreover, η^n and η̂^n satisfy semimartingale representations in H^{−r₁}(T) and H^{−r₁}(T²) respectively, where, for any regular test function g, W^n is a martingale with Doob-Meyer process ⟪W^n⟫ taking values in L(H^r, H^{−r}) and given for t ∈ [0, T] and ϕ, ψ ∈ H^r by (3.4), and the process (Ŵ^n_t)_{t∈[0,T]} (whose explicit form is given in (4.12)) is a martingale. Note that one point of the proof, see Proposition 4.7, will be to show that the noise Ŵ^n goes to 0 as n → ∞.
A crucial step in the proofs of Theorems 2.7 and 2.8 is to show that the graph-dependent term C^n in (3.3) effectively goes to 0 as n → ∞. In view of its structure, a natural strategy would be to take advantage of concentration estimates, with respect to the graph, on quantities involving the centred weights ξ̂^(n)_ij. The difficulty is that C^n in (3.3) depends in a highly nontrivial manner on the graph sequence (ξ^(n)_ij), so that standard concentration results (e.g. Bernstein inequalities) cannot be applied directly. The main novelty of the present work is to circumvent this difficulty by using multilinear extensions of the classical Grothendieck inequality proved by R. Blei [10, 11].
The use of Grothendieck inequalities is detailed in Subsection 3.2 below, in the case of the term C^n: an immediate consequence of Propositions 3.3 and 3.5 is that C^n does not contribute to the limit as n → ∞. Note however that this strategy will be applied repeatedly in this work to various other functionals of the graph sequence ξ^(n). For the sake of readability, we postpone the definitions of these other quantities and the corresponding concentration results to Appendix B.
3.2. Grothendieck inequality. The classical Grothendieck inequality has received a lot of attention in recent years, as it has been shown to be a powerful tool for the study of graph concentration [2, 32]. Consider an infinite-dimensional Euclidean space l²(A) with coordinates indexed by a set A, endowed with the usual scalar product ⟨x, y⟩_{l²(A)} = ∑_{α∈A} x_α ȳ_α and the associated norm ‖⋅‖_{l²(A)}. In this context, the classical Grothendieck inequality states that there exists a universal constant K such that for any finite scalar array (a_jk),

sup_{‖x_j‖_{l²(A)} ≤ 1, ‖y_k‖_{l²(A)} ≤ 1} |∑_{j,k} a_jk ⟨x_j, y_k⟩_{l²(A)}| ≤ K sup_{s_j, t_k ∈ {−1,1}} |∑_{j,k} a_jk s_j t_k|.

This inequality is known to fail in general when the scalar product is replaced by a bounded trilinear functional, see for example [59]. However, in [10, 11] R. Blei describes a family of multilinear functionals for which this inequality remains valid. We present this result in what follows. Consider a positive integer m and a covering sequence U = (S_1, . . . , S_N) of nonempty subsets of {1, . . . , m}. The functional ν_{U,θ} will satisfy a Grothendieck inequality under some assumptions on the covering sequence U and on θ. Following the notation of [11], we denote, for 1 ≤ j ≤ m, by k_j the incidence of j in the covering sequence U, that is, the number of sets S_i that contain j, and by I_U := min_{1≤j≤m} k_j the minimal incidence. Moreover, we say that the mapping θ belongs to the space Ṽ_U(A^m) if there exist a probability space (Ω̃, Ã, µ̃) and a family of functions g such that the integral representation (3.6) holds. The norm ‖θ‖_{Ṽ_U(A^m)} is the infimum of the left-hand side of (3.6) over all possible representations. The following generalisation of the Grothendieck inequality corresponds to Theorem 11.11 and Section 12.4 in [11].
Theorem 3.2. Suppose that I U ≥ 2 and that θ ∈ Ṽ U (A m ). Then there exists a positive constant K U , depending only on the covering U, such that for any finitely supported scalar n-array a j 1 ...j N , In this paper we will use this result with A = Z, m = 2, N = 3 and θ(x) = 1, which is trivially an element of Ṽ U (Z 2 ). Let us now show how this inequality can be applied to bound the term C n in (3.3).
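As a concrete sanity check of the classical bilinear Grothendieck inequality (not of Blei's multilinear extension), the sketch below brute-forces the sign supremum for one small, arbitrarily chosen matrix and samples random unit vectors to lower-estimate the vector supremum; the only external input is the known bound K_G ≤ 1.783 on the real Grothendieck constant. The matrix entries are illustrative, not taken from the paper.

```python
import itertools, math, random

# Check numerically that sup_{unit vectors} |sum a_jk <x_j, y_k>|
# <= K_G * sup_{signs} |sum a_jk s_j t_k| for one fixed 4x4 matrix:
# brute-force the sign supremum, sample unit vectors for a lower
# estimate of the vector supremum.  Entries below are illustrative.

K_G = 1.783  # known upper bound on the real Grothendieck constant

a = [[1.0, -0.5, 0.3, 0.0],
     [0.2, 0.7, -1.1, 0.4],
     [-0.3, 0.5, 0.6, -0.2],
     [0.9, -0.4, 0.1, 0.8]]
n = len(a)

def sign_sup(mat):
    # exact supremum over s, t in {-1, +1}^n (2^n x 2^n combinations)
    best = 0.0
    for s in itertools.product((-1, 1), repeat=n):
        for t in itertools.product((-1, 1), repeat=n):
            v = abs(sum(mat[j][k] * s[j] * t[k]
                        for j in range(n) for k in range(n)))
            best = max(best, v)
    return best

def rand_unit(d, rng):
    v = [rng.gauss(0, 1) for _ in range(d)]
    nrm = math.sqrt(sum(c * c for c in v))
    return [c / nrm for c in v]

rng = random.Random(0)
sup_signs = sign_sup(a)
sup_vectors_est = 0.0   # sampled lower bound on the vector supremum
for _ in range(2000):
    xs = [rand_unit(n, rng) for _ in range(n)]
    ys = [rand_unit(n, rng) for _ in range(n)]
    v = abs(sum(a[j][k] * sum(xj * yk for xj, yk in zip(xs[j], ys[k]))
                for j in range(n) for k in range(n)))
    sup_vectors_est = max(sup_vectors_est, v)

# any sampled vector value is below K_G times the sign supremum
assert sup_vectors_est <= K_G * sup_signs
```

Since the sampled quantity can only underestimate the true vector supremum, the assertion is guaranteed by the inequality itself; the sketch is a consistency check, not a proof.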
A remark on notation: as ξ̃ ij ξ̃ ik encodes the presence of both (directed) edges i → j and i → k in the graph, it is natural to label the corresponding quantity by the local tree (where each edge stands for a directed edge i → j). Every similar quantity in the following will be labeled according to this principle. Therefore, define S n and S̄ n ∶= sup r,s,t∈{±1} n (1/n 3 ) ∑ n i,j,k=1 ξ̃ ij ξ̃ ik r i s j t k . Then there exists a constant C Γ , depending only on Γ, such that for n large enough Proof. Let C n,1 and C n,2 be such that ⟨C n , f ⟩ = ⟨C n,1 , f ⟩ + ⟨C n,2 , f ⟩; the first term C n,1 is given by and the second term C n,2 by Let us focus on ⟨C n,1 t , f ⟩. The point is to establish a bound on ⟨C n,1 t , f ⟩ that only depends on T , some norm of Γ, the H 3 (T 2 )-norm of f , and the underlying graph ξ (n) , using Theorem 3.2. To simplify notation, let Ξ ijk ∶= ξ̃ ij ξ̃ ik for every choice of i, j and k. Then Let (e a ) a∈Z be the canonical basis of L 2 (T): observe that ∂ θ 1 f and Γ can be rewritten in this basis. Moreover, define y 1,i , y 2,j and y 3,k by (3.11) With the previous notation, the term ⟨C n,1 t , f ⟩ can be decomposed as follows: In order to apply Theorem 3.2, we replace the factors defined in (3.11) by ℓ 2 -summable elements. For some δ > 0 define the following functions: for every a, b ∈ Z. By construction, the ℓ 2 (Z)-norms of x 2,j and x 3,k are equal to 1. Moreover, the ℓ 2 (Z 2 )-norm of x 1,i can be bounded by where the constant C Γ,δ only depends on Γ and δ. By the definition of the fractional H s -norm, for 0 < s < 1, one has that .
We can bound the previous norm by a fractional Hilbert norm on T 2 , no longer dependent on the value of θ i,n s , see Lemma A.1. Thus, it holds that . This last expression is further bounded by C ′ f (⋅, ⋅) H 2+1/2+2δ ( dθ 1 , dθ 2 ) because of Sobolev's embeddings. Choosing δ = 1/4, we conclude that there exists a constant C Γ , depending only on Γ, such that We are now able to apply Theorem 3.2.
Taking the supremum in t ∈ [0, T ] and in f ∈ H 3 (T 2 ), we finally obtain Using similar arguments, one can show that which concludes the proof.
Other controls on similar quantities have been gathered in Appendix B.

Concentration estimates. Recall the definition of the ξ̃ (n) ij in (1.11), and the definitions of S n and S̄ n given in (3.7) and (3.8) respectively. In view of Proposition 3.3, our aim is to obtain a bound on these two terms, as well as others, which will be of constant use in the paper: (3.14) Proof. Let us first prove (3.19). Relying on Bernstein's inequality and on a union bound, we obtain Thus, the choice t = c/√(np n ) leads to which is summable for c = 3 under the hypothesis np n → ∞ as n → ∞. Let us now prove (3.20) for T = . The proof for the other cases is the same, up to a transposition of the matrix. Remark that, considering A the matrix (ξ ij ) 1≤i,j≤n and S the diagonal matrix with values (s i ) 1≤i≤n on the diagonal, we have, denoting by ⋅ the operator norm of matrices, Consider the matrix A multiplied by p n : its coefficients are bounded by one. It is well known (see for example Corollary 2.3.5 in [67]) that there exist absolute constants c and C such that, for all λ ≥ C, Recalling the previous inequality, this means that which is summable. The bound (3.21) can be treated in the same way. Let us fix l ∈ {1, 2}, T = , and define moreover R̃ to be the diagonal matrix with diagonal values given by where we used the rough bound R ≤ 1/p n . We can then proceed as above. The proof for T = is similar.
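The Bernstein step in the proof above can be illustrated numerically. The following Monte-Carlo sketch (with the illustrative values n = 10⁴ and p_n = 0.1, which are not taken from the paper) checks that the empirical mean of centered Bernoulli variables stays within a few standard deviations √(p_n(1−p_n)/n), consistent with the 1/√(np_n) rate obtained after dividing by p_n.

```python
import math, random

# Monte-Carlo sanity check (illustrative, not part of the proof) of the
# Bernstein-type concentration used above: for centered Bernoulli
# variables xi_tilde_j = xi_j - p_n, the empirical mean
# (1/n) sum_j xi_tilde_j is of order sqrt(p_n(1-p_n)/n).

rng = random.Random(42)
n, p_n = 10_000, 0.1
sd_mean = math.sqrt(p_n * (1 - p_n) / n)  # std deviation of the mean

max_dev = 0.0
for _ in range(200):
    mean = sum((1 if rng.random() < p_n else 0) - p_n
               for _ in range(n)) / n
    max_dev = max(max_dev, abs(mean))

# deviations stay within a few standard deviations, over all trials
assert 0 < max_dev < 6 * sd_mean
```

The 6-standard-deviation margin is deliberately loose so the check is robust to the choice of seed.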
We will consider at many places in the paper various (possibly weighted) empirical means involving the centered variables ξ̃ (n) ij . We refer to Appendix B, where the definitions and the corresponding asymptotics for these quantities have been gathered.

Proofs concerning the global fluctuations
In this section, we will refer to two linear forms and their continuity properties.
Lemma 4.1. Let θ, θ ′ ∈ T be fixed. The following linear forms Proof. We have for all regular f and θ, θ ′ ∈ T, where the last inequality is due to Sobolev embedding, see, e.g., [

Regularity and semimartingale representations.
We present the stochastic differential equations satisfied by µ n , η n andη n , and aim in particular at proving Proposition 3.1. We define S ∶= (S t ) t 0 as the analytic semigroup associated to the Laplacian operator. We present here a well-known argument ( [4,26]) concerning the regularity of (S t ), that we will employ at multiple steps.
Lemma 4.2. Let r > d 2 and k r. Then, there exists ε > 0 such that the following conditions hold: continuously and there exists C, only depending on ε, such that for every µ ∈ P(T d ) Proof. The first statement is a consequence of Sobolev's inequalities. The second statement comes from the regularity of the semigroup (see for example [34]).
We start with the semimartingale representation of µ n .
where ν̄ n t is given by (1.9). The noise term M n t in (4.1) is defined by Let r > 1/2. Then, µ n satisfies the following weak-mild equation: for any h ∈ H r and t ∈ [0, T ], Proof. The proof is based on Itô's formula. Consider a regular function f = f (θ); then Observe that d⟨θ i,n ⟩ t = dt for every i = 1, . . . , n. By summing over i and integrating the previous expression with respect to time, one obtains that, writing Finally, observe that the last expression is equivalent to (4.1) since The second part of the proposition follows from Itô's formula applied to the test function f (θ, s) = (S t−s h)(θ) and by using the fact that ∂ s S t−s h = − 1/2 ΔS t−s h. Because of Sobolev inequalities, recall (1.13), P(T) ⊂ H −r continuously for any r > 1/2: any bracket ⟨⋅, ⋅⟩ −r,r is indeed the action of an element of H −r against an element of H r . Well-posedness of m n given in (4.3) as an element of H −r has already been addressed in the literature, see, e.g., [6,4,16]; we will give another proof in Proposition 4.8.
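The semimartingale representation above is driven by trajectories of the particle system (1.1). A minimal Euler-Maruyama sketch of (1.1) can make the object concrete; the kernel Γ(θ, θ′) = −sin(θ − θ′) is borrowed from the example of Section 4.5, and all numerical parameters (n = 50, p_n = 0.5, step dt = 0.01) are illustrative assumptions, not values from the paper.

```python
import math, random

# Euler-Maruyama sketch of the particle system (1.1) on an asymmetric
# Erdos-Renyi graph, with the Kuramoto-type kernel
# Gamma(t1, t2) = -sin(t1 - t2) of Section 4.5.  Parameters illustrative.

rng = random.Random(1)
n, p_n, dt, n_steps = 50, 0.5, 0.01, 100
TWO_PI = 2 * math.pi

# adjacency xi[i][j] ~ Bernoulli(p_n), independent (directed edges)
xi = [[1 if rng.random() < p_n else 0 for _ in range(n)] for _ in range(n)]
theta = [rng.uniform(0, TWO_PI) for _ in range(n)]  # i.i.d. initial data

for _ in range(n_steps):
    # drift of particle i: (1/(n p_n)) sum_j xi_ij Gamma(theta_i, theta_j)
    drift = [sum(xi[i][j] * (-math.sin(theta[i] - theta[j]))
                 for j in range(n)) / (n * p_n) for i in range(n)]
    # Euler step plus Brownian increment sqrt(dt) * N(0, 1), mod 2*pi
    theta = [(theta[i] + drift[i] * dt + math.sqrt(dt) * rng.gauss(0, 1))
             % TWO_PI for i in range(n)]

# the empirical measure remains a probability measure on the torus
assert all(0 <= t < TWO_PI for t in theta)
```

The empirical measure µ n of the text is then the uniform measure on the final configuration `theta`.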
Using Lemma 4.3 as well as the limit PDE (1.4), we can write down an equation for η n as defined in (1.6).
Moreover, η n satisfies the following weak semimartingale representation: for every regular test function f (recall the definition of L (1) given in (2.13)), (4.5) where η̃ n t is given in (1.10) and where W n t is defined by Let r > 1/2. The process η n satisfies the following weak-mild equation: for every h ∈ H r (T) and t ∈ [0, T ]: Proof. We first address the continuity of the process η n : for θ, θ ′ ∈ T and ϕ a regular test function, by Sobolev embedding, for any r > 3/2. This means that Now, a.s. for all n ≥ 1, t ↦ (θ 1,n t , . . . , θ n,n t ) is continuous from [0, T ] to T n . In particular, for all i = 1, . . . , n, and for all t and t m → t, the above quantity goes to 0 as t m → t. As a conclusion, t ↦ µ n t is a.s. continuous from [0, T ] to H −r for r > 3/2. Classical results [25,64] ensure that µ solves the Fokker-Planck equation (1.4), i.e., and that it is an element of C([0, T ], H −r ) (see for example [60]). As a conclusion, η n is a.s. continuous in H −r (T). Equation (4.5) (respectively (4.7)) is derived by subtracting the representations (resp. the weak-mild formulations) satisfied by µ n and µ, and by multiplying everything by √ n. Observe that W n in (4.6) is nothing but M n in (4.2) multiplied by √ n, and similarly for w n and m n . Well-posedness of w n as an element of H −r is postponed to Proposition 4.8.
Recall the definition of the noise term W n t in (4.6). Proof. This proof follows closely the arguments given in [25, Prop. 4.1]. Let (ϕ p ) p≥1 be an orthonormal system in H r . Let us first prove that By Doob's inequality, for the constant c r−1 > 0 given in Lemma 4.1, which gives (4.9). We now prove that the trajectories of W n are almost surely continuous in H −r . By (4.9), the series ∑ p≥1 sup t∈[0,T ] W n t (ϕ p ) 2 is a.s. convergent, and hence, for fixed n and all ε > 0, there exists p 0 ≥ 1 sufficiently large so that The last summand is smaller than 4ε/6 and the first one can be made smaller than ε/3 by the a.s. continuity of t ↦ ⟨W n t , ϕ⟩ for all ϕ. Hence, a.s. W n ∈ C ([0, T ], H −r ). The expression of ⟪W n ⟫ in (3.4) follows directly from Itô's isometry, and (4.8) is a direct consequence of the stronger statement (4.9).
The supplementary term in the drift in the semimartingale representation (4.5) can be expressed in terms of the higher order fluctuation process η̃ n defined in (1.10). The idea is to proceed further and write the semimartingale decomposition of η̃ n . For 1 ≤ i ≤ n, define the measures μ̂ (4.10) Lemma 4.6. Suppose Assumption 2.1. The process (η̃ n ) defined in (1.10) belongs P ⊗ P-a.s. to C([0, T ], H −r ′ (T 2 )) for any r ′ > 5/2 and satisfies the following weak semimartingale decomposition: for all regular test functions (θ 1 , θ 2 ) ↦ g(θ 1 , θ 2 ) (recall the definition of L (2) and C n given respectively in (2.14) and Let r > 3/2. The process η̃ n satisfies the following weak-mild equation: for every h ∈ H r (T 2 ) and t ∈ [0, T ]: with, recall definition (4.10), Λ n s and Λ̂ n s,ij respectively given by Proof. The proof of continuity of the trajectories of η̃ n follows the same argument as in the proof of Lemma 4.4, as for any n ≥ 1, η̃ n is a finite weighted sum of Dirac measures δ (θ i,n t ,θ j,n t ) . One has for any (θ 1 , θ 2 ), (θ ′ 1 , θ ′ 2 ) ∈ T 2 , any regular test function ψ on T 2 , for any r > 5/2, which proves the desired continuity, with the same arguments as for Lemma 4.4. The semimartingale representation (4.11) is derived like that of Lemma 4.3, but with Itô's formula applied to test functions of two variables, i.e., to g(θ i,n t , θ j,n t ). The second part of the proposition follows again from Itô's formula, now with test functions g(θ, θ ′ , s) = (S t−s h)(θ, θ ′ ), and by using the fact that ∂ s S t−s h = − 1/2 ΔS t−s h. The choice of r > 3/2 and Sobolev inequalities, recall (1.13), ensure that any bracket ⟨⋅, ⋅⟩ −r,r makes sense as the action of an element of H −r (T 2 ) against an element of H r (T 2 ). Well-posedness of m n given in (4.3) is given in Proposition 4.10. Moreover, Ŵ n satisfies P-almost surely Proof. Once again, we follow a very similar approach as in [25].
Recall from (4.12) that we may write so that it is sufficient to prove (4.17) for Ŵ n,1 t and Ŵ n,2 t separately. Let (ϕ p ) p≥1 be an orthonormal system in H r (T 2 ). We have that by Doob's inequality. This last quantity is further bounded by with the notation of Lemma B.2 (with k = r). Doing the same for Ŵ n,2 t requires controlling the term c n (Φ, I d , θ n s ), and hence we obtain directly (4.17) from (B.9) and (3.20). The continuity of the trajectories of Ŵ n follows from the very same argument previously used in Lemma 4.5.
We now proceed further with moment estimates. Note that from here on, our proof differs significantly from the line of proof followed in [25]. One could see Propositions 4.8 and 4.10 as equivalents of [25, Prop. 3.5], where a similar estimate on the fluctuation process is proven. Note however that the proof of [25, Prop. 3.5] uses in an essential way the exchangeability of the particles as well as the fact that the initial conditions of (1.1) are i.i.d., which is no longer the case here. We circumvent this difficulty by taking advantage of the mild formulations (4.7) and (4.13) and the regularising properties of the heat kernel. By this method (see (4.21) and Remark 4.9), the control we have on the moments of η n and η̃ n is weaker than in [25, Prop. 3.5], in the sense that one needs to have α ∈ [0, 1) in (4.18) and (4.22), whereas the same estimate is proven directly for α = 1 in [25, Prop. 3.5]. Note finally that a stronger uniform-in-time control E[sup t∈[0,T ] η n t 2 −r ] is proven in [25, Prop. 4.3], but a careful reading shows that this estimate is only used in [25] to prove the continuity of the fluctuation process, for which we have provided an alternative proof in Lemma 4.4. Recall that E [⋅] stands for the expectation w.r.t. the noise only; the results below hold for a fixed realisation of the graph and initial condition. In (4.7), taking the supremum with respect to h in H r such that h r ≤ 1, one can write:

By using the classical inequality
Observe that, similarly to the proof of [4, Proposition 2.2], one has that where we have used the fact that Γ * η n s W r,∞ ≤ C ′ η n s −r given Assumption 2.1. Note that, as pointed out in Lemma 4.2, uniformly in s and n, ⟨µ n s , h⟩ ≤ h ∞ ≤ C h r , since µ n s is a probability measure. Hence, by Assumption 2.1, there is a constant C > 0 independent of s and n such that In turn, this would mean that, using Proposition 3.5, for any t ∈ [0, T ] and P-a.s.
However this is not enough when one considers p n converging to zero. To tackle the interesting case p n → 0, we need to take advantage of the representation of g n t (h) througĥ η n , recall (4.4). Observe that, using (4.20), we have Taking the expectation E in the inequality (4.19) yields One can then apply a version of Gronwall-Henry's inequality (see for example [31], Lemma 5.2), and obtain that Concerning the noise term, we have, as and for (ϕ p ) p 0 an orthonormal system in H r (T), Remark 4.9. We cannot use here the fact that ∂ θ S t−s −r C(1 + 1 √ t − s) because (t − s) −1 is not integrable. We follow arguments similar to [26,Lemma 11]: by exploiting Sobolev embeddings, fractional operators and the fact that there exists ε > 0 such that r − ε > 1 2, we can conclude that where we rely on Lemma 4.2.
Finally, gathering all these estimates, we deduce that there exists C > 0, depending on T , such that sup Putting every estimate together, we end up with (4.18).  The noise term (B) can be treated similarly toŴ n , recall Proposition 4.7, using the same arguments of w n , recall Proposition 4.8. In particular, for an orthonormal basis Φ = (ϕ p ) p 1 of H r one has that where the definitions of c T n (Φ, S t−s , u) are given in Lemma B.2 (taking in particular k = 3+ε if ε > 0 is such that r −2ε > 2, so that k −r = 1−ε). Since S t−s L(H r ,H k ) C 1 + Elevating everything to the power 1 + α (recall that α < 1) and taking the expectation gives that The expression within the integral in the last term (C) in (4.23), is the sum √ ng jk ∂ θ 2 S t−s h(θ i,n s , θ j,n s ) Γ(θ j,n s , θ k,n s ).
These terms are very similar to the term C n t defined in (3.3), and can be treated similarly, as done in Proposition 3.3, relying moreover on the regularity of the semigroup S t−s . More precisely, one can follow the steps of the proof of Proposition 3.3, only replacing f with S t−s g, the bound given in (3.12) becoming then which leads to the bounds, for some constant C > 0 independent of s < t ≤ T, g n,(1) Integrating w.r.t. s, raising to the power 1 + α and taking the expectation finally gives that E [(C)] ≤ C ( √ n p 2 n ) −(1+α) , P-a.s. Finally, with another application of the Gronwall-Henry inequality, one obtains that the process η̃ n satisfies P-a.s.
This concludes the proof.
To prove Proposition 3.1 it remains to give some regularity results for the operators L (1) ν and L (2) ν : L (1) ν (resp. L (2) ν ) is continuous from H −(r+2) (T) to H −r (T) (resp. from H −(r+2) (T 2 ) to H −r (T 2 )): there exist two positive constants C 1,T and C 2,T such that for all test functions f (θ) and g(θ 1 , θ 2 ), Proof. This is a straightforward consequence of the definitions (2.13) and (2.14), Assumption 2.1 and the fact that ν is a probability measure. See [25, Lemma 3.7] for a very similar proof. The integral ds is P ⊗ P almost-surely finite (recall the definition of r 0 , r 1 in (2.7)), so that this drift term makes sense as a Bochner integral. It suffices now to gather this estimate and Lemma 4.5 to conclude. The same argument works also for η̃ n .

4.2.
Tightness results. We use the following tightness criterion [36, pp. 34-35]: a sequence of (Ω n , F n t )-adapted processes (Y n ) n≥1 with paths in C([0, T ], H), where H is a Hilbert space, is tight if both of the following conditions hold: (1) For every t in some dense subset of [0, T ], the law of Y n t is tight in H; (2) (Aldous condition) For every ε 1 , ε 2 > 0, there exist δ > 0 and n 0 ≥ 1 such that for every (F n t )-stopping time τ n ≤ T , sup Remark 4.12. Suppose that there exists a Hilbert space H 0 such that the injection H 0 → H is compact and such that there exists m ≥ 1 such that for fixed t ∈ [0, T ], we have sup n E Y n t m H 0 < +∞. Then, for this t, Condition (1) above is satisfied. Indeed, for any R > 0, B R ∶= { h ∈ H 0 , h H 0 ≤ R } is compact in H, and, by Markov's inequality, which goes to 0 as R → ∞, uniformly in n. Proof. We deal first with the process η̃ n . Recall the definition of (r 0 , r 1 ) in (2.7). We apply the above tightness criterion in the case H 0 = H −r 0 (T 2 ) and H = H −r 1 (T 2 ). The second term (4.27) is treated using Proposition 3.3: since r 1 > 5/3, which goes to 0 as n → ∞, by Proposition 3.3, uniformly in θ ≤ δ. The last term (4.28) follows from a similar argument, using now Proposition 4.7. We have the rough bound Since r 1 ≥ 4, we have by Proposition 4.7 that this term also goes to 0 as n → ∞. Putting all the previous estimates together, we obtain that the Aldous criterion is verified, and hence the tightness of η̃ n in C([0, T ], H −r 1 (T 2 )). Now turn to the tightness of η n , which follows from very similar arguments. Point (1) of the tightness criterion follows directly from (4.18) and (4.22) (for r = r 0 ) and Remark 4.12. In a similar way, for the Aldous criterion, The first term is treated in the exact same way as (4.26) above and we leave the details to the reader.
Similarly, we have where we used ∂ θ 1 (Γ * η n s ) −r 1 ∂ θ 1 (Γ * η n s ) −(r 1 −1) C η n s −(r 1 −2) = C η n s −r 0 for a constant C > 0 independent of s. Recall now that r 0 > 3 so that (4.22) holds. We finally turn to the noise term W n whose treatment differs slightly from the previous calculations: we reproduce here (without giving all the details) some parts of the calculation made in [36, p. 40] (as part of the proof of the Rebolledo's theorem): we have, for all a > 0 where ⟪W n ⟫ is the Doob-Meyer process associated to W n given by (3.4). Choosing here a ε 2 1 ε 2 2, we see that it suffices to control where (ψ p ) is a complete orthonormal system in H r 1 and using Lemma 4.1. This gives the result.

4.3.
Convergence. The first result of this paragraph concerns the identification of the limit of the noise term: Proof. Tightness of (W n ) follows from (4.8) and the same calculations as at the end of the proof of Theorem 4.13 (with H −r in place of H −r 1 ). Identification of the limit is a simple consequence of (3.4) and the weak convergence (2.4) of the empirical measure µ n (see also [25, Th. 5.2] for a similar proof). Proof. This proof follows the classical arguments of [25] with adequate technical changes. Consider such a convergent subsequence, that we rename (η n , η̃ n ) for convenience, and let (η, η̃) be its limit in C([0, T ], H −r 1 (T) ⊕ H −r 1 (T 2 )). We prove that (η, η̃) solves (2.16). From Assumption 2.5, (4.18) and (4.22), we deduce (replace it by the same estimate with some additional E g in case of Assumption 2.6) that P g -a.s., µs (g)⟩ ds for all test functions g ∈ H r 2 (T 2 ). Decompose the previous quantity into, for F g (α) t ∶= ⟨α t , g⟩ − ⟨α 0 , g⟩ − ∫ t 0 ⟨α s , L (2) µs (g)⟩ ds, It is easy to see that, for fixed g, the map α ↦ F g (α) is continuous, so that, since (η̃ n ) converges in law as n → ∞ to η̃ in C([0, T ], H −r 1 (T 2 )), we have that F g (η̃ n ) converges in law to F g (η̃) as n → ∞. It remains to prove that µs (g)⟩ ds converges in law to 0. We prove that it goes to 0 in L 1 : we have ds .
: by definition of L (2) (recall (2.14)), it suffices to estimate ∂ θ 1 g ⟨(µ n s − µ s )(dθ ′ ) , Γ (⋅, θ ′ )⟩ r 1 (the other term being treated analogously). Since Γ is regular with bounded derivatives, we have the following bound (recall the definition of d BL in (1.8)), where the constant C above is uniform in s and 0 ≤ j ≤ k ≤ r 1 . Hence, we obtain, for another numerical constant C̃ > 0, Going back to (4.29), we obtain by Hölder's inequality and using (4.22), The proof of Proposition 4.16, which relies heavily on classical estimates (see e.g. [37,38,51,46] for similar techniques), is postponed to Appendix D.

Particular cases.
The point of this paragraph is to prove the results of Section 2.5.
Proof of Proposition 2.11. Recall that we suppose Assumption 2.10 and that for any test function g Considering (e p ) p∈Z the canonical orthonormal basis of L 2 (T), the family defined by ψ p,q = (1 + p 2 + q 2 ) −r/2 e p ⊗ e q , p, q ∈ Z, constitutes an orthonormal basis of H −r (T 2 ). Then The term we get the decomposition Since the random variables V ij = 1 So, choosing t = p n −3/2 n −1/2+δ for δ small and remarking that ∑ n i,j=1 1−pn for n large enough, we obtain, for some positive constant c, Now, relying on the decoupling inequality provided by [58], we get that for some constant C and for ξ̃ Consider the filtration F l,m = σ (ξ (n,1) , ξ̃ ). Denoting , this defines a martingale which satisfies Y n,n = 1/n 3 ∑ i,j,k,l∈{1,...,n},(i,j)≠(k,l) ξ̃ (n,1) ij ξ̃ (n,2) kl H i,j,k,l . It satisfies moreover X k,l ξ (n,1) , ξ̃ ≤ C r /(p n n 2 ) + tC r , which means that, taking t = p n n −1/2+δ , we obtain P ( Y n,n > t) ≤ n 2 e −cn 2δ .

4.5.
A case where the initial condition depends on the graph. The aim is to construct an example of an initial condition that depends on the graph and for which η̃ n 0 has a nontrivial limit in distribution. We place ourselves in the case where Γ is given by Γ(θ, θ ′ ) = − sin(θ − θ ′ ) and the graph sequence is a symmetric Erdős-Rényi graph with p = 1/2. In particular, the variables satisfy ξ ij = ξ ji for every 1 ≤ i, j ≤ n. We further suppose that they do not depend on n, i.e., ξ (n) ij = ξ ij for every i, j and n. Let (G n ) n≥1 be the corresponding filtration with G n = σ ((ξ ij ) 1≤i,j≤n ).
Let the sequence (θ i 0 ) i≥1 be defined by recursion with values in {0, π/2}. We initiate the recursion by choosing θ 1 0 uniformly in {0, π/2}. Then, supposing that (θ 1 0 , . . . , θ n 0 ) are already defined (and G n -measurable), we consider the G n+1 -measurable random variables and make the choice θ n+1 0 = 0 when R n π/2 < R n 0 , θ n+1 0 = π/2 when R n 0 < R n π/2 , and θ n+1 0 = U with U chosen uniformly in {0, π/2} (independently from G n ) when R n π/2 = R n 0 . Remark that, by symmetry of the laws of R n 0 and R n π/2 (recall that p = 1/2), θ n+1 0 is a uniform random variable in {0, π/2} independent from G n . Hence, the sequence (X n ) n≥1 defined by X n = ∑ n i=1 1 {θ i 0 =0} is a symmetric simple random walk on Z. Observe that the laws of R n 0 and R n π/2 depend on G n only via X n . Conditionally on {X n = x}, R n π/2 + (n−x)/2 is a Binomial with parameters n−x and 1/2, independent of R n 0 . Since µ n 0 = (X n /n) δ 0 + ((n−X n )/n) δ π/2 , we have d BL (µ n 0 , µ 0 ) → 0 P−a.s. as n → ∞, where µ 0 = (1/2) δ 0 + (1/2) δ π/2 , while η n 0 = √ n (X n /n − 1/2) δ 0 + √ n ((n−X n )/n − 1/2) δ π/2 converges in law in H −r 1 to η 0 = Z 1 δ 0 + Z 2 δ π/2 , where (Z 1 , Z 2 ) has a centered Gaussian distribution with covariance matrix . The process η̃ n 0 can be expressed in terms of R n 0 , R n π/2 as follows: and ⟨Γ * η̃ n Let us study the convergence of these different terms. Since we have E[(R k 0 ) 2 ] ≤ n/4 and similarly Hence, it remains to study the convergence of the terms Recall that, conditionally on {X n = x} and up to a scaling factor 1/2, R n 0 and R n π/2 are (centered) Binomial random variables. Since X n /n converges almost surely to 1/2, one can apply the classical Normal approximation of the Binomial distribution and approximate R n 0 , suitably rescaled, by a centered Normal random variable. This yields where, considering two independent random variables Y 1 , Y 2 with standard Normal distribution, This means that N 1 n converges almost surely to − 1 . Via similar arguments, one can prove that N 2 n a.s.
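The recursion above is easy to simulate. In the sketch below, the comparison of R n 0 and R n π/2 is implemented with centered counts (subtracting the conditional means, consistent with the centered-Binomial description in the text); this centering and the uniform tie-break are the assumptions that make each new phase uniform, so that X n behaves as a symmetric simple random walk and X n /n → 1/2.

```python
import random

# Sketch of the graph-dependent initial condition: each new particle
# compares the centered edge counts towards the two phases {0, pi/2}
# (phase coded 0 resp. 1), with a uniform tie-break.  By symmetry
# (p = 1/2) the count X_n of phase-0 particles is a symmetric simple
# random walk, so X_n / n concentrates around 1/2.

rng = random.Random(7)
N = 2000
phases = [rng.choice((0, 1))]
c0 = 1 if phases[0] == 0 else 0   # running count of phase-0 particles

for k in range(1, N):
    edges = [rng.random() < 0.5 for _ in range(k)]  # symmetric ER, p=1/2
    r0 = sum(1 for e, ph in zip(edges, phases) if e and ph == 0)
    rpi = sum(1 for e, ph in zip(edges, phases) if e and ph == 1)
    r0c = r0 - c0 / 2          # center: E[Bin(c0, 1/2)] = c0 / 2
    rpic = rpi - (k - c0) / 2  # center: E[Bin(k - c0, 1/2)]
    if rpic < r0c:
        new = 0
    elif r0c < rpic:
        new = 1
    else:
        new = rng.choice((0, 1))   # uniform tie-break
    phases.append(new)
    c0 += 1 if new == 0 else 0

# X_n / n concentrates around 1/2 (sd is 1/(2 sqrt(N)) ~ 0.011 here)
assert abs(c0 / N - 0.5) < 0.1
```

Without the centering, a plain majority comparison would reinforce the leading phase and X n /n would not converge to 1/2; the centered rule is what restores the symmetry invoked in the text.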
Finally, we deduce that in this particular example η̃ n 0 converges in law in , while Γ * η̃ n 0 converges in law to 2

About fluctuations of local empirical measures
The purpose of the present section is to prove Theorem 2.9. Recall the definition (1.5) of the empirical measure µ n,l t of particles at distance 1 of vertex l = 1, 2 and the definition (1.12) of the empirical measure µ n,1,2 t of particles that are connected to both vertices 1 and 2. Recall that we are interested in the behavior of the joint fluctuation process (1.7) ζ n 5.1. The initial condition. We first address the convergence of the initial condition (ζ n,1 0 , ζ n,2 0 , η n 0 ). Proposition 5.1. Suppose that the (θ i,n 0 ) are i.i.d. random variables with law µ 0 , independent of the graph. Recall that p ∶= lim n→∞ p n ∈ [0, 1]. Then, if np 2 n → ∞, for all r > 1/2, and (ζ n,1 0 , ζ n,2 0 , η n 0 ) converges in law as n → ∞ in H −(r+1/2) (T) 3 to the Gaussian process (ζ 1 0 , ζ 2 0 , η 0 ) with covariance (5.3) where f i , g i , h i , i = 1, 2 are test functions on T. In particular, (ζ 1 0 , ζ 2 0 , η 0 ) are mutually independent in the diluted case p = 0. Taking the expectation (w.r.t. both graph and initial condition), we obtain Recalling that ψ p = (1 + p 2 ) −r/2 e p , this last quantity is bounded provided r > 1/2. Hence (ζ n,l 0 ) is tight in H −(r+1/2) (T) and the triplet (ζ n,1 0 , ζ n,2 0 , η n 0 ) has convergent subsequences in H −(r+1/2) (T) 3 . It suffices to identify its finite dimensional marginals: let u, v, w ∈ R and f, g, h be test functions. Define where F ξ is the σ-field generated by the variables ξ Note that, using (B.5) and (B.6), we have that P-a.s. s 2 Fix some δ > 0 and compute Applying Hölder's inequality twice (to the expectation E [⋅ ∣ F ξ ] and to the discrete mean 1/n ∑ n j=1 ), we have, This means that P-a.s., 1 s 2+δ F ξ ≤ n −δ/2 , which goes to 0 as n → ∞. Lyapunov's condition for the CLT is satisfied (see [9, eq. (27.16), p. 385]). We are in position to apply Th. 27.3, p. 385 of [9]: P-a.s., 1 s n,U ∑ n j=1 U (n) j converges in law to a standard Gaussian N (0, 1), which gives that uX (1) n (f ) + vX (2) n (g) + w ⟨η n 0 , h⟩ converges in law to N (0, s 2 U (u, v, w)), where s 2 U (u, v, w) is given by (5.5). We use the same argument for the term uY , which goes as n → ∞ to The Lyapunov condition is also verified: 1 Hence, applying the same result, we have that uY (1) n (f ) + vY (2) n (g) converges in law to some Gaussian N (0, s 2 V ), where s 2 V is given by (5.6). With these two convergence results at hand, we can now go back to (5.4): since P-a.s., uX (1) n (f ) + vX (2) n (g) + w ⟨η n 0 , h⟩ converges in law to N (0, s 2 U ), we have that P-a.s., Then one can write from (5.4), The first term above converges to exp(−(s 2 U + s 2 V )/2) and, by the dominated convergence theorem, the second term above converges to 0.
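The Lyapunov argument used above can be illustrated on a toy triangular array unrelated to the graph setting: bounded, independent, non-identically distributed summands with s n² growing linearly, so the (2+δ)-moment ratio vanishes and the standardized sums are asymptotically N(0, 1). The weights below are arbitrary illustrative choices.

```python
import math, random

# Toy illustration of the Lyapunov CLT: U_j = a_j (B_j - 1/2) with
# B_j ~ Bernoulli(1/2) independent and bounded weights a_j.  Since
# s_n^2 = sum a_j^2 / 4 grows like n while the summands are bounded,
# Lyapunov's condition holds and S_n / s_n is approximately N(0, 1).

rng = random.Random(3)
n, trials = 400, 1000
a = [1.0 + (j % 3) for j in range(n)]          # bounded, non-identical
s = math.sqrt(sum(w * w * 0.25 for w in a))    # Var(U_j) = a_j^2 / 4

samples = []
for _ in range(trials):
    total = sum(w * ((1 if rng.random() < 0.5 else 0) - 0.5) for w in a)
    samples.append(total / s)

m = sum(samples) / trials
v = sum((x - m) ** 2 for x in samples) / (trials - 1)
# standardized sums look standard normal: mean ~ 0, variance ~ 1
assert abs(m) < 0.2 and abs(v - 1.0) < 0.3
```

The tolerances are several Monte-Carlo standard errors wide, so the check is insensitive to the seed.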

Semimartingale decompositions.
We follow here the same approach as for the global fluctuation process, that is, to apply Itô's formula so as to derive a proper semimartingale decomposition for ζ n,l t , the key step being to identify the terms vanishing as n → ∞ in this decomposition (which correspond here to terms in the asymptotic development of µ n,l that are of order lower than 1/√(np n )). Since the approach is highly similar to the one followed for the global fluctuations, we only give the main lines of proof and leave the details to the reader. where U s is given in (2.20), V n,l s f (θ) ∶= ⟨µ n,l s (dθ ′ ) , ∂ θ f (θ ′ ) Γ (θ ′ , θ)⟩ is the microscopic equivalent of V s defined in (2.21) and Θ is defined by (2.12). The remaining drift term in (5.7) is given by (5.8) and the noise term is The process (W n,1 t , W n,2 t , W n t ) t∈[0,T ] is a martingale in C([0, T ], (H −r ) 3 ) for r > 3/2, with Doob-Meyer process given for t ∈ [0, T ] and ϕ, ψ ∈ H r by ⟪W n,l , W n,l ⟫ t ⋅ ϕ(ψ) = Main lines of proof of Proposition 5.2. First begin by applying Itô's formula to µ n,l , for l = 1, 2: for f regular, ⟨µ n,l t , f ⟩ = ⟨µ n,l 0 , f ⟩ + ∫ t 0 ⟨µ n,l s , ij ∂ θ f (θ i,n s ) Γ (θ i,n s , θ j,n s ) ds.
All of this gives (5.7), using the definition of the noise in (5.9) and the drift term n,l in (5.8). The rest of the proof follows from the same arguments as for Lemma 4.4.

5.3.
Tightness and convergence. Recall Proposition 2.11: under our hypotheses η̃ n converges to η̃ ≡ 0 as n → ∞, so that the term ∫ t 0 √ p n Θ * η̃ n s ds in (5.7) does not contribute to the limit when n → ∞. It remains to deal with the term ∫ t 0 Θ * n,l s ds, which we also want to prove vanishes as n → ∞. This is the purpose of the following proposition: Main lines of proof of Proposition 5.3. We follow the same strategy as for η̃ n : write a semimartingale decomposition for n,l , prove tightness of this process and identify its limit as the unique solution to a linear SPDE with noise and initial condition identically zero, so that, by uniqueness, lim n→∞ n,l ≡ 0. We only draw the main lines of proof here.
Another application of the Grothendieck inequality gives that the remaining terms in the last two sums are controlled in H −r 1 (T 2 ) by √ np n S n (l) defined in (3.17) and √ np n S̄ n (l) defined in (3.18) respectively. Hence, by Proposition 3.5, this term is of order which goes to 0 as n → ∞. Secondly, the noise term in (5.15), which we denote by W n, t (g), is again controlled as follows, and by the same argument as above, the H −r (T 2 )-norm of W n, (for r > 3) is of order 1/(np 3 n ), which goes to 0 as n → ∞, so that the noise term (5.15) vanishes as n → ∞. Finally, we turn to the initial condition: By the same arguments as before, one can prove that the process ( n,l t ) is tight in H −r 1 (T 2 ) and converges as n → ∞ to the unique solution in H −r 2 (T 2 ) to T ] is independent of the initial condition (ζ 1 0 , ζ 2 0 , η 0 ) given in Proposition 5.1.

5.5.
Proof of the main convergence result. We now turn to the proof of Theorem 2.9: Main lines of proof of Theorem 2.9. Putting all the previous estimates into the semimartingale decomposition (5.7) and applying the same arguments as for the process η n (using in particular the weak-mild formulation (5.11), which leads to estimates similar to those of Proposition 4.8), we see that ζ n,l is tight in H −r 1 (T) and that, almost surely w.r.t. the randomness of the graph, the joint process (ζ n,1 , ζ n,2 , η n ) converges in C([0, T ], (H −r 1 (T)) 3 ) towards (ζ 1 , ζ 2 , η), the solution to which is nothing else than the weak formulation of (2.19). Note that we use here the convergence (2.5) to identify the limit, as in the proof of Proposition 4.15. Uniqueness of the solution to (5.7) follows from the same arguments as in Section D.
Lemma A.1. There exists a constant C > 0 such that, for any regular function v ∶ T × T → R, it holds that sup . In particular, By integrating the previous expression with respect to θ 1 , one obtains which implies . By using [1, Theorem 5.4, Part I, Case (B)], there exists a universal constant C > 0 such that sup Observe that sup θ 1 ∈T u(θ 1 ) = sup θ 1 ∈T v(θ 1 , ⋅) 2 L 2 ( dθ 2 ) ; the proof is concluded with the constant given by 2C.

Appendix B. More on Grothendieck inequalities and concentration estimates
We gather in this Section the definitions of some auxiliary (possibly weighted) empirical mean values concerning the centered variables ξ̃ . The subscript 1 (resp. 2) stands for the fact that U n,1 and V n,1 (resp. U n,2 and V n,2 ) are of order 1 (resp. order 2), in the sense that one sums over j only (resp. over both i and j). Moreover, under np n → +∞ as n → ∞, for all ε ∈ (0, 1/2), Finally, suppose that p n ≥ c/n 1−δ for some δ > 0 (as n → ∞). For l ∈ {1, 2}, for any ε Proof of Lemma B.1. We first prove (B.5). By Bernstein's inequality, since ξ (n) lj v j ≤ 1/p n . Choosing, for ε ∈ (0, 1/2), t = n 1/2+ε p n ε−1/2 , we obtain that P ( 1/n ∑ n j=1 ξ̃ (n) ) Since np n → ∞, for n large we have 1/(3(np n ) 1/2−ε ) ≤ 1, so that the previous quantity is further bounded by 2 exp(− (np n ) 2ε /4), which is summable under the assumptions of the present lemma. To prove (B.6), we apply the same Bernstein inequality to the sequence of independent variables ξ̃ (n) 1j ξ̃ (n) 2j v j , j = 1, . . . , n: since ξ (n) Hence, the calculation is the same as before, replacing p n by p 2 n , and the result follows from the same calculations. Estimate (B.7) is again a simple consequence of Bernstein's inequality: for all t > 0 we have . Choosing t = n 1+ε p n ε−1 , the previous bound becomes Since np n → ∞, this quantity is further bounded, for n large, by 2 exp(− n 2ε p n 2ε−1 /4). Now note that n ε p n 2ε−1 ≥ 1 when ε ∈ (0, 1/2), so that the final bound becomes 2 exp(− n ε /4). Let us now give the proof of (B.8) for l = 1. Fix (u i , v j ) such that u i ≤ 1 and v j ≤ 1 and define ; the resulting sequence is an (F i )-martingale and one has, for all k < n, ∑ a∈Z e a (θ 2 ) ∫ T ∂ θ 1 F (ϕ p )(θ 1 , θ)ē a (θ) dθ as well as, for a, b ∈ Z, 1/2 for some δ > 0. We then have, using the Cauchy-Schwarz inequality for the linear form I F,a,u (ϕ) ∶= Observe that for fixed a, a ≥ 1, u ∈ T and any l ≥ 1, ]ē a (θ)dθ + (ia) l I F,a,u (ϕ).
Taking absolute values in the previous expression and using Lemma A.1, one obtains that there exists a positive constant $C$, independent of $a$ and $u$, such that the corresponding bound holds. This means that, taking $l = \lfloor k \rfloor - 2$,
$$\sup_{u \in \mathbb{T}^n} \Vert I_{a,u} \Vert_{-r} \leq \frac{C}{\vert a \vert^{l}} \Vert F \Vert_{L(H^r, H^k)}, \quad \text{for any } a \in \mathbb{Z},\ \vert a \vert \geq 1.$$
Going back to (B.10) and choosing $\delta$ small enough, this implies the claimed estimate. Consider the initial condition (A): writing $\frac{\xi_{li}}{p_n} = \tilde\xi_{li} + 1$, we see that $\mu^{n,l}_0 = \mu^n_0 + \hat\mu^{n,l}_0$ (recall the definition of $\hat\mu^{n,l}_t$ in (4.10)) and that $\mu^{n,1,2}_0 = \mu^n_0 + \hat\mu^{n,1}_0 + \hat\mu^{n,2}_0 + \hat\mu^{n,1,2}_0$, where $\hat\mu^{n,1,2}$ is defined accordingly. Thus, setting $\hat m$ as above, we have in all cases (noting that if $f \in BL = BL(\mathbb{T})$ (recall (1.8)), then $P_{0,T} f$ is also in $BL$) a bound which is, $\mathbf{P}$-a.s., uniformly in $f \in BL$, smaller than $C \beta_n^q$. Concentrate now on the last term (D) in (C.3). Developing $\Gamma$ into Fourier series gives, for $(e_a)_{a \in \mathbb{Z}}$ the standard Fourier basis and $\theta_1, \theta_2 \in \mathbb{T}$,
$$\Gamma(\theta_1, \theta_2) = \sum_{a \in \mathbb{Z}} e_a(\theta_2) \int_{\mathbb{T}} \Gamma(\theta_1, \theta)\, \bar e_a(\theta)\, \mathrm{d}\theta.$$
Note that $\left\vert \int_{\mathbb{T}} \Gamma(\theta_1, \theta)\, \bar e_a(\theta)\, \mathrm{d}\theta \right\vert \leq \frac{C}{(1 + \vert a \vert)^r}$ for some constant $C > 0$ independent of $\theta_1$, as $\Vert \partial^r_{\theta_2} \Gamma(\theta_1, \theta_2) \Vert_\infty < +\infty$. So we obtain, by Jensen's inequality, recalling that $\Vert \partial_\theta P_{t,T} f \Vert_\infty \leq C_0$ and the definition of $S_n(\Xi)$ in (C.2), the corresponding bound. Taking the expectation on both sides, noting that, for any fixed $a \in \mathbb{Z}$, $e_a$ is bounded and Lipschitz with constant $\vert a \vert$, and choosing $r = q + 2$, we deduce finally that there is another constant $C > 0$ such that the estimate holds. Taking the expectation $\mathbf{E}[\cdot]$ in (C.3), we obtain, for all $f \in BL$ and some constant $C$, the bound (C.5). Specify first the analysis to the case $m^n = \mu^n$: recalling that $\hat m^n_0 \equiv 0$ and $S_n(\Xi) = 1$ in this case, one obtains
$$\mathbf{E}\left[ \sup_{s \leq T} \vert \langle \mu^n_s - \mu_s, f \rangle \vert^q \right] \leq C \left( 2^{q-1} d_{BL}(\mu^n_0, \mu_0)^q + \beta_n^q + (\gamma_n S_n)^q + \cdots \right).$$
Had we been able to put the supremum over $f \in BL$ inside the expectation on the left-hand side of (C.5), the result would follow from a simple Grönwall argument. To bypass this difficulty, we proceed by a compactness argument, which will be useful not only for $m^n = \mu^n$ but also in the other cases, so we write it for a general $m^n$: the set $BL$ is compact, by the Ascoli--Arzelà theorem. Thus, for all $\varepsilon > 0$, there exist $f_1, \ldots, f_k \in BL$ such that, for all $f \in BL$, there exists $j \in \{1, \ldots, k\}$ with $\sup_{\theta \in \mathbb{T}} \vert f(\theta) - f_j(\theta) \vert \leq \varepsilon$.
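The polynomial decay of the Fourier coefficients of a smooth kernel, used above for term (D), can be checked numerically. The sketch below uses a stand-in smooth $2\pi$-periodic kernel (not the paper's $\Gamma$, and the choice $r = 4$ is illustrative): its Fourier coefficients decay geometrically, so $\vert c_a \vert (1 + \vert a \vert)^r$ stays bounded.

```python
import cmath
import math

# Illustration only: for a smooth 2*pi-periodic kernel, the Fourier
# coefficients c_a = \int Gamma(theta1, theta) conj(e_a(theta)) dtheta
# decay faster than any power of |a|, which yields the C / (1+|a|)^r
# bound used for term (D).  The kernel below is a stand-in.

def fourier_coeff(f, a, m=4096):
    # rectangle rule; spectrally accurate for smooth periodic integrands
    return sum(f(2 * math.pi * k / m) * cmath.exp(-1j * a * 2 * math.pi * k / m)
               for k in range(m)) * (2 * math.pi / m)

theta1 = 0.7                                   # fixed first argument

def gamma(th):
    # smooth stand-in kernel in theta2; analytic, so geometric decay
    return 1.0 / (2.0 - math.cos(th - theta1))

r = 4
bound = max(abs(fourier_coeff(gamma, a)) * (1 + abs(a)) ** r
            for a in range(0, 31))
print(bound)                                   # stays bounded in a
```

For this kernel, $\vert c_a \vert = \frac{2\pi}{\sqrt 3}(2-\sqrt 3)^{\vert a\vert}$, so the weighted coefficients are uniformly bounded, in line with the estimate in the proof.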
Take now $f \in BL$ and $n \geq 1$. Note that, since $\mu_s$ is a probability measure, $\sup_{s \leq T} \vert \langle \mu_s, f - f_j \rangle \vert^q \leq \varepsilon^q \leq \varepsilon$ for $\varepsilon \leq 1$. In the same way, $\sup_{s \leq T} \vert \langle m^n_s, f - f_j \rangle \vert^q \leq S_n(\Xi)^q\, \varepsilon$, $\mathbf{P}_g$-a.s. (and this $\mathbf{P}_g$-a.s. set does not depend on $f$, $f_j$, nor $\varepsilon$). Hence, there is a universal constant $C > 0$ such that, for any $t \in [0, T]$, the bound (C.6) holds. Applying (C.6) to the right-hand side of (C.5) (with $m^n = \mu^n$), then taking $f = f_j$ and finally the maximum over $j = 1, \ldots, k$ in (C.4), we obtain
$$\max_{j=1,\ldots,k} \mathbf{E}\left[ \sup_{s \leq T} \vert \langle \mu^n_s - \mu_s, f_j \rangle \vert^q \right] \leq C \left( 2^{q-1} d_{BL}(\mu^n_0, \mu_0)^q + \beta_n^q + (\gamma_n S_n)^q + 2CT\varepsilon + \cdots \right).$$
Taking $\limsup_{n \to \infty}$ on both sides, using (2.1) and the fact that both $\beta_n$ and $\gamma_n S_n$ go to $0$ as $n \to \infty$ under the present assumptions, we obtain, setting $v_t := \max_{j=1,\ldots,k} \limsup_{n \to \infty} \mathbf{E}\left[ \sup_{s \leq t} \vert \langle \mu^n_s - \mu_s, f_j \rangle \vert^q \right]$, that $v_T \leq 2C^2 T \varepsilon + C \int_0^T v_t\, \mathrm{d}t$, so that, by Grönwall's lemma, $v_T \leq C' \varepsilon$ for a constant $C' > 0$ that only depends on $(\Gamma, T)$. Inserting this estimate into (C.6) finally gives $\limsup_{n \to \infty} \mathbf{E}\left[ \sup_{s \leq t} d_{BL}(\mu^n_s, \mu_s)^q \right] \leq (C + C')\varepsilon$ for all $\varepsilon$, which gives the desired convergence (2.4).
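For completeness, the Grönwall step just used can be spelled out as follows (a sketch: as the derivation provides, the integral inequality holds for every $t \leq T$, with $v_t$ the quantity appearing above).

```latex
% Grönwall step: from the integral inequality to the linear bound in eps.
v_t \;\le\; 2C^2 T\,\varepsilon \;+\; C\int_0^t v_s\,\mathrm{d}s
\quad (0 \le t \le T)
\qquad\Longrightarrow\qquad
v_T \;\le\; 2C^2 T\,\varepsilon\, e^{CT} \;=:\; C'\varepsilon,
```

with $C'$ depending only on $(\Gamma, T)$, as claimed.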
We now turn to the proof of the convergence of the local empirical measures: combining (C.6) with (C.4) applied to $f = f_j$, and taking advantage of the fact that we now know that (2.4) holds, we obtain the corresponding estimate. Note that it is not sufficient here to apply Grothendieck's inequality directly to the remaining term (it only says that this term is bounded, not that it goes to $0$). The point is to note that $f_j \in BL$, so that $g_j := P_{0,T} f_j \in BL$ too. Hence, there exists some $j'$ such that $\Vert g_j - f_{j'} \Vert_\infty \leq \varepsilon$. Writing again $g_j = g_j - f_{j'} + f_{j'}$, we obtain further the analogous bound for the choice $v_j = f\big(\theta^n_{j,0}\big)$. Using (B.5) and (B.6), we obtain that, in any case, $\mathbf{P}_g$-a.s., $\max_{j'=1,\ldots,k} \vert \langle \hat m^n_0, f_{j'} \rangle \vert^q \to 0$ as $n \to \infty$. Note however that this $\mathbf{P}_g$-a.s. set depends on the choice of the functions $f_j$ and thus on $\varepsilon$. Taking now $\varepsilon$ of the form $\varepsilon = \frac{1}{p}$ with $p \geq 1$, we deduce from (C.7) and the previous argument that the convergence holds $\mathbf{P}_g$-a.s. for any $p \geq 1$, which concludes the proof.
The next step is to prove that the previous equality is also valid in the space $C^{r_2}$. This relies on the following lemma (see [37, Lemma 3.11] for a proof).

Lemma D.1. Under the assumptions of Section 2.1, for any probability measure $\nu$, the operator $L^{(2)}_\nu$ is continuous from $C^{r_2+2}$ to $C^{r_2}$, with $\Vert L^{(2)}_\nu g \Vert_{C^{r_2}} \leq C \Vert g \Vert_{C^{r_2+2}}$. Moreover, for any $j \leq r_2 + 2$, $U(t, s)$ is a linear operator from $C^j$ to $C^j$ such that
$$\Vert U(t, s) g \Vert_{C^j} \leq C \Vert g \Vert_{C^j}, \quad 0 \leq s \leq t \leq T,$$
$$\Vert U(t, s) g - U(t, s') g \Vert_{C^j} \leq C \Vert g \Vert_{C^{j+1}} \sqrt{s' - s}, \quad 0 \leq s \leq s' \leq t \leq T.$$
In particular, for $g \in C^{r_2+3}$, the map $s \mapsto L^{(2)}_s (U(t, s) g)$ is continuous in $C^{r_2}$, and hence $\int_0^t L^{(2)}_s (U(t, s) g)\, \mathrm{d}s$ makes sense as a Bochner integral in $C^{r_2}$. In particular, we obtain, for every $g \in C^{r_2+3}$, that
$$U(t, s)(g) - g = \int_s^t L^{(2)}_r\, U(t, r) g\, \mathrm{d}r, \quad \text{in } C^{r_2}.$$
Moreover, one has the corresponding representation, where $V$ is given in (D.5).
We are now in a position to prove Proposition 4.16.

Proof of Proposition 4.16. The proof follows arguments similar to [51] (see also [46]), using the fact that, for $t > 0$, $\mu_t$ has a smooth density $\mu_t(\mathrm{d}\theta) = p_t(\theta)\, \mathrm{d}\theta$ (see e.g. [18] or [30, Prop. 7.1]). Since $\Vert \partial_\theta f \Vert_\infty \leq \Vert f \Vert_{C^1} \leq C \Vert f \Vert_{C^{r_1}}$ and since the quantity in (D.7) goes to $0$ as $t - s \to 0$, we conclude that $R \in C([0, T], H^{-r_1}(\mathbb{T}))$. Remark also that, for all $j \leq r_1$, the corresponding bound holds for a constant $C$ that only depends on $\Gamma$. In particular, $\Vert K_t f \Vert_{C^{r_1}} \leq C \Vert f \Vert_{C^1} \leq C \Vert f \Vert_{C^{r_1}}$, and hence $K_t$ is a bounded operator on $C^{r_1}$. The main point of the proof is to see that the solution $h$ of (D.9) can be approximated by the convergent sequence $(h_n)_{n \geq 1}$ defined recursively in (D.10). Indeed, by the boundedness of the semigroup $V(t, u)$ and by (D.8), we obtain that, for all $0 < u < t < T$, all $f \in C^{r_1}$ and all $h \in C^{-r_1}$, the corresponding estimate holds. Thus, the sequence $(h_n)_{n \geq 1}$ defined in (D.10) satisfies, for all $n \geq 2$,
$$\Vert h_{n+1}(t) - h_n(t) \Vert_{C^{-r_1}} \leq C \int_0^t \Vert h_n(u) - h_{n-1}(u) \Vert_{C^{-r_1}}\, \mathrm{d}u.$$
By an immediate recursion, for all $k \geq 1$,
$$\Vert h_{n+1+k}(t) - h_{n+k}(t) \Vert_{C^{-r_1}} \leq \frac{C^k T^k}{k!},$$
so that $(h_n)_{n \geq 1}$ is a Cauchy sequence in $C([0, T], C^{-r_1})$ and thus converges to $h$, the solution of (D.9). Turning back to $\eta$ and writing $S(t) := \eta_0 + \int_0^t \Theta^* \eta_s\, \mathrm{d}s + W_t$, we obtain that $\eta$ is uniquely written as in (D.11). This proves pathwise uniqueness. If one chooses another solution $\tilde\eta$, defined on another probability space, with initial condition $\tilde\eta_0$ and noise $\tilde W$ with the same law as $(\eta_0, W)$, we obtain the same expression as above with $S$ replaced by $\tilde S(t) = \tilde\eta_0 + \int_0^t \Theta^* \tilde\eta_s\, \mathrm{d}s + \tilde W_t$. Since $S$ and $\tilde S$ then have the same law, uniqueness in law follows from (D.11). Proposition 4.16 is proven.
$$\mathrm{d}\theta^{i,n}_{d,t} = \frac{1}{np_n} \sum_{j=1}^{n} \xi^{(n)}_{ij}\, \Gamma\big(\theta^{i,n}_{d,t}, \theta^{j,n}_{d,t}\big)\, \mathrm{d}t + \mathrm{d}B^i_t, \quad 0 < t \leq T,\ i = 1, \ldots, n. \tag{E.2}$$
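The factorial contraction $\frac{C^k T^k}{k!}$ driving the convergence of $(h_n)_{n \geq 1}$ is the standard Picard mechanism. The following scalar toy sketch (illustrative only: a one-dimensional stand-in for the recursion (D.10), not the operator setting of the proof, with an arbitrary constant $c$) iterates $h_{n+1}(t) = 1 + c \int_0^t h_n(u)\, \mathrm{d}u$ on a grid and checks that the iterates converge to the exact solution $e^{ct}$.

```python
import math

# Toy Picard iteration: the fixed point of h(t) = 1 + c * int_0^t h(u) du
# is h(t) = exp(c*t); iterates differ by at most (c*T)^k / k!, so a few
# dozen iterations suffice and only discretization error remains.

c, T, m = 1.5, 1.0, 2000                       # toy constant, horizon, grid size
dt = T / m
ts = [k * dt for k in range(m + 1)]

def picard_step(h):
    # h_{n+1}(t) = 1 + c * int_0^t h_n(u) du  (left Riemann sum)
    out, acc = [1.0], 0.0
    for k in range(m):
        acc += c * h[k] * dt
        out.append(1.0 + acc)
    return out

h = [1.0] * (m + 1)                            # start from the constant function
for _ in range(25):                            # (cT)^25 / 25! is negligible
    h = picard_step(h)

err = max(abs(h[k] - math.exp(c * ts[k])) for k in range(m + 1))
print(err)                                     # small: only grid error remains
```

On this grid the fixed point is $(1 + c\,\Delta t)^k$, which differs from $e^{c t_k}$ only by the $O(\Delta t)$ quadrature error, so `err` is of order $10^{-3}$ for these parameters.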