A Law of Large Numbers for interacting diffusions via a mild formulation

Consider a system of $n$ weakly interacting particles driven by independent Brownian motions. In many instances, it is well known that the empirical measure converges, as $n$ tends to infinity, to the solution of a partial differential equation, usually called the McKean-Vlasov or Fokker-Planck equation. We propose a relatively new approach to show this convergence by directly studying the stochastic partial differential equation that the empirical measure satisfies for each fixed $n$. Under a suitable control on the noise term, which appears due to the finiteness of the system, we are able to prove that the stochastic perturbation goes to zero, showing that the limiting measure is a solution to the classical McKean-Vlasov equation. In contrast with known results, we do not require any independence or finite moment assumption on the initial condition, but only weak convergence. The evolution of the empirical measure is studied in a suitable class of Hilbert spaces where the noise term is controlled using two distinct but complementary techniques: rough paths theory and maximal inequalities for self-normalized processes.


Introduction
The theory of weakly interacting particle systems has received great attention in the last fifty years. On the one hand, its mathematical tractability has made it possible to obtain a deep understanding of the behavior of the empirical measure for such systems: law of large numbers [28,5], fluctuations and central limit theorems [31,12], large deviations [11,19] and propagation of chaos properties [30] are by now established. On the other hand, the theory of weakly interacting particles enters several areas of applied mathematics, such as mean-field games or finance models [4], making it an area of active research.
Depending on the context of application, several results are available. The class of mean-field systems under the name of weakly interacting particles is rather large and models may substantially vary from one another depending on the regularity of the coefficients or the noise. This richness in models is reflected in a variety of different techniques implemented in their study (see e.g. [5,28,30] for three very different approaches).
If one focuses on models where the interaction function is regular enough, e.g. bounded and globally Lipschitz, one aspect that has not been completely investigated so far concerns the initial condition. To the authors' knowledge, most known results require a finite moment condition in order to prove tightness properties of the general sequence (e.g. [19]) or to apply a fixed-point argument in a suitable topological space (e.g. [5]). The only exceptions are given by [30,31], although they require independent and identically distributed (IID) initial conditions. We want to point out that existence of a solution to the limiting system, a non-linear partial differential equation (PDE) known as the Fokker-Planck or McKean-Vlasov equation, does not require any finite moment condition on the initial measure, see e.g. [30,Theorem 1.1]. Furthermore, whenever the particle system is deterministic, there is no need to assume independence (or any finite moment) for this same convergence, e.g. [10,27].
We present a result in the spirit of the law of large numbers, without requiring any assumption on the initial conditions but the convergence of the associated empirical measure. Our main idea consists in exploiting a mild formulation associated to the stochastic partial differential equation satisfied by the empirical measure for a fixed (finite!) population. The main difficulty is giving a meaning to the noise term appearing in such a formulation: exploiting the regularizing properties of the semigroup generated by the Laplacian in two different ways, using rough paths theory and maximal inequalities for self-normalized processes respectively, we are able to adequately control it. As the size of the population tends to infinity, the stochastic term vanishes and the limiting measure satisfies the well-known McKean-Vlasov equation.
1.1. Organization. The paper is organized as follows. In the rest of this section we present the model and known results, and introduce the set-up in which the evolution of the empirical measure is studied, along with the notation used.
In Section 2 we give the definition of our notion of solution as well as a corresponding uniqueness statement. The law of large numbers, Theorem 2.3, is presented right after; the section ends with a discussion, the strategy of the proof and the existing literature.
The noise perturbation mentioned in the introduction is tackled in Section 3 where rough paths techniques and maximal inequalities for self-normalized processes are exploited. The proof of Theorem 2.3 is given at the end of this section.
Appendix A recalls general properties of analytic semigroups; Appendix B provides an extension of Gubinelli's theory for rough integration to our setting.
1.2. The model and known results. Consider a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$, the filtration satisfying the usual conditions. Fix $d \in \mathbb{N}$ and let $(B^i)_{i \in \mathbb{N}}$ be a sequence of IID $\mathbb{R}^d$-valued Brownian motions adapted to the filtration $(\mathcal{F}_t)_{t \ge 0}$.
Fix $n \in \mathbb{N}$ and a finite time horizon $T > 0$. Let $\Gamma : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}^d$ be a bounded Lipschitz function, and let $(x^{i,n})_{1 \le i \le n}$ be the unique strong solution to the particle system (1.1). The initial conditions are denoted by the sequence $(x^i_0)_{i \in \mathbb{N}} \subset \mathbb{R}^d$; whenever they are random, they are taken independent of the Brownian motions. Existence and uniqueness for (1.1) is a classical result, e.g. [29].
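The dynamics of the particle system can be explored numerically. The sketch below assumes the standard mean-field form $dx^{i,n}_t = \frac{1}{n}\sum_{j=1}^n \Gamma(x^{i,n}_t, x^{j,n}_t)\,dt + dB^i_t$ for (1.1), which is consistent with a bounded Lipschitz $\Gamma$ and unit diffusion but is not quoted from the text. The function names, the Euler-Maruyama discretization, the interaction $\Gamma(x,y) = \tanh(y-x)$ and the Cauchy initial data are all illustrative assumptions.

```python
import numpy as np

def simulate_particles(x0, gamma, T=1.0, steps=200, seed=0):
    """Euler-Maruyama scheme for the ASSUMED form of (1.1):
    dx^i = (1/n) sum_j gamma(x^i, x^j) dt + dB^i."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)              # positions, shape (n, d)
    n, d = x.shape
    dt = T / steps
    path = np.empty((steps + 1, n, d))
    path[0] = x
    for k in range(steps):
        # pairwise interactions of shape (n, n, d), averaged over the second particle
        drift = gamma(x[:, None, :], x[None, :, :]).mean(axis=1)
        x = x + drift * dt + np.sqrt(dt) * rng.standard_normal((n, d))
        path[k + 1] = x
    return path

def bounded_lipschitz_gamma(x, y):
    # hypothetical interaction: attraction towards y, bounded and Lipschitz via tanh
    return np.tanh(y - x)

# Cauchy initial data: no finite moment of any order p >= 1
x0 = np.random.default_rng(1).standard_cauchy((50, 2))
path = simulate_particles(x0, bounded_lipschitz_gamma)
```

Heavy-tailed initial data are chosen on purpose: they violate the finite moment conditions of the classical results, while the scheme itself only needs $\Gamma$ to be bounded.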
The main quantity of interest in system (1.1) is the empirical measure $\nu^n = (\nu^n_t)_{t\in[0,T]}$, defined for $t \in [0,T]$ as the probability measure on $\mathbb{R}^d$ such that $\nu^n_t := \frac{1}{n}\sum_{j=1}^n \delta_{x^{j,n}_t}$ (1.2). Observe that $\nu^n$ is a priori a probability measure on the continuous trajectories with values in $\mathbb{R}^d$, i.e. $\nu^n \in \mathcal{P}(C([0,T],\mathbb{R}^d))$; however, in many instances we rather consider its projection $(\nu^n_t)_{t\in[0,T]} \in C([0,T],\mathcal{P}(\mathbb{R}^d))$ as a continuous function with values in the probability measures on $\mathbb{R}^d$. This last object does not carry the information on the dependencies between time marginals, but is in our case more suitable for studying (1.1) as $n$ tends to infinity.
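In computations, a time marginal of the empirical measure is only ever accessed through its action on test functions. The following minimal sketch assumes the usual definition $\nu^n_t = \frac{1}{n}\sum_j \delta_{x^{j,n}_t}$; the function name `empirical_pairing` and the Gaussian test function are hypothetical choices made for illustration.

```python
import numpy as np

def empirical_pairing(positions, h):
    """Return <nu^n_t, h> = (1/n) sum_j h(x^j) for positions of shape (n, d)."""
    return float(np.mean(h(np.asarray(positions, dtype=float))))

def gaussian_test_function(x):
    # smooth, bounded test function h(x) = exp(-|x|^2), evaluated row-wise
    return np.exp(-np.sum(x**2, axis=-1))

positions = np.array([[0.0, 0.0], [1.0, 0.0]])   # two particles in R^2
value = empirical_pairing(positions, gaussian_test_function)
```

Since $|\langle \nu^n_t, h\rangle| \le \sup|h|$ regardless of where the particles sit, this pairing stays bounded even for initial data without moments.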
Known results. Fix a probability measure $\nu_0 \in \mathcal{P}(\mathbb{R}^d)$. Whenever $(x^i_0)_{i\in\mathbb{N}}$ are taken either to be IID random variables sampled from $\nu_0$, or such that $\nu^n_0$ weakly converges to $\nu_0$ with some finite moment of order $p \ge 1$, it is well known (e.g. [30,Theorem 1.4] and [5,Theorem 3.1]) that $\nu^n$ converges (in a precise sense depending on the setting) to the solution of the PDE (1.3), known as the non-linear Fokker-Planck (or McKean-Vlasov) equation, where $*$ denotes integration with respect to the second argument against a measure $\mu \in \mathcal{P}(\mathbb{R}^d)$. A solution to (1.3) is linked with a non-linear process driven by a Brownian motion $B$ independent of $(B^i)_{i\in\mathbb{N}}$ and of the initial condition $x_0$.
For $m \in \mathbb{N}$ and $p \ge 1$, let $W^{m,p}(\mathbb{R}^d)$ be the closure of $C_c^\infty(\mathbb{R}^d)$, i.e. the space of smooth functions with compact support, with respect to the norm $\|f\|_{W^{m,p}} := \big(\sum_{|\alpha| \le m} \int_{\mathbb{R}^d} |\partial^\alpha f(x)|^p \, dx\big)^{1/p}$, where $\alpha = (\alpha_1, \dots, \alpha_d)$ with $|\alpha| = \alpha_1 + \dots + \alpha_d$ and $\partial^\alpha = \partial_{x_1}^{\alpha_1} \cdots \partial_{x_d}^{\alpha_d}$. Fix $p = 2$ and $m > d/2$; we consider the Hilbert space $H^m := W^{m,2}(\mathbb{R}^d)$, with norm denoted by $\|\cdot\|_m$, and its dual $H^{-m} := (H^m)^*$ with the standard dual norm defined by $\|\mu\|_{-m} := \sup_{\|h\|_m \le 1} \langle \mu, h \rangle_{-m,m}$, where $\langle \cdot, \cdot \rangle_{-m,m}$ is the action of $H^{-m}$ on $H^m$. By duality, it follows from (1.5) that $\mathcal{P}(\mathbb{R}^d)$ is continuously embedded in $H^{-m}$.
We denote by $(\cdot,\cdot)_m$ the scalar product in $H^m$ and by $\langle \cdot, \cdot \rangle$ the natural action of a probability measure on test functions, i.e. for $\nu \in \mathcal{P}(\mathbb{R}^d)$ and a smooth function $h$, $\langle \nu, h \rangle = \int_{\mathbb{R}^d} h(x)\,\nu(dx)$. We often abuse notation by denoting the density of a probability measure by the probability measure itself. If $\nu \in \mathcal{P}(\mathbb{R}^d)$, and thus $\nu \in H^{-m}$, let $\tilde\nu \in H^m$ be its Riesz representative; then, for any $h \in H^m$, $\langle \nu, h \rangle_{-m,m} = (\tilde\nu, h)_m$.
If $(\mu_n)_{n\in\mathbb{N}}$ is a sequence of probability measures which weakly converges to some $\mu \in \mathcal{P}(\mathbb{R}^d)$, we use the notation $\mu_n \rightharpoonup \mu$. For weak convergence and weak-* convergence of a sequence $(x_n)_n \subset X$ to some $x \in X$, $X$ being a Banach space, we use the standard notations $x_n \rightharpoonup x$ and $x_n \overset{*}{\rightharpoonup} x$ respectively. As introduced in [26], we will use $\|\cdot\|_{-m}$ as distance in the space $\mathcal{P}(\mathbb{R}^d)$ and our results will be expressed with respect to this topology.
The various constants in the paper will always be denoted by $C$, or by $C_\alpha$ to emphasize the dependence on some parameter $\alpha$. Their value may change from line to line.
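For orientation, the limiting objects referred to as (1.3) and the associated non-linear process are commonly written as follows. This is a sketch under the assumption of unit diffusion (generator $\frac{\Delta}{2}$) and of the mean-field drift $\Gamma * \nu$; it follows the standard literature rather than quoting the displays of this paper.

```latex
% Assumed standard form of the McKean-Vlasov (non-linear Fokker-Planck) equation
\partial_t \nu_t = \tfrac{1}{2}\Delta \nu_t - \operatorname{div}\big[\nu_t\,(\Gamma * \nu_t)\big],
\qquad
(\Gamma * \mu)(x) := \int_{\mathbb{R}^d} \Gamma(x,y)\,\mu(dy).

% Associated non-linear process, whose drift depends on its own law
x_t = x_0 + \int_0^t (\Gamma * \nu_s)(x_s)\,ds + B_t,
\qquad
\nu_s = \operatorname{Law}(x_s).
```

The second display makes the non-linearity explicit: the drift of $x$ is computed from the time marginals of $x$ itself.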

Main result
Before stating the main result, we give the definition of weak-mild solution to (1.3) in the Hilbert space $H^m$. We denote by $S = (S_t)_{t\in[0,T]}$ the analytic semigroup generated by $\frac{\Delta}{2}$ on $H^m$; see Appendix A for general properties of $S$.
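A mild formulation of this type can be derived formally by testing the equation against $S_{t-s}h$. The computation below is a sketch assuming the form $\partial_t \nu_t = \frac{1}{2}\Delta\nu_t - \operatorname{div}[\nu_t(\Gamma*\nu_t)]$ for (1.3); the resulting identity is a plausible reading of the weak-mild formulation (2.1), not a quotation of it.

```latex
% For fixed t and h in H^m, differentiate s -> <nu_s, S_{t-s} h> on [0,t]:
\frac{d}{ds}\langle \nu_s, S_{t-s}h\rangle
  = \langle \partial_s \nu_s, S_{t-s}h\rangle - \big\langle \nu_s, \tfrac{1}{2}\Delta S_{t-s}h\big\rangle
  = \big\langle \nu_s, (\nabla S_{t-s}h)\cdot(\Gamma * \nu_s)\big\rangle .

% Integrating over s in [0,t] and using S_0 = Id:
\langle \nu_t, h\rangle
  = \langle \nu_0, S_t h\rangle
  + \int_0^t \big\langle \nu_s, (\nabla S_{t-s}h)\cdot(\Gamma * \nu_s)\big\rangle\,ds .
```

The Laplacian terms cancel because $\partial_s S_{t-s}h = -\frac{1}{2}\Delta S_{t-s}h$, so only the interaction term survives under the time integral.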
If Γ is sufficiently regular, uniqueness is a consequence of Gronwall's Lemma.
In particular, Observe that, for µ ∈ H −m it holds that where we have used the properties of the semigroup. Using the continuous embedding of P(R d ) into H −m , we conclude that there exists a (new) constant C > 0: A Gronwall-like lemma yields the proof.
We are ready to state the main result.
be the unique m-weak-mild solution to (2.1) associated to some ν 0 ∈ H −m . Let ν n be the empirical measure associated to the particle system (1.1). If in probability. Moreover, if ν 0 ∈ P(R d ), then ν is the unique weak solution of the McKean-Vlasov equation (1.3) and, in particular, ν ∈ C([0, T ], P(R d )).
2.1. Discussion. Theorem 2.3 shows a law of large numbers in $L^\infty([0,T],H^{-m})$ by directly studying the evolution of the empirical measure and without passing through trajectorial estimates as done in several proofs after the seminal work by Sznitman [30]. This allows us to deal with very general initial data: the weak convergence of $(\nu^n_0)_{n\in\mathbb{N}}$ in $H^{-m}$, which is implied by the weak convergence in $\mathcal{P}(\mathbb{R}^d)$, suffices.
Working in $H^{-m}$ for a suitable choice of $m$ ensures a bound on $\|\nu\|_{-m}$ which is uniform in $\nu \in \mathcal{P}(\mathbb{R}^d)$, because of the continuous embedding of $\mathcal{P}(\mathbb{R}^d)$ in $H^{-m}$ and the duality properties of probability measures, see Lemma A.2. This establishes a compactness property for $(\nu^n)_{n\in\mathbb{N}}$ which is usually hard to obtain in $\mathcal{P}(\mathbb{R}^d)$ and becomes our main tool for establishing the existence of a convergent subsequence.
Even if weak-mild solutions make sense for any m > d/2, we have to require the stronger condition m > d/2+3, and thus more regularity on the interaction function Γ, in order to give a pathwise meaning to the stochastic perturbation. As pointed out in the introduction, this regularity in Γ is already enough to have existence of a weak solution to the McKean-Vlasov equation (1.3) for any initial measure ν 0 . Observe that uniqueness of solutions to (1.3) is a byproduct of our result.
The authors believe that Theorem 2.3 remains true for every $m > d/2$, but they could not avoid the use of rough paths theory and the consequent need for regularity; see Remark 3.7 for more on this aspect.

2.2. Strategy of the proof. Using Itô's formula, we derive an equation satisfied by $\nu^n$ for every fixed $n \in \mathbb{N}$, which turns out to be the McKean-Vlasov PDE perturbed by some noise $w^n$, see Lemma 3.1. This equation makes sense in $L^\infty([0,T],H^{-m})$ and in this space we study the convergence of $(\nu^n)_{n\in\mathbb{N}}$.
The main challenge towards the proof of Theorem 2.3 is giving a meaning to $w^n$ and suitably controlling it. We first give a pathwise definition of this term through rough paths theory, see Lemma 3.2, referring to Appendix B for a suitable theory of rough integration in our setting. This in turn will allow us to show that $(\nu^n)_{n\in\mathbb{N}}$ is uniformly bounded in $L^\infty([0,T],H^{-m})$ and to extract a weak-* converging subsequence, see Lemma 3.8.
To show that a converging subsequence satisfies the weak-mild formulation (2.1) in the limit, as shown in Lemma 3.9, we need a further step: the pointwise estimate of $w^n(h)$, for a fixed $h \in H^m$. Using a suitable decomposition of the semigroup and a maximal inequality for self-normalized processes, we are able to prove that $w^n(h)$ converges to zero in probability as $n$ diverges, see Lemma 3.6. If on the one hand the rough paths bound cannot take advantage of the statistical independence of the Brownian motions and thus cannot be improved in $n$, on the other hand the probability estimate does not suffice to define $w^n$ as an element of $L^\infty([0,T],H^{-m})$. We refer to Subsection 3.2 and Remark 3.7 for more on this aspect.
The uniqueness of weak-mild solution, Proposition 2.2, is the last ingredient to obtain that any convergent subsequence of (ν n ) n∈N admits a further subsequence that converges P-a.s. to the same ν satisfying (2.1). This is equivalent to the weak-* convergence in probability to the weak-mild solution ν.

2.3. Existing literature. Proving a law of large numbers by directly studying the empirical measure and not the single trajectories is the classical approach in the deterministic setting [27,10]. In the case of interacting diffusions driven by Brownian motions, the idea of studying the equation satisfied by the empirical measure for a fixed $n$ comes from the two articles [3,24] and the recent [6], where a weak-mild formulation is derived and carefully studied. Contrary to our case, in [3,6,24] the particles live in the one-dimensional torus, which considerably simplifies the analysis; we refer to Remark 3.5 for more on this aspect. A Hilbertian approach for particle systems has already been discussed in [12], where it is used to study the fluctuations of the empirical measure around the McKean-Vlasov limit. However, [12] does not make use of the theory of semigroups but instead requires strong hypotheses on the initial conditions, which have to be IID and with finite $(4d+1)$-moment (see [12, §3]). The evolution of the empirical measure (1.2) is then studied in weighted Hilbert spaces (or, more precisely, in spaces of Bessel potentials) so as to fully exploit the properties of mass concentration given by the condition on the moments. The function $\Gamma$ is required to be sufficiently smooth, see [12, §3]. Observe that we are not able to present a fluctuation result, given the lack of a suitable uniform estimate on the noise term.
A mild formulation for the empirical measure exists only because of the regularizing properties of the noise in (1.1) and cannot be directly extended to deterministic particle systems. The so-called semigroup approach has recently been used in similar settings, e.g. [13,14], but never dealing with empirical measures tout-court.
Observe that, under a suitable change of the time-scale, the n-dependent SPDE satisfied by the empirical measure (1.2) is the mild formulation of the Dean-Kawasaki equation [23, Theorem 1] and [22].

Proofs
We start by giving the n-dependent stochastic equation satisfied by the empirical measure for each n ∈ N. We then move to the control on the noise term and, finally, the proof of Theorem 2.3.

3.1. A weak-mild formulation satisfied by the empirical measure. Recall that $(S_t)_{t\in[0,T]}$ denotes the semigroup generated by $\frac{\Delta}{2}$ on $H^m$.
Summing over all particles and dividing by $n$, the claim is proved modulo well-posedness of the noise term $w^n$, which is presented in the following subsection.
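The computation behind this statement can be sketched as follows, assuming the form $dx^{i,n}_t = (\Gamma * \nu^n_t)(x^{i,n}_t)\,dt + dB^i_t$ for (1.1) and a test function $h$ regular enough for Itô's formula. The displays are a reconstruction for illustration; the precise statements are those of Lemma 3.1 and (3.1)-(3.2).

```latex
% Ito's formula applied to s -> (S_{t-s}h)(x^{i,n}_s) on [0,t]; the term
% -\tfrac12 \Delta S_{t-s}h from the time derivative cancels the Ito correction.
h(x^{i,n}_t) = (S_t h)(x^{i}_0)
  + \int_0^t \nabla (S_{t-s}h)(x^{i,n}_s)\cdot(\Gamma * \nu^n_s)(x^{i,n}_s)\,ds
  + \int_0^t \nabla (S_{t-s}h)(x^{i,n}_s)\cdot dB^i_s .

% Averaging over i = 1,...,n:
\langle \nu^n_t, h\rangle = \langle \nu^n_0, S_t h\rangle
  + \int_0^t \big\langle \nu^n_s, (\nabla S_{t-s}h)\cdot(\Gamma * \nu^n_s)\big\rangle\,ds
  + w^n_t(h),
\qquad
w^n_t(h) := \frac{1}{n}\sum_{j=1}^n \int_0^t \nabla (S_{t-s}h)(x^{j,n}_s)\cdot dB^j_s .
```

The integrand of $w^n_t(h)$ depends on the upper limit $t$ through $S_{t-s}$, which is why $t \mapsto w^n_t(h)$ is not a martingale.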

3.2. Controlling the noise term. The aim of this subsection is to control the noise term $w^n$ appearing in the weak-mild formulation (3.1) for the empirical measure. We start by giving a pathwise definition of the integral (3.2), i.e. we define $w^n_t(h)(\omega)$ for any $h \in H^m$ and any $\omega \in A \subset \Omega$, where $\mathbb{P}(A) = 1$, which in turn allows us to define $w^n(\omega)$ as an element of $L^\infty([0,T],H^{-m})$ for $\omega \in A$, see Lemma 3.2. For this purpose, we extend Gubinelli's theory for rough integration (see [17] and [18, §3 and 4]) to our setting; see Appendix B for notations and precise results on this extension. A probabilistic estimate is then given, exploiting the independence of the Brownian motions; Lemma 3.6 shows that, for every fixed $h \in H^m$, $w^n_t(h)$ converges to zero in probability as $n$ tends to infinity. This estimate will allow us to prove the convergence of (3.1) to (2.1) for every fixed $h \in H^m$, see Lemma 3.9.
Pathwise definition via rough paths theory for semigroup functionals. We start by observing that the noise term w n t (h) in (3.2) is neither a stochastic convolution that could be treated using a maximal inequality in Hilbert spaces (e.g. [7, §6.4] and [2] in the context of an unbounded diffusion operator), nor a classical controlled rough path integral (e.g. [15]) as the integrand depends on the upper integration limit.
We combine the strategies in [18,17] so as to define $w^n_t(h)$ in a pathwise sense. Note that our setting is different from [18], where an infinite-dimensional theory à la Da Prato-Zabczyk is constructed, while we are interested in finite-dimensional stochastic integrals over functionals of such objects. Our construction is nonetheless similar to [18]: we fix the Itô-rough path lift associated to Brownian motion and extend the algebraic integration in [18] to our setting of semigroup functionals. This extension is presented in detail in Appendix B, where the main ingredient, the Sewing Lemma, is proven. Before stating Lemma 3.2, we present in a heuristic fashion the main ideas towards a rough path construction of (3.2).
Note that it suffices to define integrals of the form in a pathwise sense for a class of sufficiently regular functions f and where (x u ) u is an R d -valued process controlled by the Brownian motion (B u ) u , such that Recall that in the classical setting of rough paths theory, one has for s t where we have used the notation B ts := B t − B s as well as In particular, is a remainder in the terminology of [17]. In the same spirit of [17], we rewrite the left hand side of this expression as We are thus left with [δI] ts = A ts + R ts .
Recall that Gubinelli's Sewing Lemma formulates precise conditions under which a given germ A gives rise to a unique remainder term R ts = o(|t − s|) and such that I can be obtained as If one tries to follow a similar approach for the quantity of interest (3.3), a canonical candidate for local approximations to However, notice that if we were to set in contrast to the above setting, meaning the standard approach of [17] The idea is hence to change the cochain complex in [17] and to consider a perturbed version of it associated to the operatorδ, this is done in Lemma B.1. Lemma B.2 proves a Sewing Lemma in this modified setting, which in turn allows to construct the above remainder R ts . The germ will therefore be For 0 = t 0 < · · · < t n+1 = t, note that due to the correct way of sewing together the germs is given by Define the operator A acting on f ∈ H m into C(∆ 2 , R) via where D x denotes the Jacobian in R d and · the scalar product between tensors of the same dimension. In the sequel, we adopt the following shorter notation [Af ] ts := ∇S ts f s B ts + D x ∇S ts f s B ts .
As in classical rough paths theory [Af ] ts is not a 1-increment (i.e. a difference as B ts ) but a continuous function of the two variables s and t. In particular A ∈ D 2 , i.e. A is a linear operator from the Banach space H m to C 2 . One can actually prove that A ∈ D α 2 : for 0 s t T and where C α = C α (ω) depends on the α-Hölder norm of B(ω) and B(ω) and we have used the properties of S, see Lemma A.1. Note in particular that C α < ∞, P-a.s. and that C α has finite moments of all orders.
Recall the definition ofδ (Lemma B. In particular, using Chen's relation We rewrite everything as the sum of four terms where C α = C α (ω) depends on the α-Hölder norm of B(ω) and we have used the properties of S, see Lemma A.1. Note in particular that C α < ∞, P-a.s. and that C α has finite moments of all orders. Similarly, for A 2 (with a different C α ) Observe now that, since f ∈ C 3 b , the function D x ∇S ts f is Lipschitz uniformly in s and t, from which we extract that Using (3.4), we recognize in A 4 the Taylor expansion of ∇S ts f around x s , i.e.
Putting the four estimates together, we have just shownδA ∈ D 1+ 3 and, in particular, that δ A The proof is concluded.
Controlling w n t (h) via a maximal inequality for self-normalized processes. The aim of this subsection is to give a probabilistic bound on by exploiting the independence of the Brownian motions (we have removed the product symbol · for the sake of notation).
Observe that if $w^n_t(h)$ did not involve a convolution with the semigroup $S$, it would be a standard martingale and classical estimates like the Burkholder-Davis-Gundy inequality could be used to establish the desired bound. While the convolution with the semigroup $S$ destroys the martingale property, $w^n_t(h)$ is still closely related to self-normalized martingales, for which a fine maximal inequality due to Graversen and Peskir [16] holds for every stopping time $\tau \le T$.
Observe that this result is a consequence of more general bounds on self-normalized processes of the form $X_t = A_t/B_t$ (e.g. [8]), where in this case $A_t = M_t$ is a martingale and $B_t^2 - 1 = \langle M \rangle_t$ its quadratic variation. Let us illustrate in the following example how this interpretation can be used to directly obtain a bound on $v_t = \frac{1}{n}\sum_{j=1}^n \int_0^t e^{-a(t-s)}\,dB^j_s$, $a > 0$, which can be seen as the simplest toy model for $w^n_t(h)$.
Example 3.4. Let $(B^j)_{j \le n}$ be independent Brownian motions on a common filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_t, \mathbb{P})$. For $a > 0$, let $(X^j)_{j \le n}$ be the associated family of Ornstein-Uhlenbeck processes, $dX^j_t = -aX^j_t\,dt + dB^j_t$ with $X^j_0 = 0$, and consider the quantity $v_t = \frac{1}{n}\sum_{j=1}^n X^j_t$. We remark that we may rewrite

Notice that M is a martingale of quadratic variation
and therefore, by Lemma 3.3, we conclude that log (1 + 2aT ).
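One way to make the self-normalized structure of this example explicit is the following computation. The normalization of $M$ chosen here is an assumption made so that $1 + \langle M \rangle_t$ is exactly $e^{2at}$; the normalization used in the example itself may differ by a constant.

```latex
% Splitting the kernel e^{-a(t-s)} = e^{-at} e^{as}:
v_t = \frac{1}{n}\sum_{j=1}^n \int_0^t e^{-a(t-s)}\,dB^j_s
    = \frac{e^{-at}}{n}\sum_{j=1}^n \int_0^t e^{as}\,dB^j_s .

% Normalized martingale (assumed normalization) and its quadratic variation:
M_t := \sqrt{\frac{2a}{n}}\,\sum_{j=1}^n \int_0^t e^{as}\,dB^j_s ,
\qquad
\langle M\rangle_t = \frac{2a}{n}\cdot n \int_0^t e^{2as}\,ds = e^{2at}-1 .

% Hence v is a self-normalized martingale up to a deterministic factor:
v_t = \frac{1}{\sqrt{2an}}\;\frac{M_t}{\sqrt{1+\langle M\rangle_t}},
\qquad
\log\big(1+\langle M\rangle_T\big) = 2aT .
```

Any maximal inequality controlling $\sup_{t \le T} |M_t|/\sqrt{1+\langle M\rangle_t}$ therefore transfers to $\sup_{t \le T}|v_t|$ with the prefactor $(2an)^{-1/2}$, which is the source of the decay in $n$.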
Note that we crucially exploited the splitting $e^{-a(t-s)} = e^{-at}e^{as}$, which is not available in the semigroup setting we are concerned with. Recovering such a step suggests passing through a functional calculus for the semigroup, which we briefly discuss next.
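The $n^{-1/2}$ decay of the toy quantity $v_t$ can also be observed numerically. The following Monte Carlo sketch is an illustration only: the function name, the parameters and the time grid are assumptions, and the supremum is taken over a discrete grid rather than over all of $[0,T]$. Each Ornstein-Uhlenbeck process is advanced with its exact one-step transition.

```python
import numpy as np

def mean_sup_v(n, a=1.0, T=1.0, steps=100, reps=100, seed=0):
    """Monte Carlo estimate of E[max over grid times of |v_t|], v_t = (1/n) sum_j X^j_t,
    where X^j solves dX = -a X dt + dB with X_0 = 0."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    decay = np.exp(-a * dt)                                     # exact mean reversion factor
    std = np.sqrt((1.0 - np.exp(-2.0 * a * dt)) / (2.0 * a))    # exact one-step noise scale
    x = np.zeros((reps, n))                                     # reps independent systems of n particles
    sup_v = np.zeros(reps)
    for _ in range(steps):
        x = decay * x + std * rng.standard_normal((reps, n))
        sup_v = np.maximum(sup_v, np.abs(x.mean(axis=1)))
    return float(sup_v.mean())

small = mean_sup_v(4)
large = mean_sup_v(400, seed=1)
ratio = small / large   # expected to be of order sqrt(400 / 4) = 10
```

Increasing `reps` tightens the estimate; the ratio only approximates $10$ because of Monte Carlo error.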
Plugging and to ensure that this bound is integrable for ρ ∈ (r, ∞), see Lemma A.3. Putting all the above considerations together with care, one obtains a maximal inequality for w n t (h) that we present in Lemma 3.6.

Remark 3.5. A similar control has already been used in
Proof. Let $h \in H^m$ and $\gamma_{r,\eta}$ be the curve in (3.7) with $\eta \in (\pi/2, \pi)$ and $r > 0$. Since the actual values of $\eta$ and $r$ are not crucial for the proof, we may suppose $r > 1$. Using the decomposition of $S$ we obtain: where in the third step we have used that $\nabla$ is a closed linear operator on $D(\frac{\Delta}{2})$ and with (3.9) Using the classical estimate $(a+b+c)^2 \le 3(a^2+b^2+c^2)$, it follows that We focus on $Z^1_t(h)$, but similar estimates for $Z^2_t(h)$ and $Z^3_t(h)$ follow in exactly the same way.
We compute the quadratic variation of $X^{\varepsilon,\rho}_t(h)$: Lemma A.2 assures that for every $\varepsilon$ such that $0 < 2\varepsilon$ Thus, the quadratic variation of $X^{\varepsilon,\rho}_t(h)$ is bounded $\mathbb{P}$-a.s. by Observe then $\mathbb{E}\sup$ The term $\sup_{t\in[0,T]}$ We now invoke Lemma 3.3, which in conjunction with (3.11) allows us to deduce that where in the last inequality, we have bounded the constant $C$ appearing in (3.11) by $\max\{1, C\}$. Further modifying $C$ accordingly, we are thus left with
Concerning $Z^3_t(h)$, computations are the same if one replaces $\eta$ by $-\eta$. Concerning $Z^2_t(h)$, computations are easier since there is no a priori diverging integral to deal with and we omit the proof. The overall bound on $w^n_t(h)$ is thus obtained by summing the three estimates and choosing the constant $C$ accordingly.
Remark 3.7. Note that Lemma 3.6 implies, by Jensen's inequality, the following bound which is sharper in $n$ with respect to (3.6), but in a weaker topology. One could ask if it is possible to establish a similar

Observe that such a bound would yield a stronger convergence in Theorem 2.3, namely convergence in $H^{-m}$-norm. To the authors' knowledge, proving a uniform bound on $w^n$ which exploits the independence of the Brownian motions is almost equivalent to proving a maximal inequality for self-normalized processes with values in a general Hilbert space.
3.3. Proving Theorem 2.3. The proof of Theorem 2.3 consists of two steps: using the pathwise bound on $w^n$, Lemma 3.8 shows that we can extract from $(\nu^n)_{n\in\mathbb{N}}$ a weak-* convergent subsequence; then, by exploiting the probability bound on $w^n_t(h)$ for a fixed $h \in H^m$, we identify through Lemma 3.9 the limit with a solution to (2.1).
Extraction of a weak-*-convergent subsequence. The main result of this subsection is given by the next lemma.
Proof. It suffices to show that $(\nu^n)_{n\in\mathbb{N}}$ is uniformly bounded in $L^\infty([0,T],H^{-m})$ $\mathbb{P}$-a.s.; an application of the Banach-Alaoglu theorem then yields the existence of a weak-* convergent subsequence.
Exploiting the mild formulation in Lemma 3.1 and the bound on w n t (h) in Lemma 3.2 for some α ∈ (1/3, 1/2), one obtains that where we have exploited the properties of the semigroup and the bound already used in (2.3). A Gronwall-like argument implies the existence of a constant a independent of n and T such that In particular, using Lemma A.2, we conclude We move to the identification of the limit ν ∈ L ∞ ([0, T ], H −m ).
The limit coincides with an m-weak-mild solution. We prove that any possible limit of (ν n ) n∈N is a weak-mild solution (2.1). Given the uniqueness of (2.1), this implies the weak-* convergence in L ∞ ([0, T ], H −m ) of (ν n ) n∈N to the element ν given in Lemma 3.8. In particular, this is true for (ν n k 0 ) k since S t h ∈ H m . Furthermore, Lemma 3.6 implies that lim k→∞ w n k t (h) = 0, in P-probability and thus in particular the convergence holds P-a.s. along a sub-subsequence (n k j ) j . Thus, it remains to show that P-a.s. For better readability and lighter notation, we will not distinguish between n and n k j in the following, understanding that we continue to work on the sub-subsequence. is in H m for every s ∈ [0, t], since Γ(x, ·) ∈ H m for a.e. x ∈ R d , recall (2.2). Namely, We conclude that This establishes (3.12).
Overall, we have thus shown that any subsequence of $(\nu^n)_n$ converges along some further subsequence $\mathbb{P}$-a.s.
Proof of Theorem 2.3. In order to show that $\nu^n \overset{*}{\rightharpoonup} \nu$ in $L^\infty([0,T],H^{-m})$ in probability, we show that any subsequence $(\nu^{n_k})_k$ admits a further subsequence that converges $\mathbb{P}$-a.s. in the weak-* topology of $L^\infty([0,T],H^{-m})$ to $\nu$.
Let $(\nu^{n_k})_k$ be a subsequence. By the assumption of the Theorem and Lemmas 3.6 and 3.8, we find a further subsequence $(\nu^{n_{k_j}})_j$ which converges $\mathbb{P}$-a.s. in the weak-* topology of $L^\infty([0,T],H^{-m})$, where the limit may a priori depend on the subsequence chosen. Notice however that, due to Lemma 3.9, any such limit is an $m$-weak-mild solution to (1.3). By the uniqueness result of Proposition 2.2, we conclude that the limit must coincide with $\nu$ for any subsequence chosen. The first part of the Theorem is proved. Note that a priori, our limit $\nu$ is only a distribution in $H^{-m}$ at each fixed time point.
Suppose ν 0 ∈ P(R d ). In order to show that ν t is actually a probability measure for each t ∈ [0, T ], we observe that a weak solution µ ∈ C([0, T ], P(R d )) to (1. holds. Note that by standard approximation, (3.14) holds also for f ∈ H m ⊂ C 3 b , meaning that µ is indeed a weak-mild solution. By the uniqueness statement of Proposition 2.2 we conclude µ = ν and thus in particular ν ∈ C([0, T ], P(R d )). This concludes the second part and thus the entire proof of the Theorem.

Appendix A. Hilbert spaces and Semigroups
The Laplacian semigroup. The following definitions are taken from [20,25]. For the sake of notation, we focus on $\Delta$, the standard Laplacian on $L^2(\mathbb{R}^d)$, instead of $\frac{\Delta}{2}$. It is not difficult to see that $H^{m+2} \subset D(\Delta)$, where the inclusion is dense, and that $\Delta$ is a sectorial operator with spectrum given by $(-\infty, 0]$. In particular, it generates an analytic strongly continuous semigroup denoted for all $t \ge 0$ by $S_t$; recall that $S_0 := \mathrm{Id}$ is the identity operator.
When computing the semigroup against a function h through (3.7), we use the following decomposition into three real integrals: The section ends with some estimates concerning the regularity of S.
where D 2 is the Hessian and D 3 the tensor with third-order derivatives. In particular Proof. We calculate explicitly where we exploited the asymmetry of the first Taylor component. The first statement follows from a similar consideration, considering an order one Taylor expansion of ∇f (x − y) around x instead of an order two Taylor expansion. The proof follows by Sobolev's embeddings.
Whenever s is an integer, it is well known that this definition coincides with the standard definition of the Sobolev space W s,2 (R d ).
The next lemma extends the embedding (1.5) to H s and its relationship with the space of probability measures.
where C is the norm of the identity operator between H s and C b (R d ).
Fractional operators on H s . We have the following lemma.
Lemma A.3. Let $\lambda = \rho e^{i\eta} \in \rho(\Delta)$ and suppose $\rho > 1$. There exists a positive constant $C = C_\eta$ such that for every $\varepsilon \in (0, 1/2)$ Proof. Exploiting the Fourier multipliers associated to $\nabla$ and $R$, one obtains Since we are assuming $\rho > 1$, we have that

Appendix B. Rough integration associated to semigroup functionals
We mostly follow [18] and use very similar notations. Let $k \ge 1$ and $\Delta_k$ be the $k$-dimensional simplex given by Let $C_k = C(\Delta_k; \mathbb{R})$ and $W$ a Banach space with a strongly continuous semigroup $(S_t)_{t\in[0,T]}$ acting on it. Define $D_k$ as the space of linear operators from $W$ to $C_k$. Furthermore, let $D_* = \bigcup_{k \ge 1} D_k$ and define the following operators on $D_*$: For $A \in D_k$ and $f \in W$, they are defined as where $\cancel{t_i}$ means that the argument $t_i$ is omitted and $S_{t_1 t_2}$ stands for $S_{t_1 - t_2}$. We are ready for the first lemma.
Proof. The proof mimics [18,Proposition 3.1]. We only mention that for proving Kerδ D k+1 ⊂ Imδ D k , a possible choice for A ∈ Kerδ D k+1 is given by B ∈ D k defined as We now introduce some analytical assumptions on the previous function spaces. We start with a Hölder-like norm on C k for k = 2, 3. For µ > 0 and g ∈ C 2 , define g µ := sup t,s∈∆ 2 |g ts | |t − s| µ , and consequently C µ 2 := g ∈ C 2 ; g µ < ∞ . For γ, ρ > 0 and g ∈ C 3 , define g γ,ρ := sup t,u,s∈∆ 3 for any f ∈ W . We conclude Letting P n ([s, t]) be for example be the dyadic partition, one obtains for Q ∈ D γ 2 , with γ > 1, the estimate where we exploited that S is a contraction semigroup. Returning to the telescope sum, we obtain |[Qf ] ts | 2 n(1−γ) Q D γ 2 f W |t − s| γ .
By passing to the limit for n which tends to infinity, we conclude that for any f ∈ W and any [s, t] ⊂ [0, T ] [Qf ] ts = 0 yielding Q = 0, i.e. ΛA = ΛA for any A ∈ D 1+ 3 ∩δ(D 2 ), concluding uniqueness. Towards existence, let A ∈ D 1+ 3 ∩δ(D 2 ), i.e. there exist a B ∈ D 2 and η > 1 such thatδB = A ∈ D η 3 . Let (r n k ) 0 k 2 n be the dyadic partition of [s, t]. We set, following [18]  Let ΛδB ∈ D η 2 be its limit. By a telescope argument  [AS tu f ] uv , (B.3) where the limit is over any partition of [s, t] whose mesh tends to zero.
Proof. The proof is an easy application of the Sewing Lemma and the properties of (D * ,δ). Indeed, observe thatδ(Af − ΛδAf ) = 0 for any f ∈ W , which means that A − ΛδA ∈ Kerδ D 2 , and thus there exists I ∈ D 1 such thatδI = A − ΛδA. ΛδAS tu f uv .
By taking the limit for the mesh which tends to zero and using the fact that ΛδA ∈ D 1+ 2 , the last sum converges to zero.