Averaging the 2D stochastic wave equation

We consider a 2D stochastic wave equation driven by a Gaussian noise that is white in time and colored in space, with spatial covariance given by a Riesz kernel. Our first main result is a functional central limit theorem for the spatial average of the solution. We also establish a quantitative central limit theorem for the marginal distribution, with the rate of convergence measured in the total-variation distance. A fundamental ingredient in our proofs is a pointwise $L^p$-estimate of the Malliavin derivative, which is of independent interest.


Introduction
We consider the 2D stochastic wave equation for any given β ∈ (0, 2). In other words, the driving noise Ẇ is white in time and has a homogeneous spatial covariance described by the Riesz kernel. Here Ẇ is a distribution-valued field and is a notation for $\frac{\partial^3 W}{\partial t\,\partial x_1\,\partial x_2}$, where the noise W will be formally introduced later.
Because of the choice of boundary conditions (1.3), {u(t, x) : x ∈ R^2} is strictly stationary for any fixed t > 0, meaning that the finite-dimensional distributions of {u(t, x + y) : x ∈ R^2} do not depend on y; see e.g. [7, Footnote 1]. It is then natural to view the solution u(t, x) as a functional over the homogeneous Gaussian random field W. Such Gaussian functionals have been a recurrent topic in probability theory; for example, the celebrated Breuer-Major theorem (see e.g. [1,2,19]) provides the Gaussian fluctuation for the average of a functional subordinated to a stationary Gaussian random field. Therefore, one may wonder whether or not the spatial average of u(t, x) admits Gaussian fluctuation as R → +∞. Here t > 0 is fixed, u(t, x) solves (1.1) and N(0, 1) denotes the standard normal distribution. Recently, the above question has been investigated for stochastic heat equations (see [4,9,10,20]) and for the 1D stochastic wave equation (see [7]). Our work can be seen as an extension of [7] to the two-dimensional case. In Theorem 1.1 below we provide an affirmative answer to the above question; a broader literature overview is given in Remark 3. Let us first fix some notation that will be used throughout this article.
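Before stating the notation, a toy simulation may help fix ideas. The following sketch illustrates the Breuer-Major phenomenon invoked above in a much simpler setting than the SPDE of this paper: normalized averages of a functional of a stationary Gaussian sequence fluctuate like a Gaussian. The AR(1) model, the function f, and all parameters are illustrative choices, not taken from the paper.

```python
import numpy as np

# Toy illustration of a Breuer-Major-type central limit theorem (not the
# SPDE setting of the paper): the normalized average of f(X_k) over a
# stationary Gaussian sequence is asymptotically Gaussian.
rng = np.random.default_rng(0)
a = 0.5                  # AR(1) coefficient, so Cov(X_0, X_k) = a^{|k|}
n, reps = 2000, 2000

x = rng.standard_normal(reps)          # stationary start: X_0 ~ N(0,1)
acc = np.zeros(reps)
for _ in range(n):
    acc += x**2 - 1                    # Hermite-rank-2 functional f(x) = x^2 - 1
    x = a * x + np.sqrt(1 - a**2) * rng.standard_normal(reps)

samples = acc / np.sqrt(n)             # n^{-1/2} * sum_k f(X_k), one per replica
# Breuer-Major limit variance: 2 * sum_{k in Z} rho(k)^2 = 2(1+a^2)/(1-a^2)
sigma2 = 2 * (1 + a**2) / (1 - a**2)
print(samples.mean(), samples.var(), sigma2)
```

The empirical variance of the replicas matches the Breuer-Major limit variance, in line with the Gaussian-fluctuation heuristic for the spatial averages studied here.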
Notation. (1) The expression a ≲ b means a ≤ Kb for some immaterial constant K that may vary from line to line.
(1) The limiting process G has the following stochastic integral representation: where {Y t : t ∈ R + } is a standard Brownian motion.
(2) We point out that σ_R > 0 is part of our main result. Indeed, it is a consequence of our standing assumption σ(1) ≠ 0. In fact, we have the following equivalences: The proof can be done similarly to [7, Lemma 3.4], using Proposition 3.1.
(3) The total-variation distance d_TV induces a much stronger topology than that induced by the Fortet-Mourier distance d_FM, where the latter metrizes convergence in law. For real random variables X, Y, where the first supremum runs over all Borel subsets of R and the second supremum runs over all bounded Lipschitz functions h with ‖h‖_∞ + ‖h′‖_∞ ≤ 1. Our quantitative CLT (1.8) is obtained by the Malliavin-Stein approach, which combines Stein's method of normal approximation with Malliavin's differential calculus on a Gaussian space; see the monograph [15] for a comprehensive treatment. One can also obtain the rate of convergence in other frequently used distances, such as the 1-Wasserstein distance and the Kolmogorov distance, and the corresponding bounds are of the same order as in (1.8).

Let us now briefly illustrate our methodology for proving Theorem 1.1. The main ingredient is the following fundamental estimate on the p-norm of the Malliavin derivative Du(t, x) of the solution u(t, x). It is well known (see e.g. [14]) that Du(t, x) ∈ L^p(Ω; H) for any p ∈ [1, ∞), where H is the Hilbert space associated to the noise W, defined as the completion of C_c^∞(R_+ × R^2) under the inner product where c_β is given in (1.6) and $\mathcal{F}f(s,\xi) = \int_{\mathbb{R}^2} e^{-i x\cdot\xi} f(s,x)\,dx$. (The space C(R_+; R) is equipped with the topology of uniform convergence on compact sets.)

Theorem 1.2. The Malliavin derivative Du(t, x) is a random function, denoted by (s, y) ↦ D_{s,y}u(t, x), and for any p ∈ [2, ∞) and any t > 0, the following estimates hold for almost all (s, y) ∈ [0, t] × R^2: where the constants C_{β,p,t,L} and κ_{p,t} are given in (4.6) and (4.4), respectively.

Before we proceed to explaining our proof strategy, let us provide a brief literature overview.

Remark 3. It was the paper [9] by Huang, Nualart and Viitasaari that first studied spatial averages of the stochastic heat equation in one spatial dimension driven by space-time white noise.
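As a small numerical aside on the total-variation distance used above (illustrative, not from the paper): for two unit-variance Gaussian laws the closed form d_TV(N(0,1), N(μ,1)) = 2Φ(μ/2) − 1 can be cross-checked against the defining formula d_TV = (1/2)∫|p − q|; the mean shift μ and the integration grid below are arbitrary choices.

```python
import math

# Numerical sketch of the total-variation distance between N(0,1) and
# N(mu,1); illustrative only.  Closed form for unit-variance Gaussians:
# d_TV = 2*Phi(mu/2) - 1, used here as a cross-check.
mu = 1.0

def phi(x):          # standard normal density
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):          # standard normal cdf
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

# d_TV(P,Q) = sup_B |P(B) - Q(B)| = (1/2) * integral of |p - q|
dx = 1e-3
d_tv = 0.5 * sum(abs(phi(-10 + i * dx) - phi(-10 + i * dx - mu))
                 for i in range(int(21 / dx))) * dx

closed_form = 2 * Phi(mu / 2) - 1
print(d_tv, closed_form)   # both ≈ 0.3829
```

This also makes concrete why d_TV dominates d_FM: the supremum over indicator test functions is already captured by the density integral above.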
Soon after, the same authors together with Zheng investigated the same equation in higher dimensions; in their paper [10], the spatial correlation is described by the Riesz kernel, as in the present work. The above two references consider noise that is white in time, which provides a natural martingale structure and enables one to take advantage of the Itô calculus mentioned in the previous remark. However, when the noise is colored in time, these tools are no longer available and one has to restrict to the linear equation (that is, σ(u) = u). The linear equation, also known as the parabolic Anderson model, admits an explicit Wiener chaos expansion, and in the work [20] by Nualart and Zheng, similar central limit theorems are established at a qualitative level by using the so-called chaotic central limit theorem (see e.g. [15, Section 6.3]). The authors of [7] first considered the same problem for the stochastic wave equation in one spatial dimension, with driving Gaussian noise that is white in time and fractional in space. Unlike in the heat setting, the fundamental solution of the wave equation differs across dimensions and, as we will see shortly, the analysis in our work is quite different from that in [7]. We also remark that it is natural to study the same problem for wave equations when the noise is colored in time, and obtaining a quantitative central limit theorem in that setting may be a hard problem. Let us now sketch the main steps of the proof of Theorem 1.1, and then present the key steps in proving (1.11).
The typical proof of the functional CLT consists of three steps: (S1) We establish the limiting covariance structure; this is the content of Section 3.1.
In particular, the variance of the spatial average F_R(t) is of order R^{4−β} as R → ∞. As one will see shortly, the important part of this step is the proof of the limit (3.3): Cov(σ(u(s, y)), σ(u(s, z))) → 0 as |y − z| → ∞. This limit is straightforward when σ(u) = u; in the general case, we will apply the Clark-Ocone formula (see Lemma 2.4) to first represent σ(u(s, y)) as a stochastic integral and then apply Itô's isometry in order to break the nonlinearity for further estimation.
(S2) From (S1), we have the covariance structure of the limiting Gaussian process G. Then we will prove the convergence of $\{R^{\frac{\beta}{2}-2} F_R(t) : t \in \mathbb{R}_+\}$ to $\{G_t : t \in \mathbb{R}_+\}$ in finite-dimensional distributions. This is made possible by the following multivariate Malliavin-Stein bound, which we borrow from [9, Proposition 2.3] (see also [15, Theorem 6.1.2]). We denote by D the Malliavin derivative and by δ the adjoint operator of D, characterized by the integration-by-parts formula (2.6). Moreover, $\mathbb{D}^{1,2}$ is the Sobolev space of Malliavin differentiable random variables X ∈ L^2(Ω) with $\mathbb{E}\|DX\|_{\mathcal{H}}^2 < \infty$, and Dom δ is the domain of δ; see Section 2 for more details.
where the implicit constant does not depend on t, s or R. This proves Theorem 1.1.
Finally, let us lay out the plan for proving the fundamental estimate (1.11). The story begins with the usual Picard iteration: we define u_0(t, x) = 1 and, for n ≥ 0, It is a classical result that u_n(t, x) converges in L^p(Ω) to u(t, x), uniformly in x ∈ R^2, for any p ≥ 2; see e.g. [6, Theorem 4.3]. It is now clear that if we assumed σ(1) = 0, we would end up in the trivial case u(t, x) ≡ 1, in view of the above iteration. For each n ≥ 0, u_{n+1}(t, x) is Malliavin differentiable, as one can show by induction on n. Our strategy is to first obtain a uniform estimate of $\sup\{\|D_{s,y}u_n(t,x)\|_p : n \ge 0\}$ and then transfer this estimate to $\|D_{s,y}u(t,x)\|_p$. As mentioned before, Du(t, x) lives in the space H, which contains generalized functions. To overcome this, we will carefully apply the Hardy-Littlewood-Sobolev inequality below to show that Du(t, x) is a random variable in $L^{\frac{4}{4-\beta}}(\mathbb{R}_+\times\mathbb{R}^2)$, with β ∈ (0, 2) fixed throughout this paper.
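A deterministic caricature of the Picard scheme may clarify both the iteration and the degeneracy when σ(1) = 0. The sketch below drops the noise entirely and iterates u_{n+1}(t) = 1 + ∫_0^t σ(u_n(s)) ds; the grid, the iteration count, and the choices of σ are illustrative, not the paper's setting.

```python
import numpy as np

# Deterministic toy analogue of the Picard scheme: u_0 = 1 and
# u_{n+1}(t) = 1 + int_0^t sigma(u_n(s)) ds  (noise dropped).
t = np.linspace(0.0, 1.0, 1001)
dt = t[1] - t[0]

def integrate(f):
    # cumulative trapezoid rule, value 0 at t = 0
    return np.concatenate([[0.0], np.cumsum((f[1:] + f[:-1]) / 2) * dt])

def picard(sigma, n_iter=25):
    u = np.ones_like(t)
    for _ in range(n_iter):
        u = 1.0 + integrate(sigma(u))
    return u

# With sigma(u) = u (so sigma(1) = 1 != 0) the iterates converge to e^t:
u_exp = picard(lambda u: u)

# With sigma(1) = 0, e.g. sigma(u) = u - 1, every iterate is identically 1,
# mirroring the trivial case u(t,x) = 1 mentioned above:
u_triv = picard(lambda u: u - 1.0)
print(np.max(np.abs(u_exp - np.exp(t))), np.max(np.abs(u_triv - 1.0)))
```

The second run shows concretely why σ(1) ≠ 0 is the standing assumption: with σ(1) = 0 the iteration never leaves the constant initial condition.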
Lemma 1.5 (Hardy-Littlewood-Sobolev inequality). Let n ≥ 1, α ∈ (0, n) and 1 < p < q < ∞ satisfy $\frac1q = \frac1p - \frac{\alpha}{n}$. Then there is some constant C that only depends on p, α and n, such that for any locally integrable function g, $\|I_\alpha g\|_{L^q(\mathbb{R}^n)} \le C\,\|g\|_{L^p(\mathbb{R}^n)}$, where, with α ∈ (0, n), $I_\alpha g(x) = \int_{\mathbb{R}^n} |x - y|^{\alpha-n}\, g(y)\,dy$.
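A crude grid experiment can make the scaling in this inequality tangible. The sketch below evaluates the Riesz potential I_α g on a small 2D grid and checks that the ratio ‖I_α g‖_q / ‖g‖_p stays bounded for two different test functions; the exponents match the lemma with n = 2, α = 1, while the grid, domain, and test functions are illustrative choices.

```python
import numpy as np

# Crude grid check of Hardy-Littlewood-Sobolev in R^2:
# n = 2, alpha = 1, p = 4/3 and 1/q = 1/p - alpha/n, i.e. q = 4.
n_grid, half = 40, 3.0
xs = np.linspace(-half, half, n_grid)
h = xs[1] - xs[0]
X, Y = np.meshgrid(xs, xs, indexing="ij")

def riesz_potential(g):
    # I_1 g(x) = int |x - y|^{-1} g(y) dy  (alpha - n = -1)
    out = np.zeros_like(g)
    for i in range(n_grid):
        for j in range(n_grid):
            d = np.sqrt((X - xs[i])**2 + (Y - xs[j])**2)
            d[i, j] = h / 2          # tame the diagonal singularity
            out[i, j] = np.sum(g / d) * h * h
    return out

def ratio(g, p=4/3, q=4.0):
    Ig = riesz_potential(g)
    norm_q = (np.sum(np.abs(Ig)**q) * h * h) ** (1 / q)
    norm_p = (np.sum(np.abs(g)**p) * h * h) ** (1 / p)
    return norm_q / norm_p

r1 = ratio(((np.abs(X) < 1) & (np.abs(Y) < 1)).astype(float))  # square indicator
r2 = ratio(np.exp(-(X**2 + Y**2)))                              # Gaussian bump
print(r1, r2)
```

Both ratios come out finite and moderate, consistent with the uniform constant C of the lemma (the discretization is far too coarse to estimate the sharp constant).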
Once we obtain the uniform estimate of $\sup\{\|D_{s,y}u_n(t,x)\|_p : n \ge 0\}$ and prove that Du(t, x) ∈ $L^{\frac{4}{4-\beta}}(\mathbb{R}_+\times\mathbb{R}^2)$, that is, (s, y) ↦ D_{s,y}u(t, x) is indeed a random function, we proceed to the proof of (1.11). In view of the Clark-Ocone formula (see Lemma 2.4), we have $\mathbb{E}[D_{s,y}u(t,x)\,|\,\mathcal{F}_s] = G_{t-s}(x-y)\,\sigma(u(s,y))$ almost surely, where {F_s : s ∈ R_+} is the filtration generated by the noise; see Section 2.2. Then the lower bound in (1.11) follows immediately from the conditional Jensen inequality, and the upper bound follows from the uniform estimates of $\|D_{s,y}u_n(t,x)\|_p$ by a standard argument.
Before we end this introduction, let us point out another technical difficulty of this paper. After the application of Lemma 1.5 in the course of estimating $\|D_{s,y}u_n(t,x)\|_p$, we will encounter integrals of the form where q ∈ (1/2, 1) and δ ∈ {1, 1/q}. In the case of the stochastic heat equation, the estimation of such integrals is straightforward thanks to the semigroup property. For the wave equation, however, the kernel G_t does not satisfy the semigroup property and the estimation of these integrals is quite involved. In the case of the 1D stochastic wave equation, as one can see from [7], the computations take advantage of the simple form of the fundamental solution (i.e. $\frac12 1_{\{|x-y|<t-s\}}$). In our 2D case, the singularity of the fundamental solution G_{t−s}(x − y) raises the technical difficulty to another level, and we have to estimate the convolution $G_{t-r}^{2q} * G_{r-s}^{2q}$ by exact computations. A basic technical tool used for this problem is the following lemma.
where the implicit constant only depends on q.
The rest of this article is organized as follows: Section 2 collects some preliminary facts for our proofs, Section 3 contains the proof of Theorem 1.1 and Section 4 is devoted to proving the fundamental estimate (1.11).

Acknowledgement:
We are grateful to two referees for their critical comments that improved our work.

Preliminaries
This section provides some preliminary results that are required for further sections. It consists of two subsections: Section 2.1 contains several important facts on the function G t−s (x − y) and Section 2.2 is devoted to a minimal set of results from stochastic analysis, notably the tools from Malliavin calculus.

Basic facts on the fundamental solution.
Let us fix some more notation here.
Recall the function ϕ_{t,R}(r, y) introduced in (1.14). In what follows, we put together several useful facts about the function G_t(z).
The proof of Lemma 2.1 is omitted, as it follows from simple and exact computations. As a consequence of Lemma 2.1-(2), we obtain (2.2). The following lemma is also a consequence of Lemma 2.1.
Here the quantities c β and κ β are given in (1.6).
Proof. Using the Fourier transform as in (1.10), we can write where in the last equality we made the change of variables ξ → ξR^{−1}. Recall the identity (see e.g. [20]), where J_1 is the Bessel function of the first kind of order 1, introduced in (1.7). Then we can rewrite Ψ_R(t_1, t_2; s) accordingly. Since this expression is uniformly bounded over s ∈ (0, t] and converges to t − s as R → ∞, statement (i) holds true by the dominated convergence theorem, with the domination guaranteed by the condition κ_β < ∞.
Remark 4. By inverting the Fourier transform, we have
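The Fourier computation above rests on a classical identity for the 2D wave kernel which can be checked numerically. Assuming the standard fundamental solution $G_t(x) = \frac{1}{2\pi}(t^2-|x|^2)^{-1/2}\,1_{\{|x|<t\}}$ and the convention $\mathcal{F}f(\xi) = \int e^{-ix\cdot\xi}f(x)\,dx$, one has $\mathcal{F}G_t(\xi) = \sin(t|\xi|)/|\xi|$; the sketch below verifies this by radial quadrature (the parameters t, k and grid sizes are arbitrary choices).

```python
import numpy as np

# Check F G_t(xi) = sin(t|xi|)/|xi| for the 2D wave kernel
# G_t(x) = (t^2-|x|^2)^{-1/2}/(2 pi) on {|x| < t} (standard fact).
def trap(y, x):
    return np.sum((y[..., 1:] + y[..., :-1]) * 0.5 * np.diff(x), axis=-1)

def J0(z):
    # Bessel function of the first kind, order 0, via its integral form
    theta = np.linspace(0.0, np.pi, 1001)
    zz = np.atleast_1d(np.asarray(z, dtype=float))[:, None]
    return trap(np.cos(zz * np.sin(theta)), theta) / np.pi

def FG(t, k, m=4001):
    # Radial reduction gives FG_t(k) = int_0^t r (t^2-r^2)^{-1/2} J0(rk) dr;
    # substituting r = t sin(phi) removes the boundary singularity:
    #   FG_t(k) = int_0^{pi/2} t sin(phi) J0(k t sin(phi)) dphi
    phi = np.linspace(0.0, np.pi / 2, m)
    integrand = t * np.sin(phi) * J0(k * t * np.sin(phi))
    return trap(integrand, phi)

t, k = 1.0, 2.0
approx = FG(t, k)
exact = np.sin(t * k) / k
print(approx, exact)
```

The substitution r = t sin φ used here is the same smoothing device exploited repeatedly in Section 4 to handle the inverse square-root singularity of G_t.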

Basic stochastic analysis.
Let H be defined (see (1.9) and (1.10)) as the completion of C_c^∞(R_+ × R^2) under the inner product Consider an isonormal Gaussian process associated to the Hilbert space H, denoted by As the noise is white in time, a martingale structure naturally appears. First we define F_t to be the σ-algebra generated by the P-null sets and which, interpreted as a Dalang-Walsh integral ([5,22]), is a square-integrable F-martingale with quadratic variation given by Let us record a suitable version of the Burkholder-Davis-Gundy inequality (BDG for short); see e.g. [12, Theorem B.1].
We refer interested readers to the book [12] for a nice introduction to Dalang-Walsh theory. For our purposes, we will often apply BDG as follows, when Φ is F-adapted.

Now let us recall some basic facts on the Malliavin calculus associated with W. For any unexplained notation and results, we refer to the book [16]. We denote by C_p^∞(R^n) the space of smooth functions whose partial derivatives all have at most polynomial growth at infinity. Let S be the space of simple functionals of the form Then the Malliavin derivative DF is the H-valued random variable given by The derivative operator D is closable from L^p(Ω) into L^p(Ω; H) for any p ≥ 1, and we define We denote by δ the adjoint of D, given by the duality formula for any F ∈ D^{1,2} and u ∈ Dom δ ⊂ L^2(Ω; H), the domain of δ. The operator δ is also called the Skorohod integral; in the case of Brownian motion, it coincides with an extension of the Itô integral introduced by Skorohod (see e.g. [8,18]). In our context, the Dalang-Walsh integral coincides with the Skorohod integral: any adapted random field Φ that satisfies $\mathbb{E}\|\Phi\|_{\mathcal{H}}^2 < \infty$ belongs to the domain of δ, and The proof of this result is analogous to the case of integrals with respect to Brownian motion. The operators D and δ satisfy the commutation relation By Fubini's theorem and the duality formula (2.6), we can interchange the Skorohod integral and the Lebesgue integral: which gives us (2.8). In particular, the equalities in (1.13) are valid.
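The duality formula E⟨DF, u⟩_H = E[F δ(u)] has a one-dimensional shadow that is easy to verify: for Z ~ N(0,1), deterministic direction, and smooth f, it reduces to the Gaussian integration-by-parts (Stein) identity E[Z f(Z)] = E[f′(Z)]. The check below uses direct quadrature; the test function f is an arbitrary illustrative choice.

```python
import math

# One-dimensional shadow of the Malliavin duality E<DF,h> = E[F delta(h)]:
# for Z ~ N(0,1) and smooth f, E[Z f(Z)] = E[f'(Z)] (Gaussian integration
# by parts).  Verified by midpoint quadrature against the normal density.
def f(x):  return math.sin(x) + x**3
def df(x): return math.cos(x) + 3 * x**2

def gauss_expect(g, lo=-12.0, hi=12.0, n=200000):
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        total += g(x) * math.exp(-x * x / 2) * dx
    return total / math.sqrt(2 * math.pi)

lhs = gauss_expect(lambda x: x * f(x))   # E[Z f(Z)]
rhs = gauss_expect(df)                   # E[f'(Z)]
print(lhs, rhs)   # both ≈ e^{-1/2} + 3 ≈ 3.6065
```

This identity is exactly what the Malliavin-Stein approach leverages, lifted from one Gaussian variable to the isonormal process W.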
With the help of the derivative operator, we can represent any F ∈ D^{1,2} as a stochastic integral. This is the content of the following two-parameter Clark-Ocone formula.
Lemma 2.4 (Clark-Ocone formula). Given F ∈ D^{1,2}, we have, almost surely, We end this section with the following useful fact: if {Φ_s : s ∈ R_+} is a jointly measurable and integrable process satisfying $\int_{\mathbb{R}_+} \mathrm{Var}(\Phi_s)^{1/2}\,ds < \infty$, then
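In the plain Brownian case the Clark-Ocone formula can be checked by hand, which may help the reader unfamiliar with it. For F = W_1² one has D_t F = 2W_1, hence E[D_t F | F_t] = 2W_t and F = E[F] + ∫_0^1 2W_t dW_t with E[F] = 1; the sketch below verifies this pathwise with a left-point (Itô) Riemann sum on a simulated path (a toy version of Lemma 2.4, not the two-parameter setting of the paper).

```python
import numpy as np

# Pathwise check of Clark-Ocone for F = W_1^2 on standard Brownian motion:
# F = 1 + int_0^1 2 W_t dW_t, approximated by a left-point Riemann sum.
rng = np.random.default_rng(1)
n = 100000
dW = rng.standard_normal(n) / np.sqrt(n)   # increments over a grid of mesh 1/n
W = np.concatenate([[0.0], np.cumsum(dW)])

F = W[-1] ** 2
ito_integral = np.sum(2 * W[:-1] * dW)     # Ito sum for int_0^1 2 W_t dW_t
print(F, 1.0 + ito_integral)               # nearly equal pathwise
```

The residual is exactly 1 minus the realized quadratic variation of the path, which concentrates at rate n^{-1/2}.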

Gaussian fluctuation of the spatial averages
We follow the three steps described in our introduction.
It remains to verify (3.3). By Theorem 1.2, for any 0 < s < t, $\|D_{s,y}u(t,x)\|_p \lesssim G_{t-s}(x-y)$.
As a consequence, $\mathbb{E}[\sigma(u(s,y))\,\sigma(u(s,z))] = \xi^2(s) + T(s,y,z)$, where By the chain rule (2.5) for the derivative operator, $D_{r,\gamma}\,\sigma(u(s,y)) = \Sigma_{s,y}\, D_{r,\gamma}u(s,y)$, with Σ_{s,y} an adapted random field uniformly bounded by L, where we recall that L is the Lipschitz constant of σ. This implies, Thus, Suppose |y − z| > 2s; then This implies (3.3) and hence concludes our proof.

3.2.
Convergence of finite-dimensional distributions. As it was explained in the introduction, a basic ingredient for the convergence of finite-dimensional distributions is the following estimate where we recall that V t,R (s, y) = ϕ t,R (s, y)σ(u(s, y)) and ϕ t,R is defined in (1.14).
Note that the Malliavin-Stein bound (1.16) and the above bound (3.4) with t_1 = t_2 = t lead to the quantitative CLT (1.8). In fact, from (3.4) and (1.16) we have, for any fixed t > 0 and Z ∼ N(0, 1), By Proposition 3.1, $\sigma_R^2 R^{\beta-4}$ converges to an explicit positive constant; see (3.1). So we can write, for all R ≥ R_t, $d_{\mathrm{TV}}(F_R(t)/\sigma_R, Z) \le C R^{-\beta/2}$, where R_t is some constant that does not depend on R. As the total-variation distance is always bounded by 1, we can also write, for R ≤ R_t, Therefore, the bound (1.8) follows.
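The patching of the two regimes R ≥ R_t and R ≤ R_t can be written out explicitly (a sketch consistent with the argument above):

```latex
% For R \ge R_t the Malliavin--Stein bound gives the rate directly, while
% for small R the trivial bound d_{TV} \le 1 is absorbed into the constant:
\[
d_{\mathrm{TV}}\big(F_R(t)/\sigma_R,\, Z\big) \;\le\; C\,R^{-\beta/2}
  \quad\text{for } R \ge R_t,
\qquad
d_{\mathrm{TV}}\big(F_R(t)/\sigma_R,\, Z\big) \;\le\; 1
  \;\le\; R_t^{\beta/2}\, R^{-\beta/2}
  \quad\text{for } R \le R_t .
\]
% Hence, with C' := \max\{C,\, R_t^{\beta/2}\},
\[
d_{\mathrm{TV}}\big(F_R(t)/\sigma_R,\, Z\big) \;\le\; C'\,R^{-\beta/2}
  \quad\text{for all } R > 0,
\]
% which is the bound (1.8).
```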
Note that (3.4), together with Proposition 1.3, implies the convergence in law of the finite-dimensional distributions. In fact, fix any integer m ≥ 1, choose m points t_1, ..., t_m ∈ (0, ∞), consider the random vector Φ_R = (F_R(t_1), ..., F_R(t_m)) and let G = (G_1, ..., G_m) denote a centered Gaussian random vector with covariance matrix (C_{i,j})_{1≤i,j≤m} given by Recall from (1.13) that F_R(t_i) = δ(V_{t_i,R}) for all i = 1, ..., m. Then, by (1.12), we can write for every h ∈ C^2(R^m) with bounded second partial derivatives. Thus, in view of (3.5), in order to show the convergence in law of $R^{\frac{\beta}{2}-2}\Phi_R$ to G, it suffices to show that, for any i, j = 1, ..., m, Notice that, by the duality relation (2.6) and the convergence (3.1), we have Therefore, the convergence (3.6) follows immediately from (3.7) and (3.4), and hence so does the convergence of the finite-dimensional distributions. The rest of this subsection is devoted to the proof of (3.4).
The commutation relation (2.7) implies, for s ≤ t, By the chain rule for the derivative operator (see (2.5)), where Σ_{r,z} is an adapted random field bounded by the Lipschitz constant of σ. Substituting (4.9) into (3.8) yields, for s ≤ t, $D_{s,y}F_R(t) = \varphi_{t,R}(s,y)\,\sigma(u(s,y)) + \int_s^t\int_{\mathbb{R}^2} \varphi_{t,R}(r,z)\,\Sigma_{r,z}\, D_{s,y}u(r,z)\, W(dr,dz)$.
Then, by the Cauchy-Schwarz inequality and Theorem 1.2, we see that the covariance term (3.12) is bounded by Now we can plug the last estimate into (3.11) for further computations: $\mathrm{Var}\Big(\int_{\mathbb{R}^4} \varphi_{t_1,R}(s,y)\,\varphi_{t_2,R}(s,z)\,\sigma(u(s,y))\,\sigma(u(s,z))\,|y-z|^{-\beta}\,dy\,dz\Big)$ We only estimate Var(A_1), as the other terms from (3.13) can be estimated in the same way with the same bound. For s ∈ (0, t_1 ∧ t_2], we write, using (1.14), Making the change of variables and using the fact (2.1), we can integrate out Suppose R ≥ t_1 + t_2 and notice that Therefore, integrating out y, y′ in (3.14), we obtain We further integrate out z, z′ and use (2.1) again to write So we obtain $\mathrm{Var}(A_1) \lesssim R^{8-3\beta}$ for R ≥ t_1 + t_2, where the implicit constant does not depend on R.
Next we estimate the variance of A 2 .
So, using the estimate (1.11), we obtain As a consequence, the variance term U_s is in fact a second moment, and it has the same kind of expression as T_s. The same arguments that led to the uniform estimate of T_s then yield This completes the proof of (3.4).

Tightness.
Set $q = \frac{2}{4-\beta} \in (1/2, 1)$. As explained in the introduction, by the Kolmogorov-Chentsov criterion for tightness it is enough to prove the inequality (1.17): for any T > 0, p ≥ 2 and 0 ≤ s < t ≤ T ≤ R, where the implicit constant does not depend on t, s or R.
Proof of (3.15). Recall that $F_R(t) = \int_0^t\int_{\mathbb{R}^2} \varphi_{t,R}(s,y)\,\sigma(u(s,y))\,W(ds,dy)$. Then, by the BDG inequality (2.3) and (1.20), we have, with the convention that ϕ_{s,R}(r, y) = 0 if r > s, Applying Minkowski's inequality yields Note that The first summand S_1 is bounded by $1_{\{r\ge s\}}(t-r)\,1_{\{|y|\le R+t\}} \le (t-s)\,1_{\{|y|\le R+t\}}$, in view of Lemma 2.1-(2). For the second summand, we can write In the same way, the third summand can be bounded as follows Therefore, we can continue with (3.16) to write This implies (3.15).

Fundamental estimate on the Malliavin derivative
This section is devoted to the proof of Theorem 1.2. After a useful lemma, we study the convergence and moment estimates of the Picard approximation in Section 4.1. The main body of the proof of Theorem 1.2 is given in Section 4.2, and the proofs of two technical lemmas are deferred to Section 4.3. Recall that β ∈ (0, 2) is fixed throughout this paper.
Lemma 4.1. Given any random field {Φ(r, z) : (r, z) ∈ R_+ × R^2}, we have, for any x ∈ R^2, 0 ≤ s < t < ∞ and p ≥ 2, where $q = \frac{2}{4-\beta} \in (1/2, 1)$ and the constant K_β only depends on β.

Proof. By (1.20), there exists some constant C_β depending only on β such that where we have used the fact that $G_{t-r}^{2q}(y)\,dy$, with 2q < 2, is a finite measure on R^2 with total mass $\frac{(2\pi)^{1-2q}}{2-2q}\,(t-r)^{2-2q}$, in view of (2.1), and we have set $K_\beta = C_\beta$

Using the estimates (2.4) and (4.1), we can write, with $2q = \frac{4}{4-\beta} \in (1, 2)$, p ≥ 2 and n ≥ 1, Then, using (2.1), we can write where L is the Lipschitz constant of σ. This leads to where $H_n(t) = \sup_{x\in\mathbb{R}^2} \|u_n(t,x)\|_p^2$ and $c_2 := pK^*_\beta L^2 t^{1/q}$. Therefore, iterating the inequality (4.3) and taking into account that H_0(t) = 1 yields In what follows, we will denote by $C^*_\beta$ a generic constant that only depends on β and may change from line to line. In this way, we obtain

4.2. Proof of Theorem 1.2. The proof will be done in several steps.
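The total-mass identity used in the proof above can be checked numerically, assuming the standard 2D wave kernel $G_t(x) = \frac{1}{2\pi}(t^2-|x|^2)^{-1/2}\,1_{\{|x|<t\}}$: for q ∈ (1/2, 1) one has $\int_{\mathbb{R}^2} G_t(x)^{2q}\,dx = \frac{(2\pi)^{1-2q}}{2-2q}\,t^{2-2q}$. The values of β and t below are arbitrary illustrative choices.

```python
import numpy as np

# Numerical check of  int G_t^{2q} = (2 pi)^{1-2q} t^{2-2q} / (2-2q)
# for G_t(x) = (t^2 - |x|^2)^{-1/2} 1_{|x|<t} / (2 pi),  q in (1/2, 1).
beta = 1.0
q = 2 / (4 - beta)          # = 2/3 here
t = 1.5

# Radial reduction: int G_t^{2q} = (2 pi)^{1-2q} int_0^t r (t^2-r^2)^{-q} dr.
# Substituting r = t sin(phi) softens the boundary singularity:
#   integrand becomes t^{2-2q} sin(phi) cos(phi)^{1-2q} on (0, pi/2).
m = 1_000_000
dphi = (np.pi / 2) / m
phi = (np.arange(m) + 0.5) * dphi      # midpoint rule
mass = (2 * np.pi) ** (1 - 2 * q) * np.sum(
    t ** (2 - 2 * q) * np.sin(phi) * np.cos(phi) ** (1 - 2 * q)
) * dphi

exact = (2 * np.pi) ** (1 - 2 * q) * t ** (2 - 2 * q) / (2 - 2 * q)
print(mass, exact)
```

The same radial substitution is what makes the exact computations with $G^{2q}_{t-r} * G^{2q}_{r-s}$ in this section tractable despite the boundary singularity of the kernel.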
Step 1. In this step, we will establish the following estimate (4.5) for the p-norm of the Malliavin derivative of the Picard iteration.

for almost all (s, y) ∈ [0, t] × R^2, where κ_{p,t} is defined in (4.4) and the constant C_{β,p,t,L} is given by with $C^*_\beta$ a constant only depending on β. One key ingredient in proving Proposition 4.2 is the following Lemma 4.3, which is a consequence of the technical Lemma 1.6. Both Lemma 1.6 and Lemma 4.3 will be proved in Section 4.3.

Lemma 4.3. For q ∈ (1/2, 1), δ ∈ [1, 1/q] and s < t, we have where the implicit constant only depends on q.
Proof of Proposition 4.2. Fix (t, x) ∈ R_+ × R^2 and p ≥ 2. Let us first establish the following weaker estimate: for almost all (s, y) ∈ [0, t] × R^2, where the constant C may depend on n. It follows from (4.2) that the claim (4.8) holds true for n = 0, 1, because D_{s,y}u_0(t, x) = 0 and D_{s,y}u_1(t, x) = σ(1)G_{t−s}(x − y). Now suppose the claim (4.8) holds true for some n ≥ 1. Taking the Malliavin derivative on both sides of the equality (4.2) and using the commutation relation (2.7) and the chain rule (2.5), we obtain where $\{\Sigma^{(n)}_{s,y} : (s, y) \in \mathbb{R}_+\times\mathbb{R}^2\}$ is an adapted random field that is uniformly bounded by L for each n; we recall that L is the Lipschitz constant of the function σ appearing in (1.1). Applying Minkowski's inequality and using the induction hypothesis, it follows that where κ_{p,t} is defined in (4.4) and H_0 has been introduced in (1.19). Note that Lemma 1.5 (see (1.20)) gives where the last inequality follows from Lemma 4.3 with δ = 1/q, and $C^*_\beta$ is a constant that only depends on β. Finally, combining the above estimates, we get $\|D_{s,y}u_{n+1}(t,x)\|_p \le C_{n+1}\, G_{t-s}(x-y)$ with $C_{n+1} = 2\kappa_{p,t}^2 + pL^2 C_n^2\, C^*_\beta\, t^{\frac{2}{q}-1}$, and thus, by routine computations, we can show $u_{n+1}(t,x) \in \mathbb{D}^{1,p}$; see also Step 2. This shows (4.8) for each n. Moreover, we point out that D_{s,y}u_{n+1}(t, x) = 0 if s ≥ t.
To obtain the uniform estimate in (4.5), we proceed with the finite iterations where $T^{(n)}_k$ denotes the k-th term in the sum and r_0 = t, z_0 = x. For example, $T^{(n)}_0 = G_{t-s}(x-y)\,\sigma(u_n(s,y))$ and We are going to estimate the p-norm of each term $T^{(n)}_k$ for k = 0, ..., n.
Case k = 0: It is clear that where κ_{p,t} is the constant defined in (4.4).
Then, we can deduce immediately from Lemma 4.3 (with δ = 1/q) that for some generic constant $C^*_\beta$, which only depends on β. Taking (4.9) into account, we obtain Case k = 2: We can write with $N_{r_1,z_1}$ defined to be (4.14) The same arguments used to obtain the bound (4.13) for $T^{(n)}_1$ give (4.15). Substituting (4.15) into (4.14) and applying Lemma 4.3 with δ = 1/q, we obtain In view of (4.9), we obtain Case 3 ≤ k ≤ n: The strategy for these cases is slightly different: we need to get rid of the power $\frac1q$ in order to iterate the integrals in the time variables and obtain a summable series. We can write with $N_{r_1,z_1}$ defined to be which is $\mathcal{F}_{r_1}$-measurable. Then, we deduce from (2.4) and (4.1) that Now we can iterate the above procedure to obtain where $N_{r_{k-1},z_{k-1}}$ is defined to be Therefore, the same arguments used for estimating $\|T^{(n)}_1\|_p^2$ (see (4.12)) lead to with $C^*_\beta$ a generic constant that only depends on β. On the other hand, applying Lemma 4.3 with δ = 1, we can write with the convention z_0 = x and r_0 = t. Plugging the estimates (4.19) and (4.20) into (4.18) yields By the Cauchy-Schwarz inequality and (2.1), In this way, we obtain The indicator function $1_{\{|x-y|<t-s\}}$ appears in (4.21) because Now we can perform the integration with respect to $dz_{k-3}, \dots, dz_1$ one by one to get

Step 2. Up to a subsequence, $Du_n(t,x)$ also converges to Du(t, x) in the weak topology on $L^p(\Omega; L^{2q}(\mathbb{R}_+\times\mathbb{R}^2))$. In particular, we have (2q < 2 ≤ p < ∞) and $D_{s,y}u(t,x)$ is a real function in (s, y).
Step 3. Let us prove the lower bound. By Lemma 2.4, we can write so that a comparison with (1.4) yields $\mathbb{E}[D_{s,y}u(t,x)\,|\,\mathcal{F}_s] = G_{t-s}(x-y)\,\sigma(u(s,y))$ almost everywhere in Ω × R_+ × R^2. It follows that $\big\|\mathbb{E}[D_{s,y}u(t,x)\,|\,\mathcal{F}_s]\big\|_p = G_{t-s}(x-y)\,\|\sigma(u(s,y))\|_p$; thus, by the conditional Jensen inequality, we have $\|D_{s,y}u(t,x)\|_p \ge G_{t-s}(x-y)\,\|\sigma(u(s,y))\|_p$, which is exactly the lower bound in (1.11).
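For completeness, the conditional Jensen step can be written out explicitly: conditional expectation contracts $L^p$-norms, so

```latex
% x -> |x|^p is convex, so conditional Jensen gives, almost surely,
% |E[D_{s,y}u(t,x) | F_s]|^p  <=  E[ |D_{s,y}u(t,x)|^p | F_s ];
% taking expectations and p-th roots yields
\[
G_{t-s}(x-y)\,\big\|\sigma(u(s,y))\big\|_p
  \;=\; \big\|\,\mathbb{E}\big[D_{s,y}u(t,x)\,\big|\,\mathcal{F}_s\big]\big\|_p
  \;\le\; \big\|D_{s,y}u(t,x)\big\|_p .
\]
```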
Step 4. We are finally in a position to prove the upper bound in (1.11). Put p⋆ = p/(p − 1), the conjugate exponent of p. Pick a nonnegative function M ∈ C_c(R_+ × R^2) and a random variable Z ∈ L^{p⋆}(Ω) with ‖Z‖_{p⋆} ≤ 1. Since Du_n(t, x) converges to Du(t, x) in the weak topology on $L^p(\Omega; L^{2q}(\mathbb{R}_+\times\mathbb{R}^2))$ along some subsequence (say $Du_{n_k}(t,x)$), we have, in view of (4.5), This implies that, for almost all (s, y) ∈ [0, t] × R^2, $\mathbb{E}[Z\, D_{s,y}u(t,x)] \le C_{\beta,p,t,L}\,\kappa_{p,t}\, G_{t-s}(x-y)$. Taking the supremum over {Z : ‖Z‖_{p⋆} ≤ 1} yields $\|D_{s,y}u(t,x)\|_p \le C_{\beta,p,t,L}\,\kappa_{p,t}\, G_{t-s}(x-y)$, which finishes the proof.

Proof of technical lemmas.
For convenience, let us recall Lemma 1.6 below.
To derive the expression (4.24) for I, we have used the fact that the Jacobian of the change of variables is $\frac{\partial(x,y)}{\partial(\xi,\eta)}$ Then, integrating first in the variable y yields where $D(x) = \{y \in \mathbb{R} : (x, y) \in D\} = \{y \in \mathbb{R} : y < s^2,\ (\sqrt{x}-w)^2 < y < (\sqrt{x}+w)^2\}$ and $S_q(x) =$ Let us first deal with S_q(x) for every x ∈ (0, t^2). There are two possible cases, depending on the value of x: (4.26) where w = |z| > 0 and $0 > \delta(1-2q) \ge \frac{1}{q} - 2 > -1$. Define