Three-dimensional magnetohydrodynamics system forced by space-time white noise

We consider the three-dimensional magnetohydrodynamics system forced by noise that is white in both time and space. Its complexity due to four non-linear terms makes its analysis very intricate. Nevertheless, taking advantage of its structure and adapting the theory of paracontrolled distributions from \cite{GIP15}, we prove its local well-posedness. A first challenge is to find an appropriate paracontrolled ansatz which must consist of both the velocity and the magnetic fields. Second challenge is that for some non-linear terms, renormalizations cannot be achieved individually; we overcome this obstacle by strategically coupling certain terms together rather than separately. Our proof is also inspired by the work of \cite{ZZ15}.


Introduction
When solutions to a system of partial differential equations (PDE) lack sufficient regularity, a common remedy is to multiply by a sufficiently smooth function, integrate by parts to rid of any derivative on the solution, and only ask that its integral formulation is well-defined; this is the standard definition of a weak solution (e.g. [16,20]). However, if the PDE are non-linear, then the lack of regularity creates difficulty in understanding any product of the solution with itself because there is no universal agreement on the definition of a product of distributions. Some physically meaningful models which have found rich applications in the real world were forced by a term that is white in both space and time, the so-called spacetime white noise (STWN), ever since its first derivation. A prominent example is the Kardar-Parisi-Zhang (KPZ) equation (4) ([38, Equation (1)]); we also refer to [2,23,33,48] concerning the Boussinesq system forced by STWN. While considering the mild solution formulation typically solved the issue in case the noise is white only in time, the STWN leads to a lack of spatial regularity of the solution, and the construction of a solution has created a significant obstacle because the non-linear term seemed to be ill-defined in the classical sense. Let us briefly describe the very recent remarkable developments that ultimately led to the two novel approaches of the theory of regularity structures by Hairer [30] and the theory of paracontrolled distributions by Gubinelli, et al. [25].
Following the notations of Young [54, pg. 258], given a function f (x) over [x ′ , x ′′ ], let us denote where δ > |x ′′ − x ′ | and call it the p-variation of f . We also write f ∈ W p if V p (f ) < ∞, and point out that the space of functions of bounded variations (BV) corresponds to W p in case p = 1. It follows (see e.g. [22,Proposition 5.3]) that if f is continuous over [x ′ , x ′′ ] and 1 ≤ p ≤ p ′ < ∞, then V p ′ (f ) ≤ V p (f ). It would be instructive here to recall that f is an element of C α , specifically of Hölder continuous with exponent α ≥ 0 if the following norms are finite: where ∂ x ∂ ∂x and ∂ k x ∂ k (∂x) k (e.g. [3,Definition 1.49]). It is immediate that f ∈ C 1 p for p ∈ (0, ∞) is continuous and of finite p-variation, but a function of finite p-variation need not be continuous (consider a step function) (see [22, pg. 78] for this discussion). Let us recall in particular that Brownian motion is locally Hölder continuous with exponent α for every α ∈ (0, 1 2 ) ([37, Chapter 2, Remark 2.12]) but nowhere locally so for α > 1 2 , and in fact it is not locally Hölder continuous with α = 1 2 everywhere on [0, ∞) ( [37, pg. 113-114]). Therefore, Brownian motion has finite p-variation for every p > 2 but does not have finite 2-variation. Now a well known sufficient condition for a Lebesgue-Stieltjes integral g(x)df (x) to be well-defined requires that f, g ∈ BV and at least one of them is continuous (see [20, pg. 103] for more precise statement). Young [54, pg. 265] proved a theorem in which if f ∈ W p , g ∈ W q where p, q > 0, 1 p + 1 q > 1, and they have no common discontinuities, then their Stieltjes integral still exists. In other words, Young's contribution states that instead of requiring both f and g to be in W 1 , each has to satisfy the weaker conditions of W p or W q where 1 p + 1 q > 1 and instead of one function being continuous at all x, each must be continuous wherever the other is not (see also [40]).
In order to understand the implication of Young's theory of integration better, let us now introduce the Navier-Stokes equations (NSE). Let us denote by u : T N × R + → R N and π : T N × R + → R the N -dimensional (N -d) velocity vector field and the pressure scalar field, respectively. Additionally, by denoting by ν ≥ 0 the viscous diffusivity and ∂ t ∂ ∂t , x = (x 1 , . . . , x N ), we are able to write down the NSE as ∂ t u + (u · ∇)u + ∇π − ν∆u = ξ u , ∇ · u = 0, with initial data u 0 (x) u(x, 0), where ξ u is the Gaussian field that is white in both time and space; i.e. E[ξ u (x, t)ξ u (y, s)] = δ(x − y)δ(t − s).
We will also need the definition of the Hölder space with negative exponent; for this purpose, let us recall the basic background of Besov spaces ( [25] and also [34] on how the Littlewood-Paley theory on R 3 may be transferred to T 3 ). Let us use the notation of A a,b B in case there exists a non-negative constant C = C(a, b) that depends on a, b such that A ≤ CB; similarly let us write A ≈ a,b B in case A = CB. Moreover, unless elaborated in detail, we denote k∈Z 3 by k . Firstly we recall the Fourier transform f (x)e ix·k dx with its inverse denoted by F −1 T 3 , let D be the set of all smooth functions with compact support on T 3 , D ′ its dual and thus the set of all distributions on T 3 . We let χ, ρ ∈ D be non-negative, radial such that the support of χ is contained in a ball while that of ρ in an annulus and satisfy χ(ξ) + j≥0 ρ(2 j ξ) = 1 ∀ ξ, supp(χ) ∩ supp(ρ(2 −j ·)) = ∅ ∀ j ≥ 1, supp(ρ(2 −i ·)) ∩ supp(ρ(2 −j ·)) = ∅ for |i − j| > 1.
We realize that χ(·) = ρ(2 −1 ·) and ρ j = ρ(2 −j ·), and define Littlewood-Paley operator as . We also write S j f i≤j−1 ∆ j f . Now for α ∈ R, p, q ∈ [1, ∞], we may define the inhomogeneous Besov space The Hölder-Besov space C α (T 3 ) is the special case when p = q = ∞; i.e. C α (T 3 ) = B α ∞,∞ (T 3 ). For α ∈ (0, ∞) \ N, C α (T 3 ) = C α (T 3 ); however, for k ∈ N, C k (T 3 ) is strictly larger than C k (T 3 ) ( [3, pg. 99]). We point out that Now for simplicity let us consider the 1-d analogue of (u · ∇)u in the NSE (1), specifically u∂ x u corresponding to the non-linear term of the Burgers' equation which was studied by Da Prato, et al. [12]. Following the discussion of [27, pg. 1548], assuming that its solution u ∈ C α for α > 1 2 , we may multiply this non-linear term by a smooth periodic function ψ and understand it as which is well-defined as a Young's integral because ψu ∈ C α for α > 1 2 . Of course, we can also write u∂ x u = 1 2 ∂ x u 2 and integrate by parts. However, when one considers a generalized Burgers' equation with non-linear term of the form g(u)∂ x u where g = ∂ x G for some function G : R → R, as considered by Hairer in [27], integration by parts becomes out of reach and one may only turn to Young's theory of integration. Unfortunately the assumption of u ∈ C α for α > 1 2 turns out to be a wishful thinking. In fact, in the general case when the spatial dimension is N , considering that the space-time dimension is N + 1 so that the scaling S ∈ N N is S = (S 1 , . . . , S N +1 ) = (2, 1, . . . , 1 N-many ) with the first entry informally representing the dimension of time due to ∂ t and ∆, we actually know that ξ ∈ C α (T N ) in space for α < − |S| 2 where |S| = N + 2 by [30,Lemma 10.2] (see also [30,Lemma 3.20]). This leads to u ∈ C α (T N ) for α < 2 − N +2 2 due to regularization from the diffusion (see [30, pg. 417, 481]). Therefore, the Young's integral (3) is ill-defined even in case N = 1 as 2 − N +2 At this point one may turn to the theory of stochastic integrals such as the Ito's integral in hope for some help; such integrals have been known to be extendable to a wider class of semi-martingales and discovered remarkable applications in the real world. However, its limitations have also been noticed over decades (e.g. [25, pg. 6], [27, pg. 1548]); Ito's integral requires an "arrow of time," specifically a filtration and adapted integrands, a probability measure because it is defined as the L 2 -limit of appropriate approximations, and the integrand must have the L 2orthogonal increments. In order to complement the theory of Ito's integrals, Lyons developed a theory of rough path ( [41,42]). As stated on [42, pg. 28], rough path is informally a continuous path on which a sequence of iterated path integrals may be constructed. Recall (e.g. [22, pg. 405]) that a fractional Brownian motion (fBm) β H with Hurst parameter H ∈ (0, 1) is a zero-mean Gaussian process with covariance of Because fBm for H = 1 2 is not a semi-martingale ( [6, pg. 12-13]), the Ito's integral cannot be extended to fBm. Nevertheless, using rough path theory, we may still construct a path-wise integral with respect to fBm (see [22,Proposition 15.5]). Subsequently, Gubinelli in [24] extended the Lyon's rough path theory furthermore; we refer to [21,26,27,32] for further study and applications of rough path theory. As one of the most prominent examples of a result inspired from the rough path theory, let us briefly discuss recent developments of the KPZ equation (4); this discussion will be relevant anyhow because we will subsequently need the notions of renormalizations. The KPZ equation as an interface model of flame propagation was first derived in [38,Equation (1)] as where h(x, t) represents the interface height, λ > 0 is the coupling strength, x ∈ S 1 and ξ h is the STWN. Following [29, pg. 562], inspired by the Cole-Hopf transform, let us consider a multiplicative stochastic heat equation We denote by Z ǫ the solution to the same equation with W replaced by a mollified noise W ǫ , which is obtained from multiplying the k-th Fourier component of W by f (kǫ) for a smooth cut-off function f with compact support such that f (0) = 1. Then Ito's formula shows that h ǫ (x, t) where k∈Z f 2 (kǫ) ≈ 1 ǫ R f 2 (x)dx → ∞. Therefore, informally the limiting process as ǫ ց 0 actually solves [28] for further discussion of such a phenomenon). This simple computation displays the necessity to rely on techniques from quantum field theory (e.g. [46] and [43, Section 4 on pg. 295]) such as renormalization, which amounts to strategically subtracting off a large constant from a regularized equation, and replacing a standard product by Wick product (e.g. [35, pg. 23], also [14,15]) which informally guarantees mean zero condition. Let us point out that these techniques actually have long history of its utility in stochastic quantization. In particular Da Prato and Debussche [11] proved the existence of a unique strong solution to the 2-d stochastic quantization equation for almost all initial data with respect to the invariant measure using such techniques. We also refer to [5,13] for works on the KPZ equation using these techniques. Without delving into further details, we mention that Hairer [29] in particular discovered two additional logarithmically divergent constants beside the 1 ǫ in (5) and successfully introduced a completely new concept of a solution to the KPZ equation (4) using rough path theory.
Let us now discuss this direction of research in the case of the NSE (1). To the best of the author's knowledge, Flandoli and Gozzi [18] were the first to consider the 2-d NSE in T 2 with the forcing that is not regular; they proved in [18,Theorem 4.3] that the Kolmogorov equation associated to the NSE with covariance operator that is an identity has a weak solution. However, due to the spatial roughness of the noise, the authors in [18] were not able to make the connection to the original equation. Da Prato and Debussche [10] overcame this difficulty using techniques of renormalization and Wick products.
At this point let us introduce the magnetohydrodynamics (MHD) system of our main concern because the failure to apply the proofs of [10,18], which we will explain shortly, displays clearly the complexity of the MHD system in contrast to the NSE. Let us denote by b : T N × R + → R N the magnetic N -d vector field and η ≥ 0 the magnetic diffusivity. Then the MHD system reads as ∂ t u + (u · ∇)u + ∇π = ν∆u + (b · ∇)b + ξ u , ∇ · u = 0, (6a) for which we write the solution as y (y 1 , . . . , y 6 ) (u, b) (u 1 , u 2 , u 3 , b 1 , b 2 , b 3 ), with initial data y 0 (x) (u 0 , b 0 )(x) = (u, b)(x, 0), and ξ (ξ u , ξ b ) where ξ u (ξ 1 u , ξ 2 u , ξ 3 u ) = (ξ 1 , ξ 2 , ξ 3 ) and ξ b (ξ 1 b , ξ 2 b , ξ 3 b ) = (ξ 4 , ξ 5 , ξ 6 ), is a Gaussian field which is white in both space and time. For simplicity of computation, let us assume that ν = η = 1 as well as that T 3 ξ u dx = T 3 ξ b dx = 0 which in turn allows us to assume that (u, b) are also mean zero; this may be justified via a standard scaling argument of the solution to the MHD system. Remark 1.1. As a STWN, the correlation of ξ u and that of ξ b are both products of a delta function in x with another delta function in t. In the literature on Boussinesq system such as [23,Equation (3)], the authors make an assumption corresponding to the MHD system that the correlation of ξ u and ξ b vanish; i.e. E[ξ i u ξ j b ] = 0 for all i, j ∈ {1, 2, 3}. Considering that there is no physical reason why ξ u and ξ b should have any independence, in this manuscript we shall assume that the correlation of ξ u and ξ b is also a product of a delta function in x with another delta function in t (see (116) which is a corollary of this assumption). Our computations are thus more general. Indeed, it is easy to recover the case E[ξ i u ξ j b ] = 0 for all i, j ∈ {1, 2, 3} because many terms within our proof vanish due to the mixed non-linear terms such as (u·∇)b and (b·∇)u. This is actually an interesting difference from the case of the NSE; the computations of the mixed non-linear terms can be actually much simpler than the case of the NSE under the assumption of the zero correlation among ξ u and ξ b .
It is well known that if we take the L 2 (T N )-inner products of (1) with u, then the non-linear term, as well as the pressure term, both vanish by divergence-free property; e.g. T 3 (u · ∇)u · udx = 1 2 T 3 (u · ∇)|u| 2 dx = 0. An analogous attempt of taking L 2 -inner products on (6a) with u actually fails because in general. Yet, if we take L 2 (T N )-inner products on (6b) with b simultaneously and add the two resulting equations, then all the non-linear terms and the pressure term in (6a)-(6b) do indeed vanish because T 3 (u · ∇)b · bdx = 1 2 T 3 (u · ∇)|b| 2 dx = 0 and Even though there exist some extensions of techniques on the NSE to the MHD system such as this, attempts to modify the proofs of [10,18] on the 2-d NSE to the 2-d MHD system face a non-trivial difficulty. In both works of [10,18], the authors rely on the following key identity: This follows immediately from the vector calculus identity of ∇ × (∇ × f ) = ∇(∇ · f ) − ∆f , integration by parts and divergence-free property of u; in fact, one of the reasons why the authors admit that extending to other boundary conditions beside T 2 is not easy (e.g. [18, pg. 312]) is exactly this identity (9). The identity (9) was used in [18, pg. 328] and [10, pg. 190], and this identity actually fails in the case of the MHD system because T 3 [(u · ∇)u − (b · ∇)b] · ∆udx = 0 and even if we add similarly to (8), in general. In fact, the identity (9), which is equivalent to has also been used crucially in various other works on the NSE (e.g. [31]), many of which have not been extended to the MHD system with (10) being one of the sources of the technical issues. Zhu and Zhu [56, pg. 4444-4445] gave a very nice discussion of how the proof within [10] cannot be extended to the 3-d NSE and thus most certainly has no chance of being extended to the 3-d MHD system; let us recollect it here. Da Prato and Debussche [10] considered (1) in T 2 , z to be the solution to the linear Stokes equation forced by the fixed STWN ξ u and the equation solved by v u − z, q π − p, specifically Similarly to the discussion of the Burgers' equation in (3), due to [30,Lemma 10.2] (see also [30,Lemma 3.20]) the solution z is very rough, and only in C α (T N ) for α < 1 − N 2 . Thus, if N = 2, then z ∈ C α (T 2 ) for α < 0 and considering div(z ⊗ z) ∈ C α (T 2 ) for α < −1, the diffusion leads to v ∈ C α (T 2 ) for α < 1. This implies that according to Bony's estimates (see Lemma 1.1 (4)) the product v ⊗ v and even v ⊗ z can be well-defined, leaving only z ⊗ z for which one can turn to Wick products, to be described in more detail subsequently. However, in the case N = 3 we would have z ∈ C α (T 3 ) for α < − 1 2 and thus div(z ⊗ z) ∈ C α (T 3 ) for α < −2 so that the diffusion leads to v ∈ C α (T 3 ) for α < 0. This implies that not only z ⊗ z but even z ⊗ v is ill-defined.
Two novel approaches have been developed to bring about a resolution to such an issue, specifically the theory of regularity structures due to Hairer [30] and that of paracontrolled distributions due to Gubinelli, et al. [25]. Both of these theories were strongly inspired by the rough path theory due to Lyons [41]; in fact, it is described in [30,Section 4.4] that rough path may be considered as an example of a regularity structure, and it is also acknowledged on [25, pg. 1] that their work is inspired from the theory of controlled rough path [24].
Without delving into the deep theory of the regularity structures, the heuristic behind it is the key observation that even though a function is typically said to be smooth if it may approximated by a Taylor polynomial, it is already somewhat misguided to apply this notion to a solution of an equation forced by STWN because locally it does not behave similarly to a polynomial but more similarly to the STWN convoluted with the Green function from diffusion. The work of Hairer [30] allows one to construct a regularity structure endowed with a whole set of calculus operations such as multiplication, integration and differentiation, so that one can recover a fixed point theory, and finally rely on the reconstruction theorem to conclude the existence and uniqueness of a solution to the original problem.
On the other hand, the theory of paracontrolled distributions relies heavily on the Bony's decomposition (e.g. [3, g. 86]) beside the rough path theory, which we now describe briefly. The purpose of the Bony's decomposition is to split f g in parts where the frequency of f and g are low and high, specifically The terms π < (f, g) and π > (f, g) are called paraproducts while π 0 (f, g) the remainder. The key observation by Bony was that π < (f, g) and similarly π > (f, g) are well-defined distributions such that the mapping (f, Heuristically π < (f, g) behaves at large frequencies similarly to g, and f provides only a modulation of g at large scales. The key lemma on which we will rely heavily is the following: By our discussion, only difficulty in defining the product f g boils down to π 0 (f, g), and for this purpose, Gubinelli, et al. in [25] relied on a paracontrolled ansatz (see (29) and (32)) and a commutator lemma (see Lemma 4.2).
Beside the work of Zhu and Zhu in [56], we wish to mention the work of Catellier and Chouk [8], by which our work was inspired directly. The purpose of this manuscript is to prove the local existence of a unique solution to the MHD system forced by the STWN (6a)-(6b); the specific statement will be stated in Theorems 1.2 and 3.2. To the best of the author's knowledge, this is a first attempt on the local well-posedness of a system of non-linearly coupled equations forced by STWN. It is worth noting that one should be able to provide another proof of our main result, or its analogue, by relying on the theory of regularity structures. Indeed, Hairer [30] introduced the notion of local subcriticality ([30, Assumption 8.3]), and showed that his theory may be applied particularly to parabolic Anderson model and Φ 4 3 model (see [30, pg. 417-418 and Sections 9.1, 9.2]) because they are locally subcritical. Instead of go through lengthy proof, let us convince ourselves to believe so by observing that the 3-d MHD system, similarly to the 3-d NSE, is locally subcritical satisfying the [30,Assumption 8.3]. Let us denote L ∂ t − ∆ and apply the Leray projection P on the MHD system (6a)-(6b) to deduce Then we may let β = 2 considering ∆ (see [30, pg. 417]) and we already know that |S| = 2 + N and ξ belongs to By [30,Assumption 8.3], e.g. we know U b P b has homogeneity (β +α)+(β +α−1) = 2β + 2α − 1 and the definition of local subcriticality requires 2β + 2α − 1 > α which boils down to 4 > N . Therefore, the 3-d MHD system is indeed locally subcritical.
We wish to point out here that by definition of local subcrticality in [30,Assumption 8.3], the 4-d NSE is actually not locally subcritical, and perhaps may be considered as locally critical if anything. Curiously in [49], there is a lengthy discussion of how fourth dimension is indeed the critical dimension in the study of Serrin regularity criteria, as well as partial regularity theory, for the NSE. This connection seems to be no coincidence and yet remains unclear to the author at the time of writing this manuscript.
Without further ado, let us state our main result; the precise statement is deferred until Theorem 3.2.
. Then there exists a unique local solution to The proof of an analogue of the Theorem 1.2 for the 2-d MHD system goes through verbatim as in our case of the dimension being three. However, considering our discussion concerning local subcriticality before Remark 1.2, we suspect significant difficulty will arise in an attempt to extend Theorem 1.2 for the 4-d case.
Let us also emphasize that it is completely inaccurate and actually misleading to believe that any result on the NSE may be generalized to the MHD system via more computations. As already mentioned, the work of Hairer and Mattingly [31] on the ergodicity of the 2-d NSE seems difficult to be extended to the 2-d MHD system. In the deterministic case, there exist also abundance of results for which an extension from the case of the NSE to the MHD system is a challenging open problem. For example, although Yudovich [55] over 55 years ago proved the global regularity of the solution to the 2-d NSE with zero viscous diffusion, which is the Euler equations, its extension to the 2-d MHD system with zero viscous diffusion remains open despite extensive interest from many mathematicians (e.g. [7,17,36,51] where ǫ ≥ 0 is the Hall parameter. We note that the case ǫ = 0 reduces (12a)-(12b) to the MHD system (6a)-(6b). Since this system was introduced by Lighthill [39] over 75 years ago, it has found rich applications in astrophysics, geophysics and plasma physics; we refer to [1,9] for its study in the deterministic case and [50,53] in the stochastic case. By the same computation of how we showed that the 3-d MHD system is locally subcritical, it can be shown that the N -d Hall-MHD system is not locally subcritical for any N ≥ 2. Indeed, ((∇ × b) · ∇)b from Hall term would become P b P b by the notations of [30,Assumption 8.3] which would have homogeneity of (β + α − 1) + (β + α − 1) with β = 2 and in order for the Hall-MHD system with spatial dimension being N to be locally subcritical, it requires (β + α − 1) + (β + α − 1) > α which boils down to 2 > N . In fact, Hairer [30] (e.g. [30,Abstract]) clear states that his theory works for a semi-linear PDE while the Hall-MHD system is quasi-linear. The author believes extending Theorem 1.2 to the Hall-MHD system is a mathematically challenging and physically meaningful open problem.
Remark 1.5. To the best of the author's knowledge, this is the first well-posedness result for a system of PDE which are non-linearly coupled and forced by STWN. All the previous work on the MHD and related systems forced by random force have been devoted to the case the noise is white in only time and not space (e.g. [4,45,47,52]). With this accomplished, it has become clearer how to establish similar results for other systems such as the Boussinesq system for which its study with STWN has been suggested by physicists for decades ([2, 23, 33, 48]) but shied away by mathematicians due to technical difficulty. Moreover, it will be interesting to study a system of PDE forced partially by STWN. For example, the Boussinesq system with only the equation of the temperature forced by STWN has been studied in the case the noise is white only in time (e.g. [19]).
We employ the theory of paracontrolled distributions from [25] and follow the work of [8,56]; some interesting non-trivial modifications must be made within our proof. In particular, renormalizations must be done very carefully considering the coupling (see Remark 3.4). Moreover, it is crucial to make appropriate paracontrolled ansatz (see (29) and (32)), and this is difficult due to the complex structure of the MHD system (see Remark 3.2).

Preliminaries
We review basic facts and computations on Wick products which will be used repeatedly throughout. We first need the following definition of a Feynman diagram: Definition 1.35]; we also refer to [44]) A Feynman diagram of order n ≥ 0 and rank r ≥ 0 is a graph consisting of a set of n vertices and a set of r edges without common endpoints. There are, thus, r disjoint pairs of vertices, each joined by an edge, and n − 2r unpaired vertices. The Feynman diagram is complete if r = n 2 so that all vertices are paired off and incomplete if r < n 2 so that some vertices are unpaired. A Feynman diagram labelled by n random variables ξ 1 , . . . , ξ n is a Feynman diagram of order n with vertices 1, . . . , n, where we think of ξ i as attached to vertex i. The value of such a labelled Feynman diagram γ with edges (i k , j k ), k = 1, . . . , r, and unpaired vertices {i Similarly we denote the order and rank of a Feynman diagram γ by n(γ) and r(γ), respectively.
The formula which is most useful in order to compute Wick products of the form : ξ 1 . . . ξ n : is the following: Theorem 3.4]) The Wick product is given by where we sum over all Feynman diagrams γ labeled by The following examples are consequences of Lemma 2.1: : : : The following lemma allows us to compute an expectation of products of Wick products.
where the summation is over all complete Feynman diagrams γ labeled by {ξ ij } i,j such that no edge joins two variables with ξ i1j1 and ξ i2j2 with i 1 = i 2 . Finally, the following inequality is standard and will be used many times: We leave the rest of the preliminaries results in the Appendix.
We now specify that and finally, we postpone specific description of the constants; e.g. C ǫ,ij 0,1 , C ǫ,ij 2,3 and C ǫ,ij 1,3 are given in (117), (162) and (195), respectively. Now we consider the following equations and define π 0,⋄ (u j 4 , u i 1 ) of (23b) as follows: where We also define a paracontrolled ansatz of additionally we define Similarly we may define π 0,⋄ (b i 4 , b j 1 ) of (23d) as follows: where π 0,⋄ (P ii1 π < (u i1 We also define a paracontrolled ansatz of additionally we define This step is absolutely crucial and it took a few trials and errors to finally see what it should be, even following the case of the NSE in [56], particularly the signs of the four terms within (32) were not clear at first. We chose (32) in order to make the proof work, particularly bearing in mind the crucial steps at (21), (51) and (31).
For π 0,⋄ (u i 4 , b j 1 ) of (23f), it is essentially identical to π 0,⋄ (u i 4 , u j 1 ) in (27) with u j 1 replaced by b j 1 because u i 4 has already been defined in (29). We leave details here: replaced by u j 1 , which is automatic because we already defined b i 4 in (32). In detail, (26), for all δ ∈ [0, 4] we may compute because e.g.
by (26) and Lemma 4.4. We fix Let us assume that for all i, j, i 1 , j 1 ∈ {1, 2, 3} so that we may define a finite number of let us write C ξ in case ǫ = 0. We mention in particular the inclusion of the last two summations in (37) will be crucial in (70) and (76). Now from (17) we see that by Lemmas 4.5 and 4.4, (35) and (37). Similarly from (18), we may compute by Lemma 4.4, (35) and (37), and therefore Next, from (19)- (21), we may compute by Lemma 4.5 where it is immediate that we may estimate for ǫ ∈ (0, 1) fixed, due to Lemma 4.4, (35) and Remark 3.1. Thus, we now focus on I 2 T . Firstly, we may estimate also for ǫ ∈ (0, 1) fixed, by Lemma 4.4 and Lemma 1.1 (4). Similar computations on other terms in I 2 T of (40) show that for all ǫ ∈ (0, 1) fixed, there exists a maximal existence time T ǫ > 0 and y 4 ∈ C([0, T ǫ ); C Now we set and realize that in the computation of (42), we could have instead estimated  (43). This leads us to the next estimate of by the paracontrolled ansatz (29) and (32). Firstly, we may estimate by Lemma 4.5, Lemma 1.1 (1), and (2). Similar estimates may be deduced for (44). Therefore, we obtain Now we obtain from (29) (20), (23a)-(23d). We make a crucial observation that we can cancel out π < (u i1 Similarly we can compute by (32), that L = ∂ t − ∆, (26), (21), (23e)-(23h). Again we cancel out π < (u j In contrast to the NSE, we not only have to define π 0 (u i by (29) and Leibniz rule. Similarly, by (32) and Leibniz rule. We can define π 0 (u i 4 , b j 1 ) and π 0 (b i 4 , u j 1 ) similarly. We only consider the first four terms in π 0 (u i 4 , u j 1 ) of (52) and π 0 (b i 4 , b j 1 ) of (53) as other terms are similar. For the first term in π 0 (u i 4 , u j 1 ) of (52) we write for the second term in π 0 (u i 4 , u j 1 ) of (52) we write for the third term in π 0 (u i 4 , u j 1 ) of (52) we write and for the fourth term in π 0 (u i 4 , u j 1 ) of (52) we write Similarly we can write the first four terms of we may firstly estimate where we used that −δ ≤ 1 2 − 3δ 2 − δ 0 due to (59), Lemmas 4.2 and 4.5. Remark 3.3. Let us emphasize strongly that this estimate (61) seems very difficult, if not impossible, without relying on the commutator estimate Lemma 4.2, e.g. by utilizing only Lemma 1.1.

3.2.2.
Terms in the fourth chaos. We wish to estimate where V I 1 t is that of (156) of which it suffices to estimate for example a mix term such as second and third terms multiplied; i.e.