Parametrix techniques and martingale problem for some degenerate Kolmogorov’s equations

Abstract. We prove the uniqueness of the martingale problem associated with some degenerate operators. The key point is to exploit the strong parallel between the new technique introduced by Bass and Perkins [2] to prove uniqueness of the martingale problem in the framework of non-degenerate elliptic operators and the McKean and Singer [13] parametrix approach to the density expansion, which has previously been extended to the degenerate setting that we consider (see Delarue and Menozzi [3]).


Martingale problem and parametrix techniques
The martingale approach turns out to be particularly useful when trying to get uniqueness results for the stochastic process corresponding to an operator. In a recent work, R. Bass and E. Perkins [2] introduced, in the framework of non-degenerate, non-divergence, time-homogeneous operators, a new technique to prove uniqueness of the associated martingale problem. Precisely, for an operator of the form
$$L f(x) = \frac{1}{2}\mathrm{Tr}\big(a(x) D_x^2 f(x)\big), \quad f \in C_0^2(\mathbb{R}^d,\mathbb{R}),\ x \in \mathbb{R}^d, \qquad (1.1)$$
the authors prove uniqueness provided $a$ is uniformly elliptic, bounded and uniformly $\eta$-Hölder continuous in space ($\eta \in (0,1]$), i.e. there exists $C > 0$ s.t. for all $(x,y) \in (\mathbb{R}^d)^2$, $|a(x) - a(y)| \le C(1 \wedge |x-y|^{\eta})$. That is, for a given starting point $x \in \mathbb{R}^d$, there exists a unique probability measure $\mathbb{P}$ on $C(\mathbb{R}^+,\mathbb{R}^d)$ s.t., denoting by $(X_t)_{t\ge 0}$ the canonical process, $\mathbb{P}[X_0 = x] = 1$ and, for all $f \in C_0^2(\mathbb{R}^d,\mathbb{R})$, $f(X_t) - f(x) - \int_0^t L f(X_s)\,ds$ is a $\mathbb{P}$-martingale.

In the indicated framework, this result can be derived from the more involved Calderón-Zygmund like $L^p$ estimates established by Stroock and Varadhan [17], which only require continuity of the diffusion matrix $a$, or, from a more analytical viewpoint, from some appropriate Schauder estimates, see e.g. Friedman [5].

DOI: 10.1214/ECP.v16-1619

Anyhow, the technique introduced in [2] can be related to the first step of Gaussian approximation in the parametrix expansion of the fundamental solution of (1.1) developed by McKean and Singer [13], which we now briefly describe. Suppose first that, in addition to the previous assumptions of ellipticity, boundedness and uniform Hölder continuity, the diffusion coefficient $a$ is smooth (say $C^{\infty}$ on $\mathbb{R}^d$). Then the fundamental solution $p(s,t,x,y)$ of (1.1) exists and is smooth for $t > s$, see e.g. [5]. Precisely, we have:
$$\partial_t p(s,t,x,y) = L^* p(s,t,x,y), \quad t > s,\ (x,y) \in (\mathbb{R}^d)^2, \qquad p(s,t,x,\cdot) \to \delta_x(\cdot) \ \text{as } t \downarrow s,$$
where $L^*$ stands for the adjoint of $L$ and acts on the $y$ variable.
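For fixed starting and final points $x, y \in \mathbb{R}^d$ and a given final time $t > 0$, in order to estimate $p(0,t,x,y)$, one introduces the Gaussian process $\tilde X_u^y = x + \sigma(y) W_{u-s}$, $u \in [s,t]$, $s \le t$, where $(W_u)_{u\in[0,t-s]}$ is a standard $d$-dimensional Brownian motion and $\sigma\sigma^*(y) = a(y)$. Observe that the coefficient of $\tilde X^y$ is frozen here at the point where we consider the density. Denote by $\tilde p^y(s,t,x,\cdot)$ the density of $\tilde X^y$ at time $t$ starting from $x$ at time $s$, and for $\phi \in C_0^2(\mathbb{R}^d,\mathbb{R})$, define by $\tilde L^y \phi(x) = \frac{1}{2}\mathrm{Tr}(a(y) D_x^2 \phi(x))$ its generator. The density $\tilde p^y(s,t,x,\cdot)$ satisfies the Kolmogorov equation:
$$\partial_s \tilde p^y(s,t,x,z) = -\tilde L^y \tilde p^y(s,t,x,z), \quad t > s,\ (x,z) \in (\mathbb{R}^d)^2, \qquad \tilde p^y(s,t,\cdot,z) \to \delta_z(\cdot) \ \text{as } s \uparrow t,$$
where $\tilde L^y$ acts here on the $x$ variable. Take now $z = y$ in the above equation. By formal derivation and the previous Kolmogorov equations we obtain:
$$p(0,t,x,y) - \tilde p^y(0,t,x,y) = \int_0^t ds\, \partial_s \int_{\mathbb{R}^d} dw\, p(0,s,x,w)\,\tilde p^y(s,t,w,y) = \int_0^t ds \int_{\mathbb{R}^d} dw\, p(0,s,x,w)\,(L - \tilde L^y)\tilde p^y(s,t,w,y) =: p \otimes H(0,t,x,y), \qquad (1.2)$$
where $H(s,t,w,y) := (L - \tilde L^y)\tilde p^y(s,t,w,y)$. The kernel $H$ involves the difference $a(w) - a(y)$ multiplied by second order derivatives of $\tilde p^y$, which are singular in $(t-s)^{-1}$. The previous uniform Hölder continuity assumption on $a$ is therefore a sufficient (and quite sharp) condition to remove the time-singularity in $H$.

The idea of the parametrix expansion is then to proceed in (1.2) by applying the same freezing technique to $p(0,s,x,w)$, introducing the density $\tilde p^w(0,s,x,\cdot)$ of the process with coefficients frozen at point $w$. One eventually gets the formal expansion
$$p(0,t,x,y) = \tilde p(0,t,x,y) + \sum_{k \ge 1} \tilde p \otimes H^{\otimes k}(0,t,x,y), \qquad (1.3)$$
where $H^{\otimes k}$, $k \ge 1$, stands for the iterated convolutions of $H$, and $\forall (s,t,z,y) \in (\mathbb{R}^+)^2 \times (\mathbb{R}^d)^2$, $\tilde p(s,t,z,y) := \tilde p^y(s,t,z,y)$. The Hölder continuity gives that $H$ is a "smoothing" kernel in the sense that there exist $c, C > 0$ s.t. the Gaussian upper bound (1.4) holds, with $c, C$ depending on $d$, the uniform ellipticity constant and the $L^{\infty}$ bound of $a$, and $C$ depending on $T$ as well. We refer to Konakov and Mammen [9] for details in this framework.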
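To see concretely why Hölder continuity removes the time singularity of $H$, the following sketch bounds the kernel in the uniformly elliptic setting of (1.1). It relies on the standard derivative estimate for a non-degenerate Gaussian density; the constants $c, c', C$ are generic (they may change from line to line) and are assumptions of this sketch, not the constants of the original statements.

```latex
% Sketch with generic constants c, c', C > 0 (assumed, not taken from the source).
% H is the kernel of (1.2): H(s,t,w,y) = (L - \tilde L^y) \tilde p^y(s,t,w,y).
\begin{align*}
H(s,t,w,y) &= \tfrac12 \mathrm{Tr}\big( (a(w)-a(y))\, D_w^2 \tilde p^y(s,t,w,y) \big), \\
|D_w^2 \tilde p^y(s,t,w,y)| &\le \frac{C}{t-s}\,(t-s)^{-d/2}
   \exp\Big(-c\,\frac{|w-y|^2}{t-s}\Big)
   && \text{(Gaussian density with covariance } (t-s)\,a(y)\text{)}, \\
|H(s,t,w,y)| &\le C\,\frac{|w-y|^{\eta}}{t-s}\,(t-s)^{-d/2}
   \exp\Big(-c\,\frac{|w-y|^2}{t-s}\Big) \\
 &= C\,(t-s)^{-1+\eta/2}\Big(\frac{|w-y|}{(t-s)^{1/2}}\Big)^{\eta}(t-s)^{-d/2}
   \exp\Big(-c\,\frac{|w-y|^2}{t-s}\Big) \\
 &\le C\,(t-s)^{-1+\eta/2}\,(t-s)^{-d/2}
   \exp\Big(-c'\,\frac{|w-y|^2}{t-s}\Big), \qquad 0 < c' < c .
\end{align*}
```

Since $-1 + \eta/2 > -1$, the remaining singularity is integrable in time: $\int_0^t (t-s)^{-1+\eta/2}\,ds = (2/\eta)\,t^{\eta/2}$. The off-diagonal Gaussian term is what absorbs the factor $|w-y|^{\eta}$.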
Up to now we supposed $a$ was smooth in order to guarantee the existence of the density and justify the formal derivation in (1.2). On the other hand, the r.h.s. of (1.3) can be defined without any smoothness of $a$ beyond uniform $\eta$-Hölder continuity. The Gaussian upper bound (1.4) also only depends on the Hölder regularity of $a$. A natural question is whether the r.h.s. of (1.3) corresponds to the density of some stochastic differential equation under the sole assumptions of uniform ellipticity, boundedness and Hölder continuity of $a$. A positive answer is given by the uniqueness of the martingale problem associated to (1.1). Indeed, considering a sequence of equations with mollified coefficients, we derive from convergence in law, the Radon-Nikodym theorem and (1.4) that the unique weak solution of $dX_t = \sigma(X_t)\,dW_t$ associated to $L$ admits a density that satisfies the previous Gaussian bound. It is actually remarkable that the uniqueness of the martingale problem can be proved using exactly the smoothing properties of the previous kernel $H$. That is what was achieved by Bass and Perkins [2] in the framework we described, and it is the main purpose of this note in a degenerate setting.
To conclude this paragraph, let us emphasize that the previous parametrix approach has been used in various contexts. It turns out to be particularly well suited to the approximation of the underlying processes by Markov chains, see Konakov and Mammen [9,10] for the non-degenerate continuous case or [11] for the approximation of stable-driven SDEs. On the other hand, we recently used this technique to give a local limit theorem for the Markov chain approximation of a Langevin process [12] and two-sided bounds for some more general degenerate hypoelliptic operators [3]. In particular, in both works, we have an unbounded drift term. The unboundedness of the first order term imposes a more subtle strategy than the previous one for the choice of the frozen Gaussian density. Namely, one has to take into consideration in the frozen process the "geometry" of the deterministic differential equation associated to the first order terms of the operator. This will be thoroughly explained in the next section. Anyhow, the strategy of the previous articles allows one to extend the technique of Bass and Perkins to prove uniqueness of the martingale problem for some degenerate operators with unbounded coefficients.

Statement of the Problem and Main Results
Consider the following system of Stochastic Differential Equations (SDEs in short) (1.5), $(W_t)_{t\ge 0}$ standing for a $d$-dimensional Brownian motion, and each $(X_t^i)_{t\ge 0}$, $1 \le i \le n$, being $\mathbb{R}^d$-valued as well. From the viewpoint of applications, systems of type (1.5) appear in many fields. Let us for instance mention for $n = 2$ stochastic Hamiltonian systems (see e.g. Soize [16] for a general overview or Talay [18] and Hérau and Nier [6] for convergence to equilibrium). Again for $n = 2$, the above dynamics is used in mathematical finance to price Asian options (see for example [1]). For $n \ge 2$, it appears in heat conduction models (see e.g. Eckmann et al. [4] and Rey-Bellet and Thomas [15] when the chain is forced by two heat baths).
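In what follows, we denote a quantity in $\mathbb{R}^{nd}$ by a bold letter: i.e. $\mathbf{0}$ stands for zero in $\mathbb{R}^{nd}$ and the solution $(X_t^1,\dots,X_t^n)_{t\ge 0}$ to (1.5) is denoted by $(\mathbf{X}_t)_{t\ge 0}$. Introducing the embedding matrix $B$ from $\mathbb{R}^d$ into $\mathbb{R}^{nd}$, i.e. $B = (I_d, 0, \dots, 0)^*$, where "$*$" stands for the transpose, we rewrite (1.5) in the shortened form
$$d\mathbf{X}_t = \mathbf{F}(t,\mathbf{X}_t)\,dt + B\sigma(t,\mathbf{X}_t)\,dW_t.$$
We introduce the following assumptions:

(R-η) The functions $(F_i)_{i\in[[1,n]]}$ are uniformly Lipschitz continuous with constant $\kappa > 0$ (alternatively we can suppose for $i = 1$ that the drift $F_1$ of the non-degenerate component is measurable and bounded by $\kappa$). The diffusion matrix $(a(t,\cdot))_{t\ge 0}$ is uniformly $\eta$-Hölder continuous in space with constant $\kappa$, i.e.
$$\sup_{t\ge 0}\ \sup_{\mathbf{x}\neq\mathbf{y}} \frac{|a(t,\mathbf{x}) - a(t,\mathbf{y})|}{|\mathbf{x}-\mathbf{y}|^{\eta}} \le \kappa.$$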
There exists a closed convex subset which is an open set.
Assumptions (UE), (ND-η) can be seen as a kind of (weak) Hörmander condition. They allow the non-degenerate noise of the first component to be transmitted to the other ones. Also, the particular structure of $\mathbf{F}(t,\cdot) = (F_1(t,\cdot),\dots,F_n(t,\cdot))$ yields that the $i$-th component has intrinsic time scale $(2i-1)/2$. We notice that the coefficients may be irregular in time. The last part of Assumption (ND-η) will be explained in Section 2.1. We say that assumption (A-η) is satisfied if (R-η), (UE), (ND-η) hold.

Under (A-η), we established in [3] Gaussian Aronson-like estimates for the density of (1.5) over compact time intervals $[0,T]$, for $\eta > 1/2$. Precisely, we proved that the unique weak solution of (1.5) admits a density that satisfies, for all $T > 0$, the two-sided Gaussian bound (1.6) for some constant $C$. To derive (1.6), we proceeded using a "formal" parametrix expansion, considering a sequence of equations with smooth coefficients for which Hörmander's theorem guaranteed the existence of the density, see e.g. Hörmander [7] or Norris [14]. Anyhow, as in the previous paragraph, our estimates did not depend on the derivatives of the mollified coefficients but only on the $\eta$-Hölder continuity assumed in (A-η). Still, to pass to the limit following the previously described procedure, some uniqueness in law is needed. Using the comparison principle for viscosity solutions of fully non-linear PDEs, see Ishii and Lions [8], we managed to obtain the bounds under (A-η), $\eta > 1/2$. However, the viscosity approach totally ignores the smoothing effects of the heat kernel and is not a "natural technique" to derive uniqueness in law.

Introduce the generator of (1.5): for all $\varphi \in C_0^2(\mathbb{R}^{nd},\mathbb{R})$ and $\mathbf{x} \in \mathbb{R}^{nd}$,
$$L_t\varphi(\mathbf{x}) = \langle \mathbf{F}(t,\mathbf{x}), D_{\mathbf{x}}\varphi(\mathbf{x})\rangle + \frac{1}{2}\mathrm{Tr}\big(a(t,\mathbf{x})\,D_{x_1}^2\varphi(\mathbf{x})\big). \qquad (1.7)$$
Adapting the technique of Bass and Perkins [2] we obtain the following results.
In particular, weak uniqueness in law holds for the SDE (1.5).
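As an illustration of how the noise of the first component reaches the others, and of the different time scales involved, one can look at the simplest degenerate system of this kind, Kolmogorov's example with $n = 2$, $d = 1$. The specific choice $F_1 = 0$, $F_2(\mathbf{x}) = x_1$, $\sigma = 1$ is an assumption made here for illustration only and is not taken from the text above.

```latex
% Illustrative example (assumed coefficients): F_1 = 0, F_2(x) = x_1, sigma = 1.
\begin{align*}
& dX_t^1 = dW_t, \qquad dX_t^2 = X_t^1\,dt, \qquad (X_0^1, X_0^2) = (x_1, x_2), \\
& X_t^1 = x_1 + W_t, \qquad X_t^2 = x_2 + x_1 t + \int_0^t W_s\,ds, \\
& \mathrm{Var}(X_t^1) = t, \qquad
  \mathrm{Var}(X_t^2) = \int_0^t\!\!\int_0^t (u \wedge v)\,du\,dv = \frac{t^3}{3}, \\
& L = \tfrac12\,\partial_{x_1}^2 + x_1\,\partial_{x_2}, \qquad
  [\partial_{x_1},\ x_1\partial_{x_2}] = \partial_{x_2} .
\end{align*}
```

In this example the first component fluctuates at scale $t^{1/2}$ and the second at scale $t^{3/2}$, and the commutator of the diffusion direction with the drift recovers the missing direction $\partial_{x_2}$, which is the weak Hörmander mechanism in its simplest form.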

Choice of the Gaussian process for the parametrix
In this section we describe the Gaussian processes that will be involved in the study of the martingale problem and that have previously been used in the parametrix expansions of [3]. We first introduce in Section 2.1 a class of degenerate linear stochastic differential equations that admit a density satisfying bounds similar to those of equation (1.6). We then specify how to properly linearize the dynamics of (1.5) so that the linearized equations belong to the class considered in Section 2.1.

Some estimates on degenerate Gaussian processes with linear drift
Introduce the stochastic differential equation (2.1), where $U_t \in \mathbb{R}^{nd}\otimes\mathbb{R}^{nd}$ is an "upper triangular" block matrix with zero entries on its first $d$ rows. We suppose that the coefficients satisfy assumption $(A^{\mathrm{linear}})$. Denote by $(R(s,t))_{0\le s,t}$ the resolvent associated to the linear drift of (2.1). From Propositions 3.1 and 3.4 in [3], the family $(K(s,t))_{t\in(s,T]}$ of covariance matrices associated to the Gaussian process $(\mathbf{G}_t^{s,\mathbf{x}})_{t\in(s,T]}$ satisfies, under $(A^{\mathrm{linear}})$, a "good scaling property". Precisely, the family $(K(s,t))_{t\in(s,T]}$ satisfies under $(A^{\mathrm{linear}})$ a good scaling property with constant $C := C(T,(A^{\mathrm{linear}}))$.
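For a concrete instance of how such covariance matrices scale, one can take the linear system $dG_t^1 = dW_t$, $dG_t^2 = G_t^1\,dt$ with $n = 2$, $d = 1$. This choice of coefficients, the scaling matrix $\mathbb{T}_t := \mathrm{diag}(t, t^2)$ and the matrix $M$ below are notation assumed for this sketch, not taken from the text.

```latex
% Illustrative example (assumed): dG^1 = dW, dG^2 = G^1 dt, with T_t := diag(t, t^2).
\begin{align*}
K(s,t) &= \begin{pmatrix} t-s & \tfrac12 (t-s)^2 \\[2pt] \tfrac12 (t-s)^2 & \tfrac13 (t-s)^3 \end{pmatrix}
        = (t-s)^{-1}\,\mathbb{T}_{t-s}\, M\, \mathbb{T}_{t-s},
  \qquad M := \begin{pmatrix} 1 & \tfrac12 \\[2pt] \tfrac12 & \tfrac13 \end{pmatrix}, \\
\langle K(s,t)\,y, y\rangle &= (t-s)^{-1}\,\langle M\,\mathbb{T}_{t-s}\,y,\ \mathbb{T}_{t-s}\,y\rangle, \\
\lambda_{\pm}(M) &= \frac{4 \pm \sqrt{13}}{6}, \qquad
\lambda_-(M)\,(t-s)^{-1}|\mathbb{T}_{t-s}\,y|^2 \ \le\ \langle K(s,t)\,y, y\rangle \ \le\ \lambda_+(M)\,(t-s)^{-1}|\mathbb{T}_{t-s}\,y|^2 .
\end{align*}
```

Both eigenvalues of $M$ are positive and do not depend on $s, t$, so in this example the covariance is comparable, uniformly in time, to the rescaled identity $(t-s)^{-1}\mathbb{T}_{t-s}^2$. This is the kind of two-sided comparison that a scaling property of the covariance is meant to capture.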

Remark 2.1. We point out that it is precisely the second assumption of $(A^{\mathrm{linear}})$ concerning the existence of convex subsets
This means that the off-diagonal bound of Gaussian processes with dynamics (2.1) and fulfilling $(A^{\mathrm{linear}})$ is homogeneous to the square of the difference between the final point $\mathbf{y}$ and $R(t,s)\mathbf{x}$ (which corresponds to the transport of the initial condition by the deterministic system deriving from (2.1), that is

Linearization of the initial dynamics and associated estimates
The crucial feature of the parametrix method described in the introduction was to choose a "good" process to approximate the density of the diffusion. In the uniformly elliptic case, with bounded coefficients, one could take, as a first approximation, the Gaussian process with coefficients frozen in space at the fixed final spatial point where we wanted to estimate the density. The choice is natural since it makes the kernel $H$ (defined in (1.2)) "compatible" with the bounds of the frozen density. It is precisely the off-diagonal term $\exp(-c|x-y|^2/(t-s))$ that allows one to balance the singularity in $|x-y|^{\eta}/(t-s)$ coming from the second order spatial derivatives. In their work [2], Bass and Perkins exploited exactly the specific behavior of the singular kernel $H$, which has an integrable singularity in time at 0 (see their Proposition 2.3), to derive uniqueness of the martingale problem in the non-degenerate time-homogeneous framework. This approach provides a natural link between parametrix expansions and the study of martingale problems for uniformly Hölder continuous coefficients.

Parametrix expansions, to derive density estimates on systems of the form (1.5), have been discussed in [3]. We thus have, in the current degenerate framework of assumption (A-η), a natural candidate, defined below. The key idea is to consider a "degenerate" Gaussian process whose density nevertheless has a specific "off-diagonal" behavior similar to the one exhibited in equation (2.3), and to choose the freezing process so that the singularity deriving from $H$ is still compatible with the "off-diagonal" bound, in the sense that it will be sufficient to remove the time-singularity. We follow the same line of reasoning in our current framework.
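For fixed parameters $T > 0$, $\mathbf{y} \in \mathbb{R}^{nd}$, introduce the linear equation:
$$d\tilde{\mathbf{X}}_t^{T,\mathbf{y}} = \big[\mathbf{F}(t,\boldsymbol{\theta}_{t,T}(\mathbf{y})) + D\mathbf{F}(t,\boldsymbol{\theta}_{t,T}(\mathbf{y}))\,(\tilde{\mathbf{X}}_t^{T,\mathbf{y}} - \boldsymbol{\theta}_{t,T}(\mathbf{y}))\big]\,dt + B\sigma(t,\boldsymbol{\theta}_{t,T}(\mathbf{y}))\,dW_t, \qquad (2.4)$$
where $(\boldsymbol{\theta}_{t,T}(\mathbf{y}))_{t\ge 0}$ solves the ODE $\frac{d}{dt}\boldsymbol{\theta}_{t,T}(\mathbf{y}) = \mathbf{F}(t,\boldsymbol{\theta}_{t,T}(\mathbf{y}))$, $t \ge 0$, with the boundary condition $\boldsymbol{\theta}_{T,T}(\mathbf{y}) = \mathbf{y}$ and, for all $(t,\mathbf{x}) \in [0,T]\times\mathbb{R}^{nd}$, $D\mathbf{F}(t,\mathbf{x})$ is the subdiagonal of the Jacobian matrix $D_{\mathbf{x}}\mathbf{F}$. Write $\tilde p^{T,\mathbf{y}}(t,T,\mathbf{x},\cdot)$ for the density of $\tilde{\mathbf{X}}_T^{T,\mathbf{y}}$ starting from $\mathbf{x}$ at time $t$. The deterministic ODE associated with $\tilde{\mathbf{X}}^{T,\mathbf{y}}$ has the form
$$\frac{d}{dt}\tilde{\boldsymbol{\phi}}_t = \mathbf{F}(t,\boldsymbol{\theta}_{t,T}(\mathbf{y})) + D\mathbf{F}(t,\boldsymbol{\theta}_{t,T}(\mathbf{y}))\,[\tilde{\boldsymbol{\phi}}_t - \boldsymbol{\theta}_{t,T}(\mathbf{y})], \quad t \ge 0. \qquad (2.5)$$
We denote by $(\tilde{\boldsymbol{\theta}}^{T,\mathbf{y}}_{t,s})_{s,t\ge 0}$ the associated flow, which is affine:
$$\tilde{\boldsymbol{\theta}}^{T,\mathbf{y}}_{t,s}(\mathbf{x}) = R^{T,\mathbf{y}}(t,s)\,\mathbf{x} + \int_s^t R^{T,\mathbf{y}}(t,u)\,\big[\mathbf{F}(u,\boldsymbol{\theta}_{u,T}(\mathbf{y})) - D\mathbf{F}(u,\boldsymbol{\theta}_{u,T}(\mathbf{y}))\,\boldsymbol{\theta}_{u,T}(\mathbf{y})\big]\,du.$$
Above, $(R^{T,\mathbf{y}}(t,s))_{s,t\ge 0}$ stands for the resolvent associated with the matrices $(D\mathbf{F}(t,\boldsymbol{\theta}_{t,T}(\mathbf{y})))_{t\ge 0}$. We now claim

Lemma 2.1. Let $T_0 > 0$ be fixed. There exists a constant $C_{2.1} \ge 1$, depending on (A) and $T_0$, such that, for any $t \in [0,T)$, $T \le T_0$ and $\mathbf{x}, \mathbf{y} \in \mathbb{R}^{nd}$,
$$C_{2.1}^{-1}\,\big|\mathbb{T}_{T-t}^{-1}\big(\mathbf{x} - \boldsymbol{\theta}_{t,T}(\mathbf{y})\big)\big| \le \big|\mathbb{T}_{T-t}^{-1}\big(\tilde{\boldsymbol{\theta}}^{T,\mathbf{y}}_{T,t}(\mathbf{x}) - \mathbf{y}\big)\big| \le C_{2.1}\,\big|\mathbb{T}_{T-t}^{-1}\big(\mathbf{x} - \boldsymbol{\theta}_{t,T}(\mathbf{y})\big)\big|.$$

This means that we can compare the rescaled "forward" transport of the initial condition $\mathbf{x}$ from $t$ to $T$ by the linear flow and the rescaled "backward" transport from $T$ to $t$ of the final point $\mathbf{y}$ by the original deterministic differential dynamics. We refer to Lemma 5.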
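where, for all $a > 0$, $t > 0$ and $\mathbf{y} \in \mathbb{R}^{nd}$, $g_{a,t}(\mathbf{y}) = t^{-n^2 d/2}\exp\big(-a^{-1}\,t\,|\mathbb{T}_t^{-1}\mathbf{y}|^2\big)$. For all $0 \le s < t \le T$, $\mathbf{z}, \mathbf{y} \in \mathbb{R}^{nd}$, define now the kernel $H$ as:
$$H(s,t,\mathbf{z},\mathbf{y}) := \big(L_s - \tilde L^{t,\mathbf{y}}_{s,\mathbf{z}}\big)\,\tilde p^{t,\mathbf{y}}(s,t,\mathbf{z},\mathbf{y}),$$
where $L$ is the generator of the initial diffusion (1.5) defined in (1.7), and $\tilde L^{t,\mathbf{y}}_{s,\mathbf{z}}$ and $\tilde p^{t,\mathbf{y}}(s,t,\mathbf{z},\cdot)$ respectively stand for the generator at time $s$ and the density at time $t$ of $\tilde{\mathbf{X}}^{t,\mathbf{y}}$, $\tilde{\mathbf{X}}^{t,\mathbf{y}}_s = \mathbf{z}$, with coefficients "frozen" w.r.t. $t, \mathbf{y}$. The subscript $\mathbf{z}$ emphasizes that $\mathbf{z}$ is the differentiation variable in the operator. We have the following control on the kernel (see Lemma 5.5 of [3] for a proof).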
We conclude this section with a technical lemma whose proof is postponed to Appendix A. Then $\Xi_{\varepsilon}(s,\mathbf{x})$ converges boundedly and pointwise to $h(s,\mathbf{x})$ when $\varepsilon \to 0$.
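The comparison between forward and backward transports stated in Lemma 2.1 can be checked by hand on a Kolmogorov-type example. The sketch below assumes the linear dynamics $\mathbf{F}(t,\mathbf{x}) = (0, x_1)$ with $n = 2$, $d = 1$, for which the linearized flow and the original flow coincide, and it takes the scaling matrix to be $\mathbb{T}_t = \mathrm{diag}(t, t^2)$. These choices, and the shorthand $u$, $v$, are assumptions made for illustration.

```latex
% Illustrative example (assumed): F(t,x) = (0, x_1), n = 2, d = 1, T_t = diag(t, t^2).
\begin{align*}
\theta_{t,T}(y) &= \big(y_1,\ y_2 - (T-t)\,y_1\big)
  && \text{(backward transport of } y \text{ from } T \text{ to } t\text{)}, \\
\tilde\theta_{T,t}(x) &= \big(x_1,\ x_2 + (T-t)\,x_1\big)
  && \text{(forward transport of } x \text{ from } t \text{ to } T\text{)}, \\
u := \mathbb{T}_{T-t}^{-1}\big(\tilde\theta_{T,t}(x) - y\big)
  &= \Big(\frac{x_1-y_1}{T-t},\ \frac{x_2-y_2}{(T-t)^2} + \frac{x_1}{T-t}\Big), \\
v := \mathbb{T}_{T-t}^{-1}\big(x - \theta_{t,T}(y)\big)
  &= \Big(\frac{x_1-y_1}{T-t},\ \frac{x_2-y_2}{(T-t)^2} + \frac{y_1}{T-t}\Big), \\
u - v &= (0,\ u_1) = (0,\ v_1), \qquad \tfrac12\,|v| \le |u| \le 2\,|v| .
\end{align*}
```

In this example the two rescaled quantities are therefore equivalent with constant 2, uniformly in $t < T$. With a nonlinear drift the two flows differ, and this is where the Lipschitz and Hölder assumptions on $\mathbf{F}$ come into play.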

Proof of Theorem 1.1
Suppose we are given two solutions $\mathbb{P}^1, \mathbb{P}^2$ of the martingale problem associated to $(L_t)_{t\in[s,T]}$ starting in $\mathbf{x}$ at time $s$. W.l.o.g. we can suppose here that $T \le 1$. Define, for a bounded Borel function $f : [0,T]\times\mathbb{R}^{nd} \to \mathbb{R}$,
$$S^i f := \mathbb{E}^i\Big[\int_s^T f(t,\mathbf{X}_t)\,dt\Big], \quad i \in [[1,2]],$$
where $(\mathbf{X}_t)_{t\in[s,T]}$ stands here for the canonical process associated to $(\mathbb{P}^i)_{i\in[[1,2]]}$. Let us specify (as indicated in [2]) that $S^i f$ is only a linear functional and not a function, since $\mathbb{P}^i$ does not need to come from a Markov process. Let us now introduce $S^{\Delta} f := S^1 f - S^2 f$ and $\Theta := \sup_{\|f\|_{\infty}\le 1}|S^{\Delta} f|$. If $f$ is smooth enough, then by definition of the martingale problem we have: For a fixed point $\mathbf{y} \in \mathbb{R}^{nd}$ and $\varepsilon \ge 0$, We insist that in the above equation $\tilde p^{t+\varepsilon,\mathbf{y}}(s,t,\mathbf{x},\mathbf{z})$ stands for the density at time $t$ and point $\mathbf{z}$ of the process $\tilde{\mathbf{X}}^{t+\varepsilon,\mathbf{y}}$ defined in (2.4), starting from $\mathbf{x}$ at time $s$, with coefficients depending on the backward transport of the freezing point $\mathbf{y}$ by $(\boldsymbol{\theta}_{u,t+\varepsilon})_{u\in[s,t]}$. In particular, the parameter $\varepsilon$ can be equal to 0 in the previous definition. The computation leading to (3.5) exploits the semigroup property of the frozen density $\tilde p^{t+\varepsilon,\mathbf{y}}$ for its last step. Write now, for all $(s,\mathbf{x}) \in [0,T)\times\mathbb{R}^{nd}$, Thus, from Lemma 2.3, we get the bound (3.6), using the bi-Lipschitz property of the flow for the last but one inequality and up to a modification of $C_{3.6}$ in the last one. Anyhow, the constant $C_{3.6}$ only depends on known parameters in (A-η).
Thus for $T$ and $\varepsilon$ sufficiently small we have from (3.6) Now, equation (3.1) and the above definition of $S^{\Delta}$ yield: From the bounded convergence part of Lemma 2.4 and (3.7), we have: By a monotone class argument, the previous inequality remains valid for bounded measurable functions $h$ compactly supported in $[0,T)\times\mathbb{R}^{nd}$. Taking the supremum over $\|h\|_{\infty} \le 1$, we obtain $\Theta \le \frac{1}{2}\Theta$, which gives $\Theta = 0$ since $\Theta < +\infty$. Hence, $\mathbb{P}^1$
The structure of the "partial gradient" $D\mathbf{F}^{s+\varepsilon,\mathbf{y}}$ associated to the $\eta$-Hölder continuity of the mapping up to a modification of $C_3$ and using the bi-Lipschitz property of the flow $\boldsymbol{\theta}$ for the last but one inequality (see the end of the proof of Proposition 5. where there exists a constant $\tilde C := \tilde C((\text{A-}\eta))$ independent of $\varepsilon$ s.t.
Plugging this estimate into (A.7), we then get from (A.6), using as well the good scaling property (A.2), that there exists $C_6 := C_6((\text{A-}\eta))$, up to a modification of $C_6$ in the last inequality. Let us consider now $\Xi^2_{\varepsilon}(s,\mathbf{x})$. Write: