On the Distances between Probability Density Functions

Electronic Journal of Probability

Abstract

We give estimates of the distance between the densities of the laws of two functionals $F$ and $G$ on the Wiener space in terms of the Malliavin-Sobolev norm of $F - G$. We actually consider a more general framework which allows one to treat, with similar (Malliavin type) methods, functionals of a Poisson point measure (solutions of jump type stochastic equations). We use these estimates to obtain a criterion which ensures that convergence in distribution implies convergence in total variation distance; in particular, if the functionals at hand are absolutely continuous, this implies convergence in $L^1$ of the densities.


Introduction
In this paper we give estimates of the distance between the densities of the laws of two functionals $F$ and $G$ on the Wiener space in terms of the Malliavin-Sobolev norm of $F - G$. Actually, we consider a slightly more general framework, defined in [5] or [6], which allows one to treat with similar methods functionals of a Poisson point measure (solutions of jump type stochastic equations). Such estimates may be used to study the behavior of a diffusion process in short time, as is done in [3]. Here we focus on a different application: we use the above estimates to obtain a criterion which guarantees that convergence in distribution implies convergence in total variation distance; in particular, if the functionals at hand are absolutely continuous, this implies convergence in $L^1$ of the densities. Moreover, by using some more general distances, we obtain the convergence of the derivatives of the density functions as well. The main estimates are given in Theorem 2.1 in the general framework and in Theorem 2.14 in the case of the Wiener space. The convergence result is given in Theorem 2.11 and, for the Wiener space, in Theorem 2.20.
The reader interested in the Wiener space case may go directly to Section 2.4. For functionals on the Wiener space we get one more result, which lies between the Bouleau-Hirsch absolute continuity criterion and the classical criterion of Malliavin for existence and regularity of the density of the law of a $d$ dimensional functional $F$: we prove that if $F \in \mathbb{D}^{2,p}$ with $p > d$ and $P(\det \sigma_F > 0) > 0$ ($\sigma_F$ denoting the Malliavin covariance matrix of $F$) then, conditionally to $\{\det \sigma_F > 0\}$, the law of $F$ is absolutely continuous and the density is lower semi-continuous. This regularity property implies that the law of $F$ is locally lower bounded by the Lebesgue measure, and this property turns out to be interesting; see the joint paper [4].
In recent years a number of results concerning the weak convergence of functionals on the Wiener space using Malliavin calculus and Stein's method have been obtained by Nourdin, Peccati, Nualart and Poly, see [16], [17] and [19]. In particular, in [16] and [19] the authors consider functionals living in a finite (and fixed) direct sum of chaoses and prove that, under a very weak non degeneracy condition, the convergence in distribution of a sequence of such functionals implies the convergence in total variation. Our initial motivation was to obtain similar results for general functionals: we consider a sequence of $d$ dimensional functionals $F_n$, $n \in \mathbb{N}$, which is bounded in $\mathbb{D}^{3,p}$ for every $p \ge 1$. Under a very weak non degeneracy condition (see (2.39)) we prove that the convergence in distribution of such a sequence implies the convergence in the total variation distance.
Moreover, we prove that if a sequence $F_n$, $n \in \mathbb{N}$, is bounded in every $\mathbb{D}^{3,p}$, $p \ge 1$, $\lim_n F_n = F$ in $L^2$ and $\det \sigma_F > 0$ a.s., then $\lim_n F_n = F$ in total variation. Recently, Malicet and Poly [15] have proved an alternative version of this result: if $\lim_n F_n = F$ in $\mathbb{D}^{1,2}$ and $\det \sigma_F > 0$ a.s., then the convergence takes place in the total variation distance.
The paper is organized as follows. In Section 2.1, following [5], we introduce an abstract framework which allows one to obtain integration by parts formulas. In Section 2.2 we give the main estimate (the distance between two density functions) in this framework and in Section 2.3 we obtain the convergence results. In Section 2.4 we come back to the Wiener space framework, so there the objects and the notations are the standard ones from Malliavin calculus (we refer to Nualart [18] for the general theory). Section 3 is devoted to the proof of the main estimate, that is, of Theorem 2.1. Finally, in Section 4 we illustrate our convergence criterion with an example of a jump type equation coming from [5].

Abstract integration by parts framework
In this section we briefly recall the construction of integration by parts formulas for functionals of a finite dimensional noise which mimic the infinite dimensional Malliavin calculus, as done in [5] and [6]. We are going to introduce operators that represent the finite dimensional variant of the derivative and the divergence operators from the classical Malliavin calculus. As an important consequence, all the constants which appear in the estimates do not depend on the dimension of the noise. So, given some constants $c_i \in \mathbb{R}$, $i = 1, \dots, m$, we denote by $\mathcal{C}(c_1, \dots, c_m)$ the family of universal constants which depend on $c_i$, $i = 1, \dots, m$, only. So $C \in \mathcal{C}(c_1, \dots, c_m)$ means that $C$ depends on $c_i$, $i = 1, \dots, m$, but on nothing else in the statement. This is crucial in the following theorems.
On a probability space $(\Omega, \mathcal{F}, P)$ we consider a random variable $V = (V_1, \dots, V_J)$ which represents the basic noise. Here $J \in \mathbb{N}$ is a deterministic integer. For each $i = 1, \dots, J$ we consider two constants $-\infty \le a_i < b_i \le \infty$ (so infinite endpoints are allowed). We denote $O_i = \{v = (v_1, \dots, v_J) : a_i < v_i < b_i\}$, $i = 1, \dots, J$. The basic hypothesis is that the law of $V$ is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^J$ and the density $p_J$ is smooth with respect to $v_i$ on the set $O_i$.
The natural example, which arises in the standard Malliavin calculus, is the Gaussian law on $\mathbb{R}^J$, in which case $a_i = -\infty$ and $b_i = +\infty$. But we may also (as an example) take $V_i$ to be independent random variables with exponential law, and then $a_i = 0$ and $b_i = \infty$.
In order to obtain integration by parts formulas for functionals of $V$, one performs classical integration by parts with respect to $p_J(v)\,dv$. In order to cancel the boundary terms at $a_i$ and $b_i$, it suffices to take into account suitable "weights" $\pi_i$, $i = 1, \dots, J$. We give the precise statement of the hypothesis below, but let us first set up the notation we are going to use. We denote by $C^k(\mathbb{R}^d)$ the space of functions which are continuously differentiable up to order $k$, and by $C^\infty(\mathbb{R}^d)$ the space of functions which are infinitely differentiable. We use the subscripts $p$, resp. $b$, to denote functions having polynomial growth, resp. bounded, together with their derivatives, and this gives $C^k_p(\mathbb{R}^d)$, $C^k_b(\mathbb{R}^d)$, $C^\infty_p(\mathbb{R}^d)$ and $C^\infty_b(\mathbb{R}^d)$. Throughout this paper, we assume that the following assumption holds.
Assumption. The law of the vector $V = (V_1, \dots, V_J)$ is absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^J$ and we denote by $p_J$ the density; we assume that $p_J$ has polynomial growth. We also assume that

We define now the functional spaces and the differential operators.

Simple functionals.
A random variable $F$ is called a simple functional if there exists $f \in C^\infty_p(\mathbb{R}^J)$ such that $F = f(V)$. We denote by $\mathcal{S}$ the set of simple functionals.

Simple processes.
A simple process is a random variable $U = (U_1, \dots, U_J)$ in $\mathbb{R}^J$ such that $U_i \in \mathcal{S}$ for each $i \in \{1, \dots, J\}$. We denote by $\mathcal{P}$ the space of the simple processes. On $\mathcal{P}$ we define the scalar product

The derivative operator.
We define $D : \mathcal{S} \to \mathcal{P}$ by

The divergence operator.
Let $U = (U_1, \dots, U_J) \in \mathcal{P}$, so that $U_i \in \mathcal{S}$ and $U_i = u_i(V)$, for some $u_i \in C^\infty_p(\mathbb{R}^J)$, $i = 1, \dots, J$. We define $\delta : \mathcal{P} \to \mathcal{S}$ by

Clearly, both $D$ and $\delta$ depend on $\pi$, so a correct notation would be $D^\pi$ and $\delta^\pi$. Since here the weights $\pi_i$ are fixed, we do not mention them in the notation.

The Malliavin covariance matrix.
For $F \in \mathcal{S}^d$, the Malliavin covariance matrix of $F$ is defined by

We also denote

The Ornstein Uhlenbeck operator.
We define $L : \mathcal{S} \to \mathcal{S}$ by $L(F) = \delta(DF)$. (2.4)

Higher order derivatives and norms.
Let $\alpha = (\alpha_1, \dots, \alpha_k)$ be a multi-index, with $\alpha_i \in \{1, \dots, J\}$ for $i = 1, \dots, k$, and $|\alpha| = k$. For $F \in \mathcal{S}$, we define recursively

We set $D^{(0)} F = F$ and we notice that $D^{(1)} F = DF$. Moreover, we introduce the following norms for simple functionals: for $F \in \mathcal{S}$ we set

and for $F = (F^1, \dots, F^d) \in \mathcal{S}^d$

Finally, for $U = (U_1, \dots, U_J) \in \mathcal{P}$, we set $D^{(k)} U = (D^{(k)} U_1, \dots, D^{(k)} U_J)$ and we define the norm of $D^{(k)} U$ as

We allow the case $k = 0$, giving $|U| = \langle U, U \rangle^{1/2}$.

Since $|F|_0 = |F|$, one has $\|F\|_{0,p,\Theta} = \|F\|_{p,\Theta}$. In the case $\Theta = 1$ we come back to the standard notation: (2.9)

Notice also that since $\Theta \le 1$ we have $\|F\|_{1,l,p,\Theta} \le \|F\|_{1,l,p}$ and $\|F\|_{l,p,\Theta} \le \|F\|_{l,p}$. (2.10)

For $p \in \mathbb{N}$ we set $m_{q,p}(\Theta) := 1 \vee \|\ln \Theta\|_{1,q,p,\Theta}$. (2.11)

Since $\Theta > 0$ almost surely with respect to $P_\Theta$, the above quantity makes sense.
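As a sanity check on the duality between the derivative and the divergence operators, the sketch below treats the simplest instance of the framework: a one dimensional standard Gaussian noise with weight identically equal to 1 (an assumption made here for illustration; general weights and densities are not covered). In that case the derivative of $F = f(V)$ is $f'(V)$, the divergence of $U = u(V)$ is $V u(V) - u'(V)$, and the duality reads $E[f'(V) u(V)] = E[f(V)(V u(V) - u'(V))]$. The helper `gauss_expect`, the test functions and the quadrature grid are hypothetical choices.

```python
import math

def gauss_expect(g, lo=-10.0, hi=10.0, n=20000):
    """Approximate E[g(V)] for V ~ N(0, 1) with the trapezoidal rule."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        v = lo + k * h
        w = 0.5 if k in (0, n) else 1.0  # trapezoid end-point weights
        total += w * g(v) * math.exp(-0.5 * v * v)
    return total * h / math.sqrt(2.0 * math.pi)

# Illustrative simple functional F = f(V) and simple process U = u(V).
f = lambda v: v ** 2
df = lambda v: 2.0 * v
u = lambda v: v
du = lambda v: 1.0

# Left side: E[<DF, U>] with DF = f'(V) (Gaussian case, weight 1).
lhs = gauss_expect(lambda v: df(v) * u(v))
# Right side: E[F * delta(U)] with delta(U) = V u(V) - u'(V).
rhs = gauss_expect(lambda v: f(v) * (v * u(v) - du(v)))

print(lhs, rhs)
```

For this choice both sides equal $E[2V^2] = E[V^4] - E[V^2] = 2$, so the two printed numbers should agree up to quadrature error.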
We will work with localization random variables of the following specific form. For $a > 0$, set $\psi_a, \phi_a : \mathbb{R} \to \mathbb{R}_+$ as follows:

The function $\psi_a$ is suited to localize around zero and $\phi_a$ is suited to localize far from zero. Then $\psi_a, \phi_a \in C^\infty_b(\mathbb{R})$, $0 \le \psi_a \le 1$, $0 \le \phi_a \le 1$, and we have the following property: for every $p, k \in \mathbb{N}$ there exists a universal constant $C_{k,p}$ such that for every

We consider now $\Theta_i \in \mathcal{S}$ and $a_i > 0$, $i = 1, \dots, l + l'$, and define
$$\Theta = \prod_{i=1}^{l} \psi_{a_i}(\Theta_i) \times \prod_{i=l+1}^{l+l'} \phi_{a_i}(\Theta_i). \tag{2.14}$$
As an easy consequence of (2.13) we obtain (2.15) (2.16)

Moreover, given some $q \in \mathbb{N}$, $p \ge 1$, we denote
$$U_{q,p,\Theta}(F) := \max\{1, E_\Theta((\det \sigma_F)^{-p})(\|F\|_{1,q+2,p,\Theta} + \|LF\|_{q,p,\Theta})\}. \tag{2.17}$$
In the case $\Theta = 1$ we have $m_{q,p}(\Theta) = 1$ and
$$U_{q,p}(F) := \max\{1, E((\det \sigma_F)^{-p})(\|F\|_{1,q+2,p} + \|LF\|_{q,p})\}. \tag{2.18}$$
Notice that $U_{q,p,\Theta}(F)$ and $U_{q,p}(F)$ do not involve the $L^p$ norm of $F$ but only of its derivatives and of $LF$.
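To make the role of the two localizing functions concrete, here is a minimal sketch of a pair of smooth cutoffs with the same qualitative behavior: one equal to 1 near zero, one equal to 1 far from zero. It does not reproduce the functions of (2.12); it uses a standard smooth transition built from $\exp(-1/t)$ as a stand-in, and the names `psi`, `phi` and `smooth_step` are hypothetical. No claim is made that this stand-in satisfies the logarithmic derivative bound (2.13).

```python
import math

def _h(t):
    """exp(-1/t) for t > 0 and 0 otherwise; smooth on the whole real line."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_step(t):
    """Smooth non-decreasing step: 0 for t <= 0, 1 for t >= 1."""
    return _h(t) / (_h(t) + _h(1.0 - t))

def psi(a, x):
    """Stand-in cutoff around zero: 1 on |x| <= a, 0 on |x| >= 2a."""
    return smooth_step(2.0 - abs(x) / a)

def phi(a, x):
    """Stand-in cutoff far from zero: 0 on |x| <= a/2, 1 on |x| >= a."""
    return smooth_step(2.0 * abs(x) / a - 1.0)

print(psi(1.0, 0.5), psi(1.0, 1.5), psi(1.0, 2.5))
print(phi(1.0, 0.25), phi(1.0, 0.75), phi(1.0, 1.5))
```

With this stand-in, a localization of the form `phi(eps, x)` applied to $x = \det \sigma_F$ is non zero only where $|\det \sigma_F| > \varepsilon/2$, which is the kind of lower bound on the determinant that the proofs exploit.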
We are now able to state the main result of our paper.

Theorem 2.1. A. Let $F \in \mathcal{S}^d$ be such that $U_{q,p,\Theta}(F) < \infty$ for every $p \in \mathbb{N}$. Then under $P_\Theta$ the law of $F$ is absolutely continuous with respect to the Lebesgue measure. We denote by $p_{F,\Theta}$ its density and we have $p_{F,\Theta} \in C^{q-1}(\mathbb{R}^d)$. Moreover there exist $C, a, b, p \in \mathcal{C}(q, d)$ such that for every $y \in \mathbb{R}^d$ and every multi index $\alpha = (\alpha_1, \dots, \alpha_k) \in \{1, \dots, d\}^k$, $k \in \{0, \dots, q\}$, one has (2.20)

Remark 2.2. The above result can be written in the case $\Theta = 1$. Here, $m_{q+2,p}(\Theta) = 1$ and the quantities $\|F - G\|_{q+2,p,\Theta}$ and $\|LF - LG\|_{q,p,\Theta}$ are replaced by $\|F - G\|_{q+2,p}$ and $\|LF - LG\|_{q,p}$ respectively.
and similarly for $G$. Then, the second factors in formulas (2.19) and (2.20) may be written in terms of the above inequality as follows: for every $\ell \ge 1$ and $y \in \mathbb{R}^d$,

and

The proof of Theorem 2.1 is the main effort in our paper and it is postponed to Section 3 (see Proposition 3.8 C. and Theorem 3.10).
As a consequence of Theorem 2.1 we obtain the following regularization result. Let $\gamma_\delta$ be the density of the centred normal law of covariance $\delta \times I$ on $\mathbb{R}^d$. Here $\delta > 0$ and $I$ is the identity matrix.

Lemma 2.5.
There exist universal constants $C, p, a \in \mathcal{C}(d)$ such that for every $\varepsilon > 0$, $\delta > 0$ and every $F \in \mathcal{S}^d$ one has for every bounded and measurable $f$:

Notice that in the r.h.s. of (2.24) $\|F\|_{3,p}$ is replaced by $\|F\|_{1,3,p}$, so $\|F\|_p$ is not involved. The price to be paid is that we have to replace $\|f\|_\infty$ with $\|f\|_\infty + \|f\|_1$.
Proof. Throughout this proof $C$ denotes a constant in $\mathcal{C}(d)$ which may change from line to line. We construct the localization random variable $\Theta_\varepsilon = \phi_\varepsilon(\det \sigma_F)$ with $\phi_\varepsilon$ given in (2.12). By (2.15) (2.25)

We fix $\delta \in (0, 1)$ and we define $F_\delta = F + \sqrt{\delta}\,\Delta$, where $\Delta$ is a standard Gaussian random variable independent of $V$. We will use the result in Theorem 2.1, here not with respect to $V = (V_1, \dots, V_J)$ but with respect to $(V, \Delta) = (V_1, \dots, V_J, \Delta)$. The Malliavin covariance matrix of $F$ with respect to $(V, \Delta)$ is the same as the one with respect to $V$ (because $F$ does not depend on $\Delta$), so on the set $\{\Theta_\varepsilon \ne 0\}$ we have $\det \sigma_F \ge \varepsilon$. We denote by $\sigma_{F_\delta}$ the Malliavin covariance matrix of $F_\delta$ computed with respect to $(V, \Delta)$. We have $\langle \sigma_{F_\delta} \xi, \xi \rangle = \delta |\xi|^2 + \langle \sigma_F \xi, \xi \rangle$. By Lemma 7-29, pg 92 in [9], for every symmetric non negative definite matrix $Q$ one has

where $C_1$ and $C_2$ are universal constants. Using these two inequalities we obtain $\det \sigma$

It is also easy to check that

By using (2.25), we apply Theorem 2.1 and we obtain

The r.h.s. of the above inequality does not depend on $y$, so its integral over $\mathbb{R}^d$ is infinite.
In order to obtain a finite integral we use inequality (2.22) discussed in Remark 2.4 with $\ell$ large enough: we may find $C, p, a, b$

But now $\|F\|_p$ comes into play, and this is why we have to replace $\|F\|_{1,3,p}$ by $\|F\|_{3,p}$.
Moreover one can easily check, using directly the definitions, that (2.28)

We are now ready to start the proof of our Lemma. We take

We have

We use (2.28) in order to obtain

Using (2.26) we obtain (2.24).
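The smoothing step of the proof replaces $F$ by $F_\delta = F + \sqrt{\delta}\,\Delta$, and since $\gamma_\delta$ is symmetric one has $E[(f * \gamma_\delta)(F)] = E[f(F_\delta)]$. As an illustrative special case (an assumption made here, not part of the lemma), take $d = 1$, $F$ standard Gaussian and $f = 1_{(-\infty, c]}$: then $F_\delta \sim N(0, 1 + \delta)$ and the regularization error $|E f(F) - E (f * \gamma_\delta)(F)|$ is explicit. The function names below are hypothetical.

```python
import math

def std_normal_cdf(x):
    """Distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def regularization_error(c, delta):
    """|E f(F) - E (f * gamma_delta)(F)| for F ~ N(0, 1), f = 1_{(-inf, c]}."""
    exact = std_normal_cdf(c)  # E f(F) = P(F <= c)
    # F + sqrt(delta) * Delta ~ N(0, 1 + delta), so P(F_delta <= c) is explicit.
    smoothed = std_normal_cdf(c / math.sqrt(1.0 + delta))
    return abs(exact - smoothed)

errors = [regularization_error(1.0, d) for d in (1.0, 0.1, 0.01, 0.001)]
print(errors)
```

In this non degenerate example the error decreases to zero with $\delta$; the lemma addresses the general case, where the localization by $\det \sigma_F$ is needed to control the region in which $F$ may degenerate.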
In the one-dimensional case, the requirements in Lemma 2.5 can be weakened: only the Malliavin derivatives up to order 2 are required. Moreover, precise estimates can be given. In fact, one has:

Lemma 2.6. Let $d = 1$. There exists a universal constant $C > 0$ such that for every $\varepsilon > 0$, $\delta > 0$ and every $F \in \mathcal{S}$ one has

for every bounded and measurable function $f : \mathbb{R} \to \mathbb{R}$.
Proof. The statement can be proved in several ways; we propose here a short proof that makes use of the integration by parts formulas and weights developed in Section 3.1 below.
We use notations as in the proof of Lemma 2.5. So, we take $f$ with $\|f\|_\infty < \infty$ and we write

EJP 19 (2014), paper 110.
The term $I(\delta, \varepsilon)$ is handled as before, so

Now, by using the (localized) integration by parts formula in Proposition 3.5, we have

where

The estimate of the last expectation is developed for general values of $d$ and general localizations $\Theta$ in Section 3.1. But for $d = 1$ and $\Theta = \Theta_\varepsilon$, very precise estimates can be given. In fact, since $D\gamma_F = -\sigma_F^{-2} D\sigma_F$ and since on the set $\{\Theta_\varepsilon \ne 0\}$ one has $\sigma_F > 2/\varepsilon$,

Now, by using (2.13) we have

Therefore,

and the statement holds. We finally note that in dimension $d > 1$ a similar reasoning would lead one to take primitives of $f - f * \gamma_\delta$ in all the directions of the space, and in the end one has to do $d$ integrations by parts in order to remove the derivatives; this needs Malliavin derivatives of $F$ up to order $d + 1$. The use of the Riesz transform can actually overcome this difficulty.

Distances and basic estimate
In this section we discuss the convergence in the total variation distance defined by

The convergence in this distance is related to the convergence of the densities of the laws: given a sequence of random variables

We also consider the Fortet-Mourier distance defined by

and the Wasserstein distance

The convergence in $d_W$ is equivalent to the convergence in distribution plus the convergence of the first order moments. Clearly $d_{FM}(F, G) \le d_W(F, G)$, so convergence in distribution plus the convergence of the first order moments implies convergence in $d_{FM}$. The aim of this section is to prove a kind of converse inequality.
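The link between total variation and the $L^1$ distance of densities can be checked on a toy example. The sketch assumes the normalization in which the total variation distance is a supremum over test functions bounded by 1, so that for absolutely continuous laws it equals $\int |p_F - p_G|$ (with the supremum over Borel sets instead, one gets half of this). For $N(0,1)$ and $N(\mu,1)$ the $L^1$ distance has the closed form $2(2\Phi(|\mu|/2) - 1)$, where $\Phi$ is the standard normal distribution function. The function names and the integration grid are hypothetical.

```python
import math

def normal_pdf(x, mean):
    """Density of N(mean, 1)."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)

def l1_distance(mean, lo=-12.0, hi=12.0, n=48000):
    """Midpoint-rule approximation of the L1 distance between N(0,1) and N(mean,1)."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += abs(normal_pdf(x, 0.0) - normal_pdf(x, mean))
    return total * h

def l1_closed_form(mean):
    """2 * (2 * Phi(|mean| / 2) - 1), rewritten with the error function."""
    return 2.0 * math.erf(abs(mean) / (2.0 * math.sqrt(2.0)))

for mean in (1.0, 0.1, 0.01):
    print(mean, l1_distance(mean), l1_closed_form(mean))
```

As the shift tends to zero both the distance in distribution and the $L^1$ distance of the densities vanish; the results of this section concern situations where the second fact has to be deduced from the first.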
We will be interested in a larger class of distances that we define now.
Then we define We optimize over δ: we take We insert this in the previous inequality and we obtain (2.33). (2.34) Then under the hypotheses of Theorem 2.7 one has Proof.It suffices to apply Theorem 2.7 and then to optimize w.r.t.ε > 0.
Remark 2.9.When d = 1, in the proof of Theorem 2.7 and Corollary 2.8 we can use the precise estimate in Lemma 2.6.Therefore, in the one-dimensional case we set and Theorem 2.7 becomes: for k ∈ N, there exists a universal constant C > 0 such that for every F, G ∈ S and every ε > 0 one has Similarly, Corollary 2.8 can be rephrased as follows: under (2.34), for k ∈ N, there exists a universal constant C > 0 such that for every F, G ∈ S one has (2.36)

Convergence results
In the previous sections we considered a functional $F \in \mathcal{S}^d$ with $\mathcal{S}$ associated to a certain random variable $V = (V_1, \dots, V_J)$. So $F = f(V)$. But the estimates that we have obtained are estimates of the law, and so it is not necessary that the random variables at hand are functionals of the same $V$. We may have $F = f(V)$ and $\bar F = \bar f(\bar V)$ with $\bar V = (\bar V_1, \dots, \bar V_{\bar J})$. Having this in mind, for a fixed random variable $V = (V_1, \dots, V_J)$ we denote by $\mathcal{S}(V)$ the space of the simple functionals associated to $V$. We denote by $\sigma_F(V)$ the Malliavin covariance matrix and (2.37)

Here the norms $\|F\|_{q,l}$ and the operator $LF$ are defined as in (2.7) and (2.4) with respect to $V$.
In the following we will work with a sequence (F n ) n∈N of d dimensional functionals ).We will use the following two assumptions.First, we consider a regularity assumption: (2.38) The second one is a (very weak) non degeneracy hypothesis: where λ(F n ) is the smaller eigenvalue of σ Fn (V (n) ).
Proof.The statement is trivial for d = 1, so we consider the case d We conclude that η(ε) ≤ ε Theorem 2.11.We consider a sequence of functionals and we assume that (2.38) and (2.39) hold.Suppose also that lim n (2.41) In particular if the laws of F and F n are absolutely continuous with density p F and p Fn respectively, then Since lim n F n = F in law, one has that lim sup n,m→∞ d 1 (F n , F m ) = 0, so that lim sup n,m→∞ This is true for every ε > 0. So using (2.39) we obtain lim sup n,m→∞ d 0 (F n , F m ) = 0.
As a consequence, we can state a similar result under a stronger non degeneracy hypothesis: ) be a sequence of functionals such that (2.38) holds and such that there exists α > 0 with Proof.Since (2.42) implies (2.39), the statement follows by applying Theorem 2.11.We also note that, by applying Corollary 2.8 with k = 1, we can also state that Note that (2.44) gives an estimate of the total variation distance in terms of the Fortet-Mourier distance starting from the non-degeneracy condition (2.43).A similar non-degeneracy request has been already discussed in [17], where the convergence in total variation is studied for sequences in a finite sum of Wiener chaoses that converge in distribution.We also note that (2.44) can be generalized to functionals that are not necessarily simple ones by passing to the limit (for this, it is crucial that constants are "universal", that is independent of the functionals).

Functionals on the Wiener space
Let $(\Omega, \mathcal{F}, P)$ be a probability space where a Brownian motion $W = (W^1, \dots, W^N)$ is defined. We briefly recall the main notations in Malliavin calculus, for which we refer to Nualart [18]. We denote by $\mathbb{D}^{m,p}$ the space of the random variables which are $m$ times differentiable in Malliavin sense in $L^p$, and for a multi-index $\alpha = (\alpha_1, \dots, \alpha_k) \in \{1, \dots, N\}^k$, $k \le m$, we denote by $D^\alpha F$ the Malliavin derivative of $F$ corresponding to the multi-index $\alpha$. Moreover we define

If $\sigma_F$ is invertible, we denote by $\gamma_F$ the inverse matrix. Finally, as usual, the notation $L$ will be used for the Ornstein-Uhlenbeck operator, and we recall that the Meyer inequality asserts that $\|LF\|_{m,p} \le C_{m,p} \|F\|_{m+2,p}$ for $F \in (\mathbb{D}^{m+2,\infty})^d$.
Our aim is to rephrase the results from the previous sections in the framework of the Wiener space considered here.We introduce first the localization random variable Θ.
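We consider some random variables $\Theta_i$ and some numbers $a_i > 0$, $i = 1, \dots, l + l'$, and we define
$$\Theta = \prod_{i=1}^{l} \psi_{a_i}(\Theta_i) \times \prod_{i=l+1}^{l+l'} \phi_{a_i}(\Theta_i).$$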
A. Let F ∈ (D q+2,∞ ) d be such that U q,p,Θ (F ) < ∞.Then under P Θ the law of F is absolutely continuous with respect to the Lebesgue measure.We denote by p F,Θ its density and we have p F,Θ ∈ C q−1 (R d ).Moreover there exist C, a, b, p ∈ C(q, d) such that for every y ∈ R d and every multi index α = (α 1 , ..., α k ) ∈ {1, ..., d} k , k ∈ {0, ..., q} one has (2.52) ) d be such that U q+1,p,Θ (F ), U q+1,p,Θ (G) < ∞ for every p ∈ N and let p F,Θ and p G,Θ be the densities of the laws of F respectively of G under P Θ .Then there exist C, a, b, p ∈ C(q, d) such that for every y ∈ R d and every multi index Proof of Theorem 2.14.One may prove Theorem 2.14 just by repeating exactly the same reasoning as in the proof of Theorem 2.1: all the arguments are based on the properties of the norms from the finite dimensional calculus and these properties are preserved in the infinite dimensional case.However we give here a different proof: we obtain Theorem 2.14 from Theorem 2.1 by using a convergence argument.
We fix $n \ge 1$. For $k = 1, \dots, 2^n$ we denote

So, by taking $\Delta^k_n$, $k = 1, \dots, 2^n$, as the underlying noise $V_1, \dots, V_J$ and by taking the weights $\pi_{k,i} = 2^{-n/2}$, $k = 1, \dots, 2^n$ and $i = 1, \dots, N$, it is easy to see that the finite dimensional Malliavin calculus in Section 2.1 and the standard Malliavin calculus coincide for simple functionals (see e.g. [1] for details). So, we set

and we take $F_n, G_n, \Theta_{n,i} \in \mathcal{S}_n$, $n \in \mathbb{N}$, $i = 1, \dots, l + l'$, which approximate $F, G, \Theta_i \in \mathbb{D}^{q+2,\infty}$, $i = 1, \dots, l + l'$. We use Theorem 2.1 for them and then we pass to the limit in order to obtain the conclusion in Theorem 2.14. The fact that the constants which appear in Theorem 2.1 belong to $\mathcal{C}(q, d)$, and so do not depend on $n \in \mathbb{N}$, plays here a crucial role.
We give now a regularity property which is an easy consequence of the above theorem.

Theorem 2.16. A. Let $F \in \mathbb{D}^{2,p}$, $p > d$, be such that $P(\det \sigma_F > 0) > 0$. Then, conditionally to $\{\det \sigma_F > 0\}$, the law of $F$ is absolutely continuous with respect to the Lebesgue measure and the density is lower semi-continuous.

B.
In particular the law of F is locally lower bounded by the Lebesgue measure λ in the following sense: there exist δ > 0 and an open set D ⊂ R d such that for every Borel set A P(F ∈ A) ≥ δλ(A ∩ D).
Remark 2.17. The celebrated theorem of Bouleau and Hirsch [10] says that if $F \in \mathbb{D}^{1,2}$ then, conditionally to $\{\det \sigma_F > 0\}$, the law of $F$ is absolutely continuous. So it requires much less regularity than our result. The new fact is that the conditional density is lower semi-continuous and, in particular, is locally lower bounded by the Lebesgue measure. This last property turns out to be especially interesting; see the joint paper [4].
Proof of Theorem 2.16.For ε > 0 we consider the localization function ψ ε defined in (2.12) and we denote Θ ε = ψ ε (det σ F ).By Theorem 2.14 we know that under P Θε the law of F is absolutely continuous and has a continuous density p Θε .Let A be a Borel set with λ(A) = 0 where λ is the Lebesgue measure.Since Θ ε ↑ Θ := 1 {det σ F >0} we have So we may find p Θ such that so that p Θ ≥ p Θε a.e.This implies that p Θ ≥ sup ε>0 p Θε .We claim that p Θ = sup ε>0 p Θε which gives that p Θ is lower semi-continuous.In fact, set A = {x : p Θ (x) > sup ε>0 p Θε (x)}.
The assertion B is immediate: since p Θ = sup ε>0 p Θε is not identically null we may find ε > 0 and x 0 ∈ R d such that p Θε (x 0 ) > 0. And since p Θε is a continuous function we may find r, δ > 0 such that p Θε (x) ≥ δ for x ∈ B r (x 0 ).It follows that We rephrase now other consequences of Theorem 2.14.We begin with the regularization Lemma 2.5.We recall that γ δ is the centred Gaussian density with variance δ > 0.
Lemma 2.18. There exist universal constants $C, p, a \in \mathcal{C}(d)$ such that for every $\varepsilon > 0$, $\delta > 0$ and every $F \in (\mathbb{D}^{3,\infty})^d$ one has

Proof. The proof is identical to the one of Lemma 2.5, so we skip it (an approximation procedure may also be used). We mention that, due to Meyer's inequalities, $\|LF\|_{1,p}$ no longer appears here.
We consider now the distances d m defined in (2.31) and we rewrite Theorem 2.7: Proof.The proof is identical with the one of Theorem 2.7 so we skip it.
We give now the convergence results.Theorem 2.20.We consider a sequence of functionals (2.56) In particular if the laws of F and F n are absolutely continuous with densities p F and p Fn then Proof.The proof is identical with the one of Theorem 2.11 so we skip it.
In the framework of Wiener functionals we are able to obtain one more result:

Proof. We will prove that $\lim_n \langle DF_n^i, DF_n^j \rangle = \langle DF^i, DF^j \rangle$ in probability for every $i, j = 1, \dots, d$. This implies $\lim_n \det \sigma_{F_n} = \det \sigma_F$ in probability, so that $\limsup_n P(\det \sigma_{F_n} < \varepsilon) \le P(\det \sigma_F < 2\varepsilon)$. And since $\lim_{\varepsilon \to 0} P(\det \sigma_F > \varepsilon) = 1$, we obtain (2.56) ii), and this enables us to conclude by applying Theorem 2.20.
We denote by f k , k ∈ N, respectively by f k,n , k ∈ N, the kernels of the chaos expansion of F , respectively of F n .So we have where I k denotes the multiple integral of order k.For N ∈ N we write Since the sequence |DF n | , n ∈ N is bounded in L 1 our conclusion follows as soon as we check that lim n |D(F − F n )| = 0 in probability.We fix ε > 0 and we write Using Chebyshev's inequality .
and a similar inequality holds for $E|DR_{N,n}|^2$. We conclude that

Since this is true for each $N$, the above limit is null.

Remark 2.22. As an immediate consequence of Corollary 2.21 one may obtain the following result. Let $X_t$ be a diffusion process with coefficients in $C^\infty_b$ and suppose that the weak Hörmander condition holds at $x = X_0$. Consider also the Euler scheme $X^n_t$ of step $\frac{1}{n}$. Then for every $q \in \mathbb{N}$ one has $d_{-q}(\mu_{X_t}, \mu_{X^n_t}) \to 0$. This type of result has already been obtained in [7], [8] and [12]: there, under more restrictive assumptions (uniform Hörmander condition), one obtains the above result and, moreover, an expansion of the error in Taylor series.
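For readers who want to experiment with the Euler scheme mentioned in Remark 2.22, here is a minimal sketch for a scalar equation $dX_t = b(X_t)\,dt + s(X_t)\,dW_t$ with step $1/n$. The function name `euler_scheme`, the scalar setting and the example coefficients are assumptions made for illustration; the sketch only produces samples of $X^n_t$ and does not estimate the distances $d_{-q}$.

```python
import math
import random

def euler_scheme(x0, drift, diffusion, t, n, rng):
    """Simulate X^n_t by the Euler scheme with step 1/n for dX = b(X) dt + s(X) dW."""
    steps = int(round(t * n))
    h = 1.0 / n
    x = x0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(h))  # Brownian increment over one step
        x = x + drift(x) * h + diffusion(x) * dw
    return x

# Hypothetical smooth bounded coefficients with a non vanishing diffusion term.
b = lambda x: math.sin(x)
s = lambda x: 1.0 + 0.5 * math.cos(x)

rng = random.Random(12345)
samples = [euler_scheme(0.0, b, s, 1.0, 50, rng) for _ in range(1000)]
print(sum(samples) / len(samples))
```

With a vanishing diffusion coefficient the scheme reduces to the explicit Euler method for an ordinary differential equation, which gives a simple deterministic check of the implementation.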

Proof of Theorem 2.1
This section is devoted to the proof of Theorem 2.1. We are in the framework defined in Section 2.1 and we use all the notations introduced there. In the following subsection we recall and develop some basic results concerning integration by parts formulas from [5].

Integration by parts formulae
By using standard integration by parts formulas, one gets the duality between δ and D and the standard computation rules (see [5], Proposition 1 and Lemma 1): under our assumption, for every F ∈ S d , U ∈ P and φ : R with the convention that d = 1 in (3.1) and (3.3).Once the above equalities are done, the integration by parts formulas can be stated (see [5], Theorem 1 and 2): γ F denoting the inverse of the Malliavin covariance matrix σ F .Then, for every G ∈ S and for every smooth function φ ) Moreover, for every q ∈ N * and multi-index β = (β 1 , . . ., β q ) ∈ {1, . . ., d} q then where the weights H q β (F, G) are defined recursively by (3.7) if q = 1 and for q > 1, (3.9) EJP 19 (2014), paper 110.

Estimates of the weights
In this section we give estimates of the weights H q α (F, G) appearing in the integration by parts formulae of Theorem 3.1 using the norms introduced in (2.8).We first deal with useful estimates for the inverse of the Malliavin covariance matrix.For F ∈ S d , we set m F = max 1, 1 det σ F . (3.10) 1,l+1 ). (3.11) (3.12) Proof.A is proved in [5], Proposition 2. As for B, we use the following estimates proved in [5] (see Lemma 2 and the proof of Proposition 2): in which σ F denotes the algebraic complement of σ F .Then, by using also (3.15) Since γ r,r (F ) = (det σ F ) −1 σ r,r (F ), (3.12) follows by using the above estimates.
We define now

Using (3.12) one can easily check that for every $l \in \mathbb{N}$
And by using both (3.12) and (3.18), one immediately gets where For F ∈ S d , we define the linear operator T r (F, •) : S → S, r = 1, ..., d by Moreover, for a multi-index β = (β 1 , .., β q ) we denote |β| = q and we define by induction For l ∈ N and F, F ∈ S d , we denote We notice that Θ l (F ) Then for every l ∈ N and for every multi index β with |β| = q ≥ 1 one has and where C ∈ C(l, d, q).
We can now establish estimates for the weights H q .For l ≥ 1 and F, F ∈ S d , we set (with the convention EJP 19 (2014), paper 110.
Theorem 3.4.A. For l ∈ N and q ∈ N * there exists C ∈ C(l, d, q) such that for every F ∈ S d , G ∈ S and for every multi-index β = (β 1 , .., β q ) A l+q (F ) being defined in (3.26).

B.
There exists C ∈ C(l, d, q) such that for every F, F ∈ S d , G, G ∈ S and every multi-index β = (β 1 , .., β q ) By using (3.19) and (3.24), we can write So, the statement holds for q = 1.And for q > 1 it follows by iteration and by using the fact that A l+1 (F ) ≤ A l+q (F ).

Localized representation formulas for the density
In this section we discuss localized integration by parts formulas with the localization random variable Θ defined in (2.14).We will use the norms F p,Θ and F 1,l,p,Θ , F l,p,Θ defined in (2.8).We also recall that m q,p (Θ) is defined in (2.11) and that an estimate of this quantity is given in (2.15).
We give now the integration by parts formula with respect to P Θ (that is, locally) and we study the regularity of the law starting from the results in [2].
Once and for all, in addition to $m_{q,p}(\Theta)$ we define the following quantities: for $p \ge 1$,

with the convention $S_{F,\Theta}(p) = +\infty$ if the r.h.s. is not finite.

Proposition 3.5. Let $\kappa \in \mathbb{N}^*$ and assume that $m_{\kappa,p}(\Theta) < \infty$ for all $p \ge 1$. Let $F \in \mathcal{S}^d$ be such that $S_{F,\Theta}(p) < \infty$ for every $p \in \mathbb{N}$. Let $\gamma_F$ be the inverse of $\sigma_F$ on the set $\{\Theta \ne 0\}$. Then the following localized integration by parts formula holds: for every $f \in C^\infty_b(\mathbb{R}^d)$, $G \in \mathcal{S}$ and for every multi index $\alpha$ of length equal to $q \le \kappa$ one has

where

as $r = 1, \dots, d$ and for a general multi index $\beta$ with $|\beta| = q$
$$H^q_{\beta,\Theta}(F, G) = H_{\beta_q,\Theta}\big(F, H^{q-1}_{(\beta_1, \dots, \beta_{q-1}),\Theta}(F, G)\big).$$
Proof.For |β| = 1, the integration by parts formula immediately follows from the equality , and this gives the formula for H i,Θ (F, G).For higher order integration by parts it suffices to iterate this procedure.
We give now estimates for the weights in the integration by parts formula.Proposition 3.6.Let κ ∈ N * and l ∈ N be such that m l+κ+1,p (Θ) < ∞ for all p ≥ 1.Let F, F ∈ S d , with S F,Θ (p), S F ,Θ (p) < ∞ for every p, and G, G ∈ S. For q ≤ κ, let H q β,Θ (•, •) be the weight of the integration by parts formula as in Proposition 3.5.Then for every p ≥ 1 one may find two universal constants C, p ∈ C(κ, d) such that for every multi index and Proof.By using the same arguments as in Theorem 3.4, one gets that there exists C ∈ C(q, d) such that for every multi index β of length q then where A l (F ) and A l (F, F ) are defined in (3.26)In next Lemma we study properties of H q β,Θ (F, G) in the case G is a special function of F .We denote with B r (0) the ball with radius r centered at 0. Lemma 3.7.Let φ ∈ C ∞ b (R d ) be such that 1 B1(0) ≤ φ ≤ 1 B2(0) and set φ x (y) = φ(x − y).
For l ∈ N, p ≥ 1 and F, F ∈ S d , one has in which C ∈ C(l, p, d).Moreover for every F ∈ S d and V ∈ S one may find universal constants C, a, p ∈ C(q, l, p, d) such that for every multi index β with |β| = q where S F,Θ (p), Q F,Θ (l, p) and m l,p (Θ) are defined in (3.30).
Proof.We prove (3.37), (3.36) following with similar arguments.First, for a multi-index where C > 0 depends on l, d only and "β 1 , . . ., β l ∈ B α " means that β 1 , . . ., β l are non empty multi indexes of α running through the list of all of the (non empty) "blocks" of α.
So, straightforward computations give x is Lipschitz continuous, with a Lipschitz constant independent of x, it follows that and by using the Hölder inequality one gets (3.37).
As for (3.38), we first note that since φ x (y) ≡ 0 for |y − x| > 2 then (3.31) gives for every multi index α.So, for l ∈ N we can write Therefore, (3.38) is a consequence of the use of the Hölder inequality and of the estimate (3.32).
We recall that the Poisson kernel Q d is the solution to the equation ∆Q d = δ 0 in R d (δ 0 denoting the Dirac mass in {0}) and has the following explicit form: where a d is the area of the unit sphere in R d .By using the result in [2], we have the following Proposition 3.8.Let φ ∈ C ∞ b (R d ) be such that 1 B1(0) ≤ φ ≤ 1 B2(0) and set φ x (y) = φ(x − y).Let κ ∈ N * and assume that m κ,p (Θ) < ∞ for every p ≥ 1.Let F ∈ S d be such that S F,Θ (p) < ∞ for every p ≥ 1.
A. Let $Q_d$ be the Poisson kernel in $\mathbb{R}^d$ given in (3.40). Then for every $p > d$ there exists a universal constant $C \in \mathcal{C}(d, p)$ such that

where $k_{p,d} = (d - 1)/(1 - d/p)$ and $H_\Theta(F, 1)$ denotes the vector in $\mathbb{R}^d$ whose $i$th entry is given by $H_{i,\Theta}(F, 1)$.
B. Under P Θ , the law of F is absolutely continuous and has a density p F,Θ ∈ C κ−1 (R d ) whose derivatives up to order κ − 1 may be represented as for every multi index α with |α| = q ≤ κ − 1.
B. Denote by $\mu_{F,\Theta}$ the law of $F$ under $P_\Theta$ and let $\alpha$ denote a multi index with $|\alpha| = q$. By using arguments similar to the ones developed in Proposition 10 in [2], one easily gets (notations from that paper) And by recalling that $(-1)$ and, by using the Hölder inequality, for $p > d$ we have in which we have used (3.41). Now, by using (3.32) to estimate the first term and by applying (3.38) to the second one, (3.43) follows.

EJP 19 (2014), paper 110.

The distance between density functions and their derivatives
We now compare the probability density functions (and their derivatives) of two random variables under $P_\Theta$. In a different setting, this problem has already been considered in [13].
Proposition 3.9. Let $q \in \mathbb{N}$ and assume that $m_{q+2,p}(\Theta) < \infty$ for every $p \geq 1$. Let $F, G \in \mathcal{S}^d$ be such that (3.44) Then under $P_\Theta$ the laws of $F$ and $G$ are absolutely continuous with respect to the Lebesgue measure, with densities $p_{F,\Theta}$ and $p_{G,\Theta}$ respectively, and for every multi index $\alpha$ with $|\alpha| = q$ there exist constants $C, a, b, p' \in \mathcal{C}(q, d)$ such that

Proof. Throughout this proof, $C, p', a, b \in \mathcal{C}(q, d)$ will denote constants that can vary from line to line. By applying Proposition 3.8, under $P_\Theta$ the laws of $F$ and $G$ are both absolutely continuous with respect to the Lebesgue measure and for every multi index $\alpha$ with $|\alpha| = q$ one has By using (3.41), for $p > d$ we obtain so that the Hölder inequality gives
We now study $I_j$. For $\lambda \in [0, 1]$ we denote $F_\lambda = G + \lambda(F - G)$ and we use Taylor's expansion to obtain $< \infty$ for every $p$, we can use the integration by parts formula with respect to $F_\lambda$, so Therefore, by taking $p > d$ and by using again (3.41), (3.32) and (3.33), we get $l, p)$ being given in (3.30). We also use (3.36) and, by inserting everything, we can summarize by writing and the statement follows.
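The Taylor step in the proof above rests on a standard interpolation identity. The following is a minimal sketch of it for a generic smooth test function $f$ (a hypothetical name introduced here); how the resulting terms match the quantities $I_j$ of the proof is an assumption, since the corresponding displays are not reproduced.

```latex
% Interpolation between G and F along F_\lambda = G + \lambda (F - G).
% f is a generic smooth test function on R^d (illustrative notation).
\[
\frac{d}{d\lambda} f(F_\lambda)
  = \sum_{j=1}^{d} \partial_j f(F_\lambda)\,(F^j - G^j),
\qquad F_0 = G,\quad F_1 = F,
\]
% Integrating over \lambda in [0,1] gives the first-order Taylor formula
% with integral remainder:
\[
f(F) - f(G)
  = \sum_{j=1}^{d} \int_0^1 \partial_j f(F_\lambda)\,(F^j - G^j)\, d\lambda .
\]
```

Each summand contains a derivative $\partial_j f(F_\lambda)$, which is what makes the integration by parts formula with respect to $F_\lambda$ applicable term by term.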
Using the localizing function in (2.12) and applying Proposition 3.9, we get the following result.

Theorem 3.10. Let $q \in \mathbb{N}$. Assume that $m_{q+2,p}(\Theta) < \infty$ for every $p \geq 1$. Let $F, G \in \mathcal{S}^d$ be such that $S_{F,\Theta}(p), S_{G,\Theta}(p) < \infty$ for every $p \in \mathbb{N}$. Then under $P_\Theta$, the laws of $F$ and $G$ are absolutely continuous with respect to the Lebesgue measure, with densities $p_{F,\Theta}$ and $p_{G,\Theta}$ respectively. Moreover, there exist constants $C, a, b, p' \in \mathcal{C}(q, d)$ such that for every multi index $\alpha$ of length $q$ one has

$$|\partial_\alpha p_{F,\Theta}(y) - \partial_\alpha p_{G,\Theta}(y)| \leq C\, S_{F,\Theta}(p')^a\, S_{G,\Theta}(p')^a\, Q_{F,G,\Theta}(q+3, p')^a\, m^a_{q+2,p'}(\Theta) \times \big( \|F - G\|_{q+2,p',\Theta} + \|LF - LG\|_{q,p',\Theta} \big) \times \qquad (3.47)$$

For $\psi_a$ as in (2.12), we define

Hypothesis 4.1. We assume that $\gamma$, $g$, $h$ and $c$ are infinitely differentiable functions in both variables $z$ and $x$. Moreover we assume that $g$ and its derivatives are bounded and that $\ln h$ has bounded derivatives

We are now able to give our convergence result.
Proof. The proof is an easy consequence of the results in [5]; we use the estimates obtained there.
Step 1. In [5], Lemma 4, one proves that $X^M_t \to X_t$ in $L^1$, and hence $\lim_{M \to \infty} d_1(X_t, X^M_t) = 0$.
Step 2. Following [5], we consider an alternative representation of the law of $X^M_t$.
The random variable $X^M_t$ solution to (4.3) is a function of $(Z_1, \ldots, Z_{J^M_t})$ but it is not a simple functional, as defined in Section 2.1, because the coefficient $c_M(z, x) 1_{(u,\infty)}(\gamma(z, x))$ is not differentiable with respect to $z$.
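The implication used in Step 1, from $L^1$ convergence to convergence in $d_1$, can be sketched as follows. This assumes that $d_1$ is the Fortet-Mourier distance, that is, a supremum over test functions $f$ with $\|f\|_\infty + \|\nabla f\|_\infty \leq 1$; the precise definition is in display (2.31), which is not reproduced here.

```latex
% Assumption: d_1(F,G) = sup { |E f(F) - E f(G)| : ||f||_inf + ||grad f||_inf <= 1 }.
% For any such f, the mean value inequality gives
\[
|E f(X_t) - E f(X^M_t)|
  \leq \|\nabla f\|_\infty \, E|X_t - X^M_t|
  \leq E|X_t - X^M_t| .
\]
% Taking the supremum over admissible f:
\[
d_1(X_t, X^M_t) \leq E|X_t - X^M_t| \xrightarrow[M \to \infty]{} 0 .
\]
```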

(2.31) So $d_{FM} = d_1$ and $d_{TV} = d_0$. Our basic estimate is the following. For $F \in \mathcal{S}^d$ we denote $A_l(F) := \|F\|_{3,l} + \|LF\|_{1,l}$ (2.32)

Theorem 2.7. Let $k \in \mathbb{N}$. There exist universal constants $C, l, b \in \mathcal{C}(d, k)$ such that for every $F, G \in \mathcal{S}^d$ with $A_p(F), A_p(G) < \infty$ for all $p \in \mathbb{N}$, and every $\varepsilon > 0$ one has

$\ldots\, ds_1 \cdots ds_k$ and $|F|_m^2 = |F|^2 + \sum_{k=1}^{m} |D^{(k)} F|^2$. (2.46)

So, $D^{m,p}$ is the closure of the space of the simple functionals with respect to the Malliavin Sobolev norm $\|F\|^p_{m,p} = E(|F|^p_m)$ (2.47)

We set $D^{m,\infty} = \cap_{p \geq 1} D^{m,p}$ and $D^\infty = \cap_{m \geq 1} D^{m,\infty}$. Moreover, for $F \in (D^{1,2})^d$, we let $\sigma_F$ denote the Malliavin covariance matrix associated to $F$:

Remark 2.15. The arguments used in Remark 2.4 can be applied here: for $|y| > 4$, the second factor in the estimates (2.52) and (2.53) can be replaced with the tails of the laws of $F$ and $G$. Also, by using the Markov inequality, such factors can be bounded from above by any power of $(1 + |y|)^{-1}$, for every $y \in \mathbb{R}^d$.
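A minimal sketch of the tail argument follows. It assumes that the factor in question is a probability of a localization event of the form $\{|F - y| < 2\}$ and that $F$ has finite moments of order $k$; both are assumptions, since (2.52) and (2.53) are not reproduced here.

```latex
% If |y| > 4 and |F - y| < 2, then |F| >= |y| - |F - y| > |y| - 2 >= |y|/2, so
\[
P(|F - y| < 2) \leq P\big(|F| > |y|/2\big).
\]
% Markov inequality with exponent k, then 1 + |y| <= (5/4)|y| for |y| > 4:
\[
P\big(|F| > |y|/2\big)
  \leq \frac{2^k\, E|F|^k}{|y|^k}
  \leq \Big(\frac{5}{2}\Big)^{k} \frac{E|F|^k}{(1 + |y|)^{k}} .
\]
% For |y| <= 4 the trivial bound P(...) <= 1 <= 5^k (1 + |y|)^{-k} applies,
% so a bound of order (1 + |y|)^{-k} holds for every y in R^d.
```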

Theorem 2.19. Let $k \in \mathbb{N}$. There exist universal constants $C, p, b \in \mathcal{C}(d, k)$ such that for every $F, G \in (D^{3,\infty})^d$ and every $\varepsilon > 0$ one has

Finally $\limsup_n I_{N,\varepsilon,n} = 0$ for each fixed $N$ and $\varepsilon$. So we obtain $\limsup_n$
B. Let $F, G \in \mathcal{S}^d$ be such that $U_{q+1,p,\Theta}(F), U_{q+1,p,\Theta}(G) < \infty$ for every $p \in \mathbb{N}$ and let $p_{F,\Theta}$ and $p_{G,\Theta}$ be the densities of the laws of $F$ and $G$, respectively, under $P_\Theta$. There exist $C, a, b, p' \in \mathcal{C}(q, d)$ such that for every $y \in \mathbb{R}^d$ and every multi index $\alpha = (\alpha_1, \ldots, \alpha_k) \in \{1, \ldots, d\}^k$, $0 \leq k \leq q$, one has
$(F, G)$ being defined in (3.34). By using (3.37) and the quantities $S_{F,G,\Theta}(p)$ and $Q_{F,G,\Theta}(k, p)$, for a suitable $a > 1$ and $p > d$ we can write