The realization of positive random variables via absolutely continuous transformations of measure on Wiener space

Let $\mu$ be a Gaussian measure on some measurable space $\{W=\{w\},{\mathcal{B}}(W)\}$ and let $\nu$ be a measure on the same space which is absolutely continuous with respect to $\mu$. The paper surveys results on the problem of constructing a transformation $T$ on the space $W$ such that $Tw=w+u(w)$, where $u$ takes values in the Cameron-Martin space and the image of $\mu$ under $T$ is $\nu$. In addition we ask for the existence of transformations $T$ belonging to some particular classes.


Introduction
The main topic addressed in this paper can be outlined as follows: Let $(W, \mathcal{B}(W))$ be a measurable space on which we are given two probability measures $\rho$ and $\nu$. Assume that $\nu$ is absolutely continuous with respect to $\rho$. Is there a measurable transformation $T : W \to W$ such that the image of $\rho$ under $T$ is $\nu$? Of course at this generality the answer will be too vague to be of any interest, hence let us provide the space $(W, \mathcal{B}(W))$ with more analytical structure, in such a way that we can construct at least a Gaussian measure on it.
We shall call the triple $(W, H, \mu)$ an abstract Wiener space (AWS) if $W$ is a separable Fréchet space and $H$ is a separable Hilbert space, called the Cameron-Martin space, which is densely and continuously embedded in $W$. We also identify $H$ with its continuous dual, hence the embeddings $W^* \hookrightarrow H^* = H \hookrightarrow W$ are all dense and continuous. $(W, \mathcal{B}(W))$ is equipped with a probability measure $\mu$ such that for every $e$ in $W^*$, ${}_{W^*}\langle e, w\rangle_W$ has the law $N(0, |e|_H^2)$ under $\mu$. Although in the literature $W$ is in general supposed to be a Banach space, it is more interesting to assume $W$ to be a (separable) Fréchet space: since such a space is a countable projective limit of Banach spaces, all the properties of abstract Banach-Wiener spaces extend immediately to our case. Another justification of this choice lies in the fact that neither the classical Wiener space $C_0(\mathbb{R}_+, \mathbb{R})$ nor $\mathbb{R}^{\mathbb{N}}$ are Banach spaces (of course they are Fréchet!) although we need them badly in stochastic analysis.$^1$

Let $T$ be a measurable transformation of $W$. The problem of determining whether the induced measure $\nu = T\mu$, where $T\mu(A) = \mu(T^{-1}A)$, is absolutely continuous with respect to $\mu$, and of finding the corresponding Radon-Nikodym derivative, in particular in the case where $T$ is a perturbation of identity, i.e., $w \mapsto T(w) = w + u(w)$ where $u : W \to H$, has been treated extensively (cf. [23] and the references therein). In this paper we consider the converse problem; namely, given a positive random variable $L$ on $(W, H, \mu)$ such that $E[L] = 1$, is there a measurable transformation $T$ on $W$ such that $dT\mu/d\mu = L$? In addition to asking for any $T$, we will also ask for the existence of transformations $T$ belonging to some particular classes. To be more specific we set:

Definition 1.1. Let $(W, H, \mu)$ be an abstract Wiener space and let $\nu$ be a measure on $W$ which is absolutely continuous with respect to $\mu$, i.e. $d\nu = L\, d\mu$ (or let $L$ be a r.v. such that $L \ge 0$ a.s. and $E_\mu[L] = 1$). Then:
a) The measure $\nu$ (or the r.v. $L$) will be said to be realizable or representable if there exists a measurable transformation $T$ such that $T\mu = \nu$. Such a $T$ will be called a realization of $L = d\nu/d\mu$. This is equivalent to the validity of $E[g \circ T] = E[L\, g]$ for any nice $g$ on $W$.
b) The measure $\nu \ll \mu$, or the corresponding Radon-Nikodym derivative, will be said to be "realizable (or representable) by a shift" if there exists an $H$-valued random variable $u$ (a "shift") such that, for $Tw = w + u(w)$, $T\mu = \nu$.

$^1$ Readers who prefer to consider the classical Wiener space may, throughout the paper, suppose $W$ to be the standard $d$-dimensional Wiener space $C_0([0, 1], \mathbb{R}^d)$ with (a) $(t, w) \mapsto w(t) = W_t(w)$ being the Wiener process and $W$ endowed with the supremum norm $\|w\| = \sup_{t \in [0,1]} |w(t)|$ […]
In the class (b) we will also consider several particular types of realizations: (b1) The shift u is adapted to some particular filtration on W .
(b2) The transformation T is "triangular" (to be defined in section 5).
(b3) The shift u is the gradient of some random variable ϕ.
In addition to these realization problems we will also consider two closely related realization problems which will be described at the end of this section.
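The notion of realization by a shift can be checked by hand in one dimension. The following is a minimal sketch, not part of the original text: it assumes $W = H = \mathbb{R}$, $\mu = N(0,1)$, the density $L(x) = e^{mx - m^2/2}$ and the constant shift $T(x) = x + m$ (the value of `m`, the test function `g` and the helper `gauss_expect` are all hypothetical choices made for the illustration). It verifies numerically that $E[g \circ T] = E[L\, g]$, which is the statement $T\mu = L \cdot \mu$.

```python
import math

def gauss_expect(f, lo=-12.0, hi=12.0, n=20000):
    """Approximate E[f(X)] for X ~ N(0, 1) by the composite trapezoid rule."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * f(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

m = 0.7  # hypothetical constant shift (an element of H = R)

def L(x):
    # Candidate density d(nu)/d(mu); here nu = N(m, 1).
    return math.exp(m * x - 0.5 * m * m)

def T(x):
    # Perturbation of identity by the constant shift m.
    return x + m

def g(x):
    # An arbitrary smooth test function.
    return math.cos(x) + x * x

lhs = gauss_expect(lambda x: g(T(x)))      # E[g o T]
rhs = gauss_expect(lambda x: L(x) * g(x))  # E[L g]
exact = math.cos(m) * math.exp(-0.5) + 1.0 + m * m  # closed form of both sides
```

Both `lhs` and `rhs` agree with the closed-form value `exact` up to quadrature error, so in this toy case $T$ is a realization of $L$ in the sense of Definition 1.1.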
We turn now to an outline of the paper. In the next section we summarize the notions of gradient and divergence on the Wiener space and some associated relations. In Section 3 we consider the problem of realization by adapted shifts, i.e. case (b1). This problem was introduced and treated by Üstünel and Zakai (cf. Section 2.7 of [23]). Consider first the classical setup of the Wiener space, let $\mathcal{F}_t$ be the sub-sigma-field induced by $\{w_\theta, \theta \le t\}$ and consider the realization of $L$ by $Tw = w + u(w)$, where $u$ is required to be a.s. in $H$ and adapted to the filtration $(\mathcal{F}_t, t \in [0, 1])$. In an abstract Wiener space there is no natural filtration; however, we can adjoin to the AWS a filtration induced by a resolution of the identity on $H$ (cf. e.g. [23]). In [23], among other results, it is proven that, given $L$, we can find a sequence of adapted $H$-valued shifts, say $u_n$, such that the measures $T_n\mu$ induced by them converge to $\nu$ in total variation, i.e. the corresponding Radon-Nikodym derivatives $L_n = dT_n\mu/d\mu$ converge to $L$ in $L^1(\mu)$; the supremum defining the total variation is taken over all finite partitions of $W$. In Section 4 we present the result of Fernique [7]; in our context it states that for any $\nu \ll \mu$ there are a probability space $(\Omega, \mathcal{F}, P)$ and three random variables $T$, $B$ and $Z$ defined on this space such that $T$ and $B$ are $W$-valued, $Z$ is $H$-valued, $T(P) = \nu$, $B(P) = \mu$ and $T = B + Z$.
Section 5 presents the work of Bogachev, Kolesnikov and Medvedev [1,2] and is based on transforming $\mu$ into $\nu$ by a "triangular transformation". This approach yields not only that every measure which is absolutely continuous with respect to the Wiener measure of an abstract Wiener space is realizable, but also that it is realizable by a shift, without any further restrictions on $L$.
Section 6 follows the results of Feyel and Üstünel [9,10]. It has its origin in the modern theory of optimal transportation (cf. e.g. [3,11,17,18,24]): given a probability measure on $\mathbb{R}^n$ which is absolutely continuous with respect to the Lebesgue measure, it can be transformed into any probability measure $\beta$ on $\mathbb{R}^n$ by a transformation $T(x) = \nabla\Phi(x)$, where $\Phi$ is a convex function, and this transformation is optimal in a certain sense. In [9] and [10] Feyel and Üstünel extended these transportation results to the infinite dimensional setting, and some of these results are briefly explained here. In particular, the case of the Wiener measure can be formulated as follows: if a probability measure $\nu$ on $W$ is at finite Wasserstein distance from the Wiener measure $\mu$, where the Wasserstein distance is defined with respect to the Hilbert norm of the Cameron-Martin space $H$ (which is not continuous on $W$), then there exists a Sobolev differentiable Wiener function $\varphi$ such that the transformation defined by $T = I_W + \nabla\varphi$ maps $\mu$ to $\nu$ (note that the absolute continuity of $\nu$ with respect to $\mu$ is not needed). We indicate some applications of these results to the polar factorization of the absolutely continuous perturbations of identity. We further extend these results by defining the notion of measures which are at locally finite Wasserstein distance from each other, and prove that if at least one of them is a spread measure, then the other one can be written as the image of the spread one under a locally cyclically monotone map of the form of a perturbation of identity.

Section 7 deals with the following problem. Let $y = u(w') + w$, where $w$ and $w'$ represent independent Wiener processes and $u(w')$ takes values in the Cameron-Martin space, i.e. $u$ represents a "signal" and $w$ is an additive "noise" which is independent of the signal. Let $\nu_y$ denote the measure induced by $y$ on $W$ and let $L(w)$ denote the Radon-Nikodym derivative (or likelihood ratio) $d\nu_y/d\mu$.
This model is relevant to many communication and information theory problems, which makes it interesting to consider the converse problem: given a positive random variable $L(w)$ on $(W, H, \mu)$ satisfying $E[L] = 1$, under what conditions does there exist a "signal plus independent noise" realization such that the Radon-Nikodym derivative of $\nu_y$ with respect to $\mu$ is $L$? This problem was first considered in [25] and further clarified in [16].

Remark: A problem which is also related to the problems considered in this paper is that of the realization of the random variable $L(w) = 1$, i.e. the class of measure preserving transformations on the Wiener space. This problem is treated in [23] and [26] and will not be discussed in this paper.

Preliminaries
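The gradient: Let $(W, H, \mu)$ be an AWS and let $e_i$, $i = 1, 2, \ldots$ be a sequence of elements in $W^*$. Assume that the images of the $e_i$ in $H$ form a complete orthonormal basis of $H$. Let $f(x_1, \ldots, x_n)$ be a smooth function on $\mathbb{R}^n$ and denote by $f'_i$ the partial derivative of $f$ with respect to the $i$-th coordinate. We will denote ${}_{W^*}\langle e, w\rangle_W$ by $\delta e$. For cylindrical smooth random variables $F(w) = f(\delta e_1, \ldots, \delta e_n)$, define $\nabla_h F = \frac{d}{d\varepsilon} F(w + \varepsilon h)\big|_{\varepsilon = 0}$. Therefore we set the following:

$$\nabla F = \sum_{i=1}^n f'_i(\delta e_1, \ldots, \delta e_n)\, e_i .$$

It can be shown that this definition is closable in $L^p(\mu)$ for any $p > 1$. Hence it can be extended to a wider class of functions. We will restrict ourselves to $p = 2$; consequently the domain of the $\nabla$ operation can be extended to all functions $F(w)$ for which there exists a sequence of smooth cylindrical functions $F_m$ such that $F_m \to F$ in $L^2$ and $\nabla F_m$ is Cauchy in $L^2(\mu, H)$. In this case set $\nabla F$ to be the $L^2(\mu, H)$ limit of $\nabla F_m$. This class of r.v. will be denoted $D_{2,1}$. It is a closed linear space under the graph norm of $\nabla$, i.e. the norm built from $\|F\|_{L^2(\mu)}$ and $\|\nabla F\|_{L^2(\mu, H)}$. Similarly let $K$ be a Hilbert space and $k_1, k_2, \ldots$ a complete orthonormal basis in $K$. Let $\varphi$ be a smooth $K$-valued functional, i.e. a finite linear combination of the $k_j$ with smooth cylindrical coefficients, and denote by $D_{2,1}(K)$ the completion of the class of such $\varphi$ under the analogous norm. For the case where $W = C_0[0, 1]$, $\nabla F$ is of the form $\nabla F = \int_0^\cdot \alpha_s\, ds$. We will denote $\alpha_s$ by $D_s F$, and therefore for any $h = \int_0^\cdot \dot h_s\, ds \in H$,

$$\nabla_h F = (\nabla F, h)_H = \int_0^1 D_s F\, \dot h_s\, ds .$$

The divergence (the Skorohod integral): Let $v = \sum_{i=1}^n v_i \rho_i$ be a smooth vector field on $\mathbb{R}^n$, where the $\rho_i$ are orthonormal vectors in $\mathbb{R}^n$. Recall first the following "integration by parts formula", valid for smooth $f$ of compact support:

$$\int_{\mathbb{R}^n} (\nabla f, v)\, dx = -\int_{\mathbb{R}^n} f\, \mathrm{div}\, v\, dx, \qquad (2.4)$$

where div is the divergence: $\mathrm{div}\, v = \sum_{i=1}^n \partial v_i/\partial x_i$. Note that equation (2.4) deals with integration with respect to the Lebesgue measure on $\mathbb{R}^n$. Here we are looking for an analog of the divergence operation on $\mathbb{R}^n$ which will yield an integration by parts formula with respect to the Wiener measure.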
Definition 2.1. Let $u(w)$ be an $H$-valued r.v. in $(W, H, \mu)$. $u$ will be said to be in $\mathrm{dom}_2\, \delta$ if $E|u(w)|_H^2 < \infty$ and there exists a r.v., say $\delta u$, such that for all smooth functionals $f(\delta e_1, \ldots, \delta e_n)$ and all $n$ the "integration by parts" relation

$$E[(\nabla f, u)_H] = E[f\, \delta u] \qquad (2.5)$$

is satisfied. $\delta u$ is called the divergence or Skorohod integral.
A necessary and sufficient condition for a square integrable $u(w)$ to be in $\mathrm{dom}_2\, \delta$ is that, for some $\gamma = \gamma(u)$,

$$|E[(\nabla f, u)_H]| \le \gamma\, \big(E[f^2]\big)^{1/2}$$

for all smooth $f$. For non-random $h \in W^*$, $\delta h = \langle h, w\rangle$; setting $f = 1$ in (2.5) yields that $E\,\delta h = 0$. It can be shown that under proper restrictions […] (2.6) Consequently, if $E|u|$ […]

Among the interesting facts about the divergence operator, let us also note that for the classical Brownian motion, if $(u(w))(\cdot) = \int_0^\cdot \dot u_s(w)\, ds$ where $\dot u_s(w)$ is adapted and square integrable, then $\delta u$ coincides with the Ito integral, i.e.

$$\delta u = \int_0^1 \dot u_s\, dW_s . \qquad (2.9)$$
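The integration by parts relation defining the divergence can be illustrated in one dimension. The sketch below is an illustration only and rests on stated assumptions: $W = H = \mathbb{R}$, $\mu = N(0,1)$, and the one-dimensional Gaussian divergence $\delta u(x) = x\,u(x) - u'(x)$, which follows from integrating by parts against the Gaussian density. The functions `f`, `u` and the quadrature helper `gauss_expect` are hypothetical choices. It checks $E[f'\,u] = E[f\,\delta u]$ and $E[\delta u] = 0$ numerically.

```python
import math

def gauss_expect(f, lo=-12.0, hi=12.0, n=20000):
    """Approximate E[f(X)] for X ~ N(0, 1) by the composite trapezoid rule."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * f(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

def f(x):
    # Smooth test functional.
    return x * x + math.sin(x)

def df(x):
    # Its derivative (the one-dimensional gradient).
    return 2.0 * x + math.cos(x)

def u(x):
    # The "shift" u(x) = x, so that u'(x) = 1.
    return x

def div_u(x):
    # One-dimensional Gaussian divergence: x * u(x) - u'(x).
    return x * u(x) - 1.0

lhs = gauss_expect(lambda x: df(x) * u(x))     # E[f' u]
rhs = gauss_expect(lambda x: f(x) * div_u(x))  # E[f * delta u]
mean_div = gauss_expect(div_u)                 # E[delta u], expected to vanish
```

With these choices both sides equal 2 (the odd terms integrate to zero and $E[X^4] - E[X^2] = 2$), and the divergence has mean zero, as obtained by taking $f = 1$.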

Realization of positive functionals by adapted shifts
Let $(W, H, \mu)$ be the classical 1-dimensional Wiener space. The expectation operator $E$ and "a.s." will be with respect to the measure $\mu$. Let $\nu$ be a probability measure on $W$ with $\nu \ll \mu$. Set $L(w) = d\nu/d\mu$. Then by the integral representation for $L \in L^1(\mu)$ […] where $\dot\xi_\cdot$ is adapted, and […] Further assume that […] and […] The relation between the shift $Tw = w + \int_0^\cdot \dot\alpha_s(w)\, ds$ and $L$ is given by the Girsanov theorem as […] We have also the following result which assures the realization of $\nu = L \cdot \mu$:

Proposition 3.1. Let $\nu$, $L$, $\dot\alpha$ and $T$ be as defined above. Assume that the process […] in the strong sense and that […]

Proof. It is clear from (3.6) that $T \circ S(w) = w$ almost surely. Moreover, it follows from the hypothesis (3.7) that the random variable […] almost surely. To see that $S\mu = \nu$, we have, for any nice function $f$ on $W$, […] by the relation (3.9). To show that $S \circ T = I_W$ almost surely, note first that $S$ is almost surely surjective: in fact, using the Girsanov theorem, we get […] Hence $S(W) = W$ almost surely. Moreover, on the set $W_1 = \{w \in W : T \circ S(w) = w\}$ the map $S$ is injective and $\mu(W_1) = 1$, and this completes the proof of (i). For the proof of (ii), the first part follows from (i) and from the Girsanov theorem: […] for any nice $f$ on $W$. For the second part we know already that $L \circ S\, M = 1$ and that $T\mu \sim \mu$ since $M > 0$ $\mu$-a.s. Hence it suffices to compose this identity with $T$.
In the converse direction we have the following result, which is based on the assumption of invertibility of the adapted, absolutely continuous shift.
Assume that $\nu$ is realizable by […] where $u_\cdot$ is adapted, […]

Proof. Since $T$ is right invertible, by (3.10) we have, by the Girsanov theorem (cf. [23]), […]

A complete characterization of the representability of Radon-Nikodym derivatives by adapted shifts is not known at present. We have, however, the following result (cf. Section 2.7 of [23] for the proof): The set of measures that are realizable by adapted shifts is dense, under the total variation norm, in the space of measures which are absolutely continuous with respect to the Wiener measure; i.e. if on an abstract Wiener space $\nu \ll \mu$, then there exists a sequence of invertible and adapted shifts $T_n$ such that $dT_n\mu/d\mu$ converges to $d\nu/d\mu$ in the $L^1$ norm.
In the rest of this section we shall make some further observations about the problem. First let us give a necessary and sufficient condition for the representability of a positive random variable: […] almost surely.

Proof. Necessity: from the Girsanov theorem, we have, for any $f \in C_b(W)$, […] This implies that $\theta L = 1$ almost surely, hence $\theta$ […] which completes the proof.

[…] has a left inverse $U_l = I_W + u$, where $u$ is also adapted and $U_l$ realizes $L$. In fact $U_l$ is also a right inverse almost surely.
Proof. The hypotheses about $v$ imply that the mapping $V$ has a left inverse $U_l$, thanks to the Leray-Schauder degree theorem (cf. [23], p. 247, Theorem 9.4.1 and Remark 9.4.4). It is clear that $U_l$ is of the form $I_W + u$, with $\dot u$ adapted. Consequently, using the Girsanov theorem, we get, for any $f \in C_b(W)$, […] To show that $U_l$ is also an almost sure right inverse, let us remark that it suffices to prove the surjectivity of $V$, since it is already injective almost surely by the existence of a left inverse. To show the surjectivity we need to prove that $V(W) = W$ almost surely. First, the fact that $V(W)$ is measurable with respect to the universal completion of $\mathcal{B}(W)$ is given in Theorem 4.2.1 of [23]. It then follows from the Girsanov theorem that […]

We have also the following result: […] is a Wiener process under $\mu$, with respect to its own filtration. If it is also a martingale with respect to the canonical filtration, then $U \circ V = I_W$ almost surely. In other words $U$ is a left inverse of $V$ and it realizes $L$. In fact $U$ is also a right inverse $\mu$-almost surely.

Proof. For any […] from the Girsanov theorem, and this shows that $U \circ V$ is a rotation. Consequently it can be written as […] where $B$ is a Wiener process with respect to its own filtration. Besides, writing the above relation in detail, we get […] If $B$ is also a (continuous) martingale with respect to the canonical filtration, then the adapted and finite variation process $v + u \circ V$ becomes a martingale, hence it is almost surely zero, and this implies that $U \circ V = I_W$ almost surely.

To complete the proof of the realization of $L$ by $U$ it suffices to reason as in the proof of Proposition 3.5. The proof of the fact that $U$ is also a right inverse a.s. is also similar to that of Proposition 3.5.

[…]

Proof. From the Girsanov theorem, we have, for any $f \in C_b(W)$, […] almost surely. Note that […] From the equation (3.13) we get […] Taking the expectation of both sides we obtain […] almost surely, i.e., $I_W = I_W + v + u \circ V = U \circ V$ almost surely, which means that $U$ is a left inverse of $V$. One can show that it is also a right inverse, as we have already done in the preceding proofs.

The results of the following lemma are not new (cf. [21,22,23] and the references there); we give a quick proof for the sake of completeness: […] for any $\lambda \ge 0$.
Proof. We shall give the proof in the frame of a classical Wiener space; the abstract Wiener space can be reduced to this case as explained in Chapter 2 of [23]. Remark that there is no information about the integrability of $F$. In fact this is a consequence of the hypothesis (4.1), as explained below: Let $F_n = |F| \wedge n$, $n \in \mathbb{N}$. A simple calculation shows that […] hence $F_n \in D_{p,1}$ for any $p > 1$ and $|\nabla F_n| \le c$ almost surely. We have from the Itô-Clark formula (cf. [21]), […] where $D_s F$ is as defined by (3.3). From the definition of the stochastic integral, we have […] Since $F_n$ converges to $|F|$ in probability, and the stochastic integral is bounded in $L^2(\mu)$, by taking the difference we see that $(E[F_n], n \in \mathbb{N})$ is a sequence of (degenerate) random variables bounded in the space of random variables under the topology of convergence in probability, denoted by $L^0(\mu)$. Therefore $\sup_n \mu\{E[F_n] > t\} \to 0$ as $t \to \infty$. Hence $\lim_n E[F_n] = E[|F|]$ is finite. Now we apply Fatou's lemma to obtain that $F \in L^2(\mu)$. Since the distributional derivative of $F$ is a bounded random variable, $F \in D_{p,1}$ for any $p \ge 1$.

[…] Note that, since $f_n \in \bigcap_{p,k} W_{p,k}(\mathbb{R}^n, \mu_n)$, the Sobolev embedding theorem implies that, after a modification on a set of null Lebesgue measure, $f_n$ can be chosen in $C^\infty(\mathbb{R}^n)$. Let $(B_t;\ t \in [0, 1])$ be an $\mathbb{R}^n$-valued Brownian motion. Then […] where $P$ is the canonical Wiener measure on $C([0, 1], \mathbb{R}^n)$ and $Q_t$ is the heat kernel associated to $(B_t)$, i.e. […]

From the Ito formula, we have […] By definition […] The Doob-Meyer process $(\langle M^n, M^n\rangle_t,\ t \in \mathbb{R}_+)$ of the martingale $M^n$ can be controlled as […] Hence from the exponential Doob inequality, we obtain $P\{\sup$ […]

Feyel,Üstünel, Zakai/Realization of positive random variables on Wiener space 184
Consequently […] Since $\varphi_n \to \varphi$ in probability, the proof is completed.

The following proposition will play an intermediate role in the proof of Theorem 4.1:

Proposition 4.2. Let $\mu$ be the standard Gaussian measure on $\mathbb{R}^n$ and let $\nu$ be another probability on $\mathbb{R}^n$ such that $\nu \ll \mu$ and that the Radon-Nikodym derivative $d\nu/d\mu$ is essentially bounded by a constant $A < \infty$. Then there exist a Gaussian random variable $B$ with values in $\mathbb{R}^n$, and an $\mathbb{R}^n$-valued random variable $Z$, both defined on $(\Omega, \mathcal{H}, P)$, satisfying […] and the probability measure on $\mathbb{R}^n$ induced by $T = B + Z$ is $\nu$.

Proof. For the proof we shall use basically the Kantorovitch-Rubinstein theorem: Let $g : \mathbb{R}^n \to \mathbb{R}$ be Lipschitz with Lipschitz constant 1. By Lemma 4.1, and denoting by $E_\mu$, $E_\nu$ the expectations with respect to the measures $\mu$ and $\nu$ respectively: […] then, from the Fubini theorem, […] Therefore by (4.2) […] Now let $N(\mu, \nu)$ be the Rubinstein distance between $\mu$ and $\nu$: […] where the infimum is taken over the set $\Sigma$ which consists of all the probability measures $M$ on $W \times W$ for which $M(dx, W) = \mu(dx)$ and $M(W, dy) = \nu(dy)$. Note that the set $\Sigma$ is compact with respect to the weak topology of the measures on $W \times W$, and the integral on the right-hand side of (4.5) is lower semicontinuous under this topology. Hence there is at least one measure $M_0$ which realizes the infimum, provided that the infimum is finite. Moreover, by the Kantorovitch-Rubinstein theorem (cf. e.g. […] It follows by weak convergence and the Skorohod representation theorem (cf. e.g. Theorem 7.1.4 and Sections 11.5, 11.7 of [6]) that there exist a probability space $(\Omega, \mathcal{H}, P)$ and a sequence of random variables $(T'_N, B'_N,\ N \in \mathbb{N})$ which converges a.s. to $(T', B')$ such that the laws of $T'$ and $B'$ under the probability $P$ are $\nu$ and $\mu$ respectively and moreover, by Fatou's lemma and by the lower semicontinuity of the Cameron-Martin norm with respect to the topology of $W$, […]

In order to complete the proof it remains to remove the restriction about the boundedness of the density $L$. Let $A_0 = 0$, $A_n = 2^{n-1}$ and $W_n = \{w :$ […] Applying the results for $\mu$, $\nu_n$ as above yields, for any $n \ge 1$, $(T''_n, B''_n)$ defined on $(\Omega_n, \mathcal{H}_n, P_n)$ such that $T''_n(P_n) = \nu_n$ and $B''_n(P_n) = \mu$ with […] Let $\Omega$ be the disjoint union of $(\Omega_n, n \ge 1)$ and define $P$ […]

Triangular transformations
A map $T : \mathbb{R}^n \to \mathbb{R}^n$ is said to be triangular if . . .

Proof. We proceed by induction on the dimension: (a) Consider first the case $n = 1$. Let $F_\mu(x)$ and $F_\nu(x)$, $x \in \mathbb{R}^1$, denote the distribution functions of $\mu$ and $\nu$ respectively. Assume that $F_\nu$ is strictly increasing; then for every $x \in \mathbb{R}^1$ there is a unique $y \in \mathbb{R}^1$ such that $F_\mu(x) = F_\nu(y)$. Set $T_{\mu,\nu}(x) = y$; then $F_\nu(T_{\mu,\nu} x) = F_\mu(x)$. Since $F_\nu$ is strictly increasing, $F_\nu$ is invertible, and $T_{\mu,\nu} = F_\nu^{-1} \circ F_\mu$ transforms $\mu$ into $\nu$ and is strictly increasing. If $F_\nu$ is not strictly increasing, set $F_\nu^{-1}(y) := \inf\{\beta : F_\nu(\beta) \ge y\}$.

Note that with this interpretation of $F_\nu^{-1}$, $T_{\mu,\nu}$ is increasing and equation (5.1) proves the proposition for $n = 1$. (b) Consider now the case $n = 2$; let $\varphi(x_1, x_2)$ and $\psi(x_1, x_2)$ denote the probability densities with respect to the Lebesgue measure of $\mu$ and $\nu$ respectively. Let $\varphi_{x_1}(x_2)$ and $\psi_{x_1}(x_2)$ denote the regular probability densities of $x_2$ conditioned on $x_1$, i.e. […]
(c) For $n > 2$ the result follows by induction, using the conditional density of $x_n$ given $x_1, \ldots, x_{n-1}$.

It follows from the proof by induction that:

Corollary 5.2. Proposition 5.1 holds for $\mathbb{R}^\infty$, provided that the projections of the measures $\mu$ and $\nu$ to $\mathbb{R}^n$ are absolutely continuous with respect to the Lebesgue measure for any $n \ge 1$.
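Step (a) of the proof, including the case where $F_\nu$ is not strictly increasing, can be sketched numerically as follows. This is an illustration under stated assumptions, not the authors' construction in code: $\mu$ is taken to be $N(0,1)$, $\nu$ is a hypothetical target (the uniform law on $[0,1] \cup [2,3]$, whose distribution function is flat on $[1,2]$), and the generalized inverse $\inf\{\beta : F_\nu(\beta) \ge y\}$ is computed by bisection on a bracket chosen by hand.

```python
import math

def F_mu(x):
    """Distribution function of the standard Gaussian measure on R."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def F_nu(b):
    """Distribution function of the uniform law on [0,1] U [2,3] (flat on [1,2])."""
    if b <= 0.0:
        return 0.0
    if b <= 1.0:
        return 0.5 * b
    if b <= 2.0:
        return 0.5
    if b <= 3.0:
        return 0.5 + 0.5 * (b - 2.0)
    return 1.0

def generalized_inverse(F, y, lo=-1.0, hi=4.0, iters=80):
    """Compute inf{b : F(b) >= y} by bisection, assuming F(lo) < y <= F(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if F(mid) >= y:
            hi = mid   # mid belongs to the set {b : F(b) >= y}
        else:
            lo = mid   # mid lies strictly below the infimum
    return hi

def T(x):
    """Increasing rearrangement sending mu to nu in dimension one."""
    return generalized_inverse(F_nu, F_mu(x))

xs = [-2.0, -1.0, -0.3, 0.0, 0.4, 1.5]
images = [T(x) for x in xs]
```

Since $F_\nu$ is continuous, $F_\nu(T(x)) = F_\mu(x)$ at every point, which is the defining relation of step (a); the map is increasing and jumps across the gap $(1, 2)$ where $\nu$ has no mass.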

The realization of a Radon-Nikodym derivative by a triangular transformation:
Let $(W, H, \mu)$ be an AWS. Let $(e_i, i \ge 1) \subset W^*$ be an orthonormal basis$^2$ for $H$. By the Ito-Nisio theorem (cf. e.g. [15] or [23]),

$$\sum_{i=1}^N \langle e_i, w\rangle\, e_i \to w \quad \text{almost surely in } W$$

as $N \to \infty$. Let $\nu \ll \mu$; then, as shown in the previous section, we can construct an increasing triangular transformation $T$ : […] such that the measure induced by $Tw$ on $W$ is $\nu$. This yields a measurable transformation which induces $\nu$, and gives another proof of the result of Section 4 that if $\nu \ll \mu$ then $\nu$ is realizable.

The entropy associated with a triangular transformation:
Let $\mu$ denote the standard Gaussian measure on $\mathbb{R}^n$ and let $T$ denote a $C^1$ diffeomorphic map on $\mathbb{R}^n$. By the change of variables formula, it holds that for every $g \in C_b(\mathbb{R}^n)$ […] where $\Lambda$ is given as follows: set $T(x) = x + u(x)$, then […] (cf. e.g. [23]). The Carleman-Fredholm determinant of an $n \times n$ matrix $I_{\mathbb{R}^n} + A$, $\det_2(I_{\mathbb{R}^n} + A)$, is defined as

$$\det{}_2(I_{\mathbb{R}^n} + A) = \det(I_{\mathbb{R}^n} + A)\, e^{-\operatorname{trace} A},$$

and $\Lambda$ can be rewritten as […] and by (2.8): […] The reason for rewriting $\Lambda$ with the modified Carleman-Fredholm$^3$ determinant lies in the fact that $\det_2(I + A)$ is well-defined for Hilbert-Schmidt operators; in fact $A \mapsto \det_2(I + A)$ is analytic on the space of Hilbert-Schmidt operators on an infinite dimensional Hilbert space, while $A \mapsto \det(I + A)$ is defined only for nuclear (or trace class) operators. Assume now that the measure induced by $T$ is absolutely continuous with respect to $\mu$, $T\mu \ll \mu$; then […] where $L$ is the Radon-Nikodym derivative. Hence, comparing (5.7) with (5.3) and recalling that $T$ is invertible, we have […] where $\Lambda$ is as defined by (5.3). Let $L$ be the Radon-Nikodym derivative of $\nu$ with respect to $\mu$, assume that $L \log L \in L^1(\mu)$, and set $L(x) \log L(x) = 0$ whenever $L(x) = 0$. The relative entropy $H(\nu|\mu)$ of $\nu$ with respect to $\mu$ is defined as

$$H(\nu|\mu) = \int L \log L\, d\mu .$$

Proposition 5.3. Let $T$ be a $C^1$, invertible, triangular and increasing map on $\mathbb{R}^n$, let $\mu$ be the standard Gaussian measure on $\mathbb{R}^n$, and assume that $T\mu \ll \mu$ and that $L = dT\mu/d\mu$ satisfies $L \log L \in L^1(\mu)$. Then, setting $T(x) = x + u(x)$, it holds that […]

Proof. By (5.8) […] and by (5.7) and then (5.5) […] Therefore we get the simple form for $H(\nu|\mu)$: […] where the last line follows since $\det_2(I + \nabla u) \in [0, 1]$ almost surely, as explained below: $Tx = x + u(x)$ is triangular and increasing, therefore $\nabla u$ is a triangular matrix whose diagonal elements are all non-negative. Hence, setting $\partial u_i(x)/\partial x_i = \alpha_i \ge 0$, yields

$$\det{}_2(I + \nabla u) = \prod_{i=1}^n (1 + \alpha_i)\, e^{-\alpha_i} .$$

Since for $\alpha \ge 0$, $0 \le (1 + \alpha) e^{-\alpha} \le 1$, it follows that $0 \le \det_2(I + \nabla u) \le 1$, and (5.9) follows.
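The bound $0 \le \det_2(I + \nabla u) \le 1$ for triangular matrices with non-negative diagonal can be checked on a small example. The sketch below is illustrative only: the matrix `grad_u` is a hypothetical lower triangular $\nabla u$, the helper names are not from the source, and the determinant is computed by a naive Laplace expansion that is adequate only for small $n$. It uses the standard definition $\det_2(I + A) = \det(I + A)\, e^{-\operatorname{trace} A}$.

```python
import math

def det(M):
    """Determinant by Laplace expansion along the first row (small n only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def det2(A):
    """Carleman-Fredholm determinant det_2(I + A) = det(I + A) * exp(-trace A)."""
    n = len(A)
    i_plus_a = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    trace = sum(A[i][i] for i in range(n))
    return det(i_plus_a) * math.exp(-trace)

# Hypothetical lower triangular gradient with non-negative diagonal entries.
grad_u = [[0.5, 0.0, 0.0],
          [2.0, 1.0, 0.0],
          [-3.0, 4.0, 0.0]]

value = det2(grad_u)

# Product formula over the diagonal entries alpha_i.
product = 1.0
for i in range(len(grad_u)):
    alpha = grad_u[i][i]
    product *= (1.0 + alpha) * math.exp(-alpha)
```

The off-diagonal entries play no role: `value` coincides with the diagonal product, here $3e^{-3/2}$, and lies in $(0, 1]$.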

The realization by H-valued shifts
Theorem 5.4. Let $(W, H, \mu)$ be an abstract Wiener space, and define a probability measure $\nu$ on $(W, \mathcal{B}(W))$ as $d\nu = L\, d\mu$, where $L \ge 0$ with $E[L] = 1$. Then there exists a measurable $T$ on $W$ such that $T\mu = \nu$ and $\mu$-almost surely $Tw - w \in H$, i.e. $T$ is a perturbation of identity.
Proof. Assume first that $L \log L \in L^1(\mu)$. Let $(e_i, i \ge 1)$ be as in subsection 5.2, and let $\mu_n$ and $\nu_n$ be the images of the probability measures $\mu$ and $\nu$ induced on $\mathbb{R}^n$ by the map $w \mapsto \pi_n(w) = (\langle e_1, w\rangle, \ldots, \langle e_n, w\rangle)$. Then $\nu_n \ll \mu_n$; set $l_n = d\nu_n/d\mu_n$, then $l_n \circ \pi_n = L_n$ is the conditional expectation of $L$ conditioned on the $\sigma$-field generated by $w \mapsto \pi_n(w)$. Since $s \mapsto s \log s$ is a convex function on $\mathbb{R}_+$, it follows by Jensen's inequality that

$$E[L_n \log L_n] \le E[L \log L] .$$

Let $t_n$ be the triangular map such that $t_n \mu_n = \nu_n$ and define $T_n = t_n \circ \pi_n$; then, from the construction of subsection 5.2, $\lim_n T_n = T$ almost surely. Setting $u_n(w) = T_n(w) - w$, we have from Proposition 5.3 that for all $n$ […] Define $t_m$ : […] Transform now $\nu(E_m)\mu$ into $\nu_m$, as in the first part of this subsection, with a triangular transformation, say $U_m$, and define […] Define finally $T$ to be $T_m$ on the set $D_m \times W'$. Note that $T_m(w) - w$ is a.s. $H$-valued. For $m_1 \ne m_2$, the domains of $T_{m_1}$ and $T_{m_2}$ are disjoint, and so are their corresponding ranges, since they are contained in the supports of $\nu_{m_1}$ and $\nu_{m_2}$ respectively. Consequently $(T_m, m \ge 1)$ yields a triangular transformation transforming $\mu$ to $\nu$ and such that $Tw - w$ is a.s. in the Cameron-Martin space.

The realization by gradient shifts
Let $W$ be a separable Fréchet space with its Borel sigma algebra $\mathcal{B}(W)$ and assume that there is a separable Hilbert space $H$ which is injected densely and continuously into $W$; thus the topology of $H$ is, in general, stronger than the topology induced by $W$. The cost function $c : W \times W \to \mathbb{R}_+ \cup \{\infty\}$ is defined as $c(x, y) = |x - y|_H^2$; we suppose that $c(x, y) = \infty$ if $x - y$ does not belong to $H$. Clearly, this choice of the function $c$ is not arbitrary; in fact it is closely related to Ito calculus, hence also to problems originating from physics, quantum chemistry, large deviations, etc. Since for all the interesting measures on $W$ the Cameron-Martin space is a negligible set, the cost function will be infinite very frequently. Let $\Sigma(\rho, \nu)$ denote the set of probability measures on $W \times W$ with given marginals $\rho$ and $\nu$. It is a convex, compact set under the weak topology $\sigma(\Sigma, C_b(W \times W))$. The problem of Monge consists of finding a measurable map $T : W \to W$, called the optimal transport of $\rho$ to $\nu$, i.e., $T\rho = \nu$, which minimizes the total cost among all the maps $U : W \to W$ such that $U\rho = \nu$. On the other hand, the Monge-Kantorovitch problem consists of finding a measure on $W \times W$ which minimizes the function $\theta \mapsto J(\theta)$, defined by

$$J(\theta) = \int_{W \times W} |x - y|_H^2\, d\theta(x, y),$$

where $\theta$ runs in $\Sigma(\rho, \nu)$. Note that $\inf\{J(\theta) : \theta \in \Sigma(\rho, \nu)\}$ is the square of the Wasserstein metric $d_H(\rho, \nu)$ with respect to the Cameron-Martin space $H$. Any solution $\gamma$ of the Monge-Kantorovitch problem will give a solution to the Monge problem provided that its support is included in the graph of a map.

Let us recall some notions of convexity on the Wiener space (cf. [8,23]). Let $K$ be a Hilbert space; a subset $S$ of $K \times K$ is called cyclically monotone if any finite subset $\{(x_1, y_1), \ldots, (x_N, y_N)\}$ of $S$ satisfies the following algebraic condition:

$$\sum_{i=1}^N \langle y_i, x_{i+1} - x_i\rangle \le 0, \qquad x_{N+1} = x_1,$$

where $\langle\cdot, \cdot\rangle$ denotes the inner product of $K$. It turns out that $S$ is cyclically monotone if and only if

$$\sum_{i=1}^N \langle y_i, x_{\sigma(i)} - x_i\rangle \le 0$$

for any permutation $\sigma$ of $\{1, \ldots, N\}$ and for any finite subset $\{(x_i, y_i) : i = 1, \ldots, N\}$ of $S$. Note that $S$ is cyclically monotone if and only if any translate of it is cyclically monotone. By a theorem of Rockafellar, any cyclically monotone set is contained in the graph of the subdifferential of a convex function in the sense of convex analysis ([19]), and even if the function may not be unique its subdifferential is unique. A measurable function $f$ defined on $(W, H, \mu)$ with values in $\mathbb{R} \cup \{\infty\}$ is called 1-convex if the map

$$h \mapsto f(x + h) + \tfrac{1}{2}|h|_H^2$$

is convex on the Cameron-Martin space $H$ with values in $L^0(\mu)$. Note that this notion is compatible with the $\mu$-equivalence classes of random variables, thanks to the Cameron-Martin theorem. It is proven in [8] that this definition is equivalent to the following condition: Let $(\pi_n, n \ge 1)$ be a sequence of regular, finite dimensional, orthogonal projections of $H$, increasing to the identity map $I_H$. Denote also by $\pi_n$ its continuous extension to $W$ and define $\pi_n^\perp = I_W - \pi_n$. For $x \in W$, let $x_n = \pi_n x$ and $x_n^\perp = \pi_n^\perp x$. Then $f$ is 1-convex if and only if

$$x_n \mapsto \tfrac{1}{2}|x_n|_H^2 + f(x_n + x_n^\perp)$$

is $\pi_n^\perp\mu$-almost surely convex.

Definition 6.1. Let $\xi$ and $\eta$ be two probabilities on $(W, \mathcal{B}(W))$. We say that a probability $\gamma$ on $(W \times W, \mathcal{B}(W \times W))$ is a solution of the Monge-Kantorovitch problem associated to the couple $(\xi, \eta)$ if the first marginal of $\gamma$ is $\xi$, the second one is $\eta$ and if

$$J(\gamma) = \int_{W \times W} |x - y|_H^2\, d\gamma(x, y) = \inf\Big\{\int_{W \times W} |x - y|_H^2\, d\beta(x, y) : \beta \in \Sigma(\xi, \eta)\Big\},$$

where $\Sigma(\xi, \eta)$ denotes the set of all the probability measures on $W \times W$ whose first and second marginals are respectively $\xi$ and $\eta$. We shall denote the Wasserstein distance between $\xi$ and $\eta$, which is the positive square root of this infimum, by $d_H(\xi, \eta)$.
Remark: By the weak compactness of probability measures on W × W and the lower semi-continuity of the strictly convex cost function, the infimum in the definition is attained even if the functional J is identically infinity.
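The algebraic condition defining cyclic monotonicity, in its permutation form $\sum_i \langle y_i, x_{\sigma(i)} - x_i\rangle \le 0$, can be tested directly on a finite set of points. The sketch below is a toy illustration in $K = \mathbb{R}^2$ with hypothetical maps and points: the gradient of the convex function $\phi(x) = x_1^2 + x_1 x_2 + \tfrac12 x_2^2$ passes the test, while a rotation by 90 degrees fails it. Checking one finite set only gives a necessary condition for the whole graph to be cyclically monotone.

```python
from itertools import permutations

def dot(a, b):
    """Inner product of R^2."""
    return a[0] * b[0] + a[1] * b[1]

def perm_sum(points, T, sigma):
    """Compute sum_i <T(x_i), x_{sigma(i)} - x_i> for the pairs (x_i, T(x_i))."""
    total = 0.0
    for i, x in enumerate(points):
        target = points[sigma[i]]
        total += dot(T(x), (target[0] - x[0], target[1] - x[1]))
    return total

def cyclically_monotone_on(points, T, tol=1e-12):
    """Check the permutation form of the condition on this finite set only."""
    indices = range(len(points))
    return all(perm_sum(points, T, s) <= tol for s in permutations(indices))

def grad_phi(x):
    # Gradient of the convex function phi(x) = x1^2 + x1*x2 + x2^2 / 2.
    return (2.0 * x[0] + x[1], x[0] + x[1])

def rotation(x):
    # Rotation by 90 degrees; not the gradient of any convex function.
    return (-x[1], x[0])

points = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
cycle = (1, 2, 3, 0)  # sigma(i) = i + 1 modulo N, i.e. the cyclic sequence

sum_convex = perm_sum(points, grad_phi, cycle)
sum_rotation = perm_sum(points, rotation, cycle)
```

For the cyclic sequence around the unit circle the convex gradient gives a negative sum and the rotation a positive one, in line with Rockafellar's theorem quoted above.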
The following result, whose proof is outlined below (cf. also [9,10]), is an extension of an inequality due to Talagrand [20], and it gives a sufficient condition for the Wasserstein distance to be finite: […]

Proof. Let us remark first that we can take $W$ as the classical Wiener space $W = C_0([0, 1])$ and, using the stopping techniques of martingale theory, we may assume that $L$ is upper and lower bounded almost surely. Then a classical result of the Ito calculus implies that $L$ can be represented as an exponential martingale […] motion under $\nu$, hence $\beta = (U \times I)\nu \in \Sigma(\mu, \nu)$. Let $\gamma$ be any optimal measure; then […] where the last equality follows also from the Girsanov theorem and the Ito stochastic calculus.
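The classical inequality of Talagrand referred to above reads, for the standard Gaussian measure $\mu$ on $\mathbb{R}$ and the quadratic cost, $d^2(\nu, \mu) \le 2 H(\nu|\mu)$. A minimal numerical sketch follows, restricted (as an assumption made for the illustration) to Gaussian targets $\nu = N(m, s^2)$, for which both sides have well-known closed forms: the squared Wasserstein distance is $m^2 + (s-1)^2$ and the relative entropy is $\tfrac12(s^2 + m^2 - 1) - \log s$. The function names and the grid of parameters are hypothetical.

```python
import math

def w2_squared(m, s):
    """Squared quadratic Wasserstein distance between N(0, 1) and N(m, s^2)."""
    return m * m + (s - 1.0) ** 2

def relative_entropy(m, s):
    """Relative entropy H(nu | mu) for nu = N(m, s^2) and mu = N(0, 1)."""
    return 0.5 * (s * s + m * m - 1.0) - math.log(s)

cases = [(m, s) for m in (-2.0, 0.0, 0.5, 3.0) for s in (0.2, 1.0, 1.7, 4.0)]

# Gap 2 H(nu | mu) - d^2(nu, mu); it reduces to 2 (s - 1 - log s) >= 0.
gaps = [2.0 * relative_entropy(m, s) - w2_squared(m, s) for m, s in cases]
```

The gap does not depend on $m$ and vanishes exactly when $s = 1$, so pure shifts of the Gaussian measure are the equality case in this family.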
Combining Theorem 6.1 with the triangle inequality for the Wasserstein distance gives:

Corollary 6.2. Assume that $\nu_i$ ($i = 1, 2$) have Radon-Nikodym densities $L_i$ ($i = 1, 2$) with respect to the Wiener measure $\mu$ which satisfy […]

Let us give a simple application of the above result: […] Then we have […] in other words […] Similarly, if $A$ and $B$ are $H$-separated, i.e., if $A_\varepsilon \cap B = \emptyset$ for some $\varepsilon > 0$, where […] and consequently […]

Remark 6.1. We already know, from the 0-1 law, that $q_A$ is almost surely finite; besides, it satisfies $|q_A(x + h) - q_A(x)| \le |h|_H$, hence the hypotheses of Lemma 4.1 are satisfied. Consequently $E[\exp \lambda q_A^2] < \infty$ for any $\lambda < 1/2$ (cf. Appendix B.8 of [23] and [22]). In fact all these assertions can also be proved with the technique used below.

Proof. Let $\nu_A$ be the measure defined by […] Let $\gamma_A$ be the solution of the Monge-Kantorovitch problem; it is easy to see that the support of $\gamma_A$ is included in $W \times A$, hence $|x - y|_H \ge \inf\{|x - z|_H : z \in A\} = q_A(x)$, $\gamma_A$-almost surely. This implies in particular that $q_A$ is almost surely finite. It follows now from the inequality (6.2) […] hence the proof of the first inequality follows. For the second, let $B = A_\varepsilon^c$ and let $\gamma_{AB}$ be the solution of the Monge-Kantorovitch problem corresponding to $\nu_A$, $\nu_B$. Then we have from Corollary 6.2, […] Besides, the support of the measure $\gamma_{AB}$ is in $A \times B$, hence $\gamma_{AB}$-almost surely $|x - y|_H \ge \varepsilon$, and the proof follows.
We call optimal every probability measure$^4$ $\gamma$ on $W \times W$ such that $J(\gamma) < \infty$ and that $J(\gamma) \le J(\theta)$ for every other probability $\theta$ having the same marginals as those of $\gamma$. We recall that a finite dimensional subspace $F$ of $W$ is called regular if the corresponding projection is continuous. Similarly, a finite dimensional projection of $H$ is called regular if it has a continuous extension to $W$.
The proof of the next theorem, for which we refer the reader to [10], can be done by choosing a proper disintegration of any optimal measure in such a way that the elements of this disintegration are the solutions of finite dimensional Monge-Kantorovitch problems. The latter is proven with the help of the section-selection theorem [4,5].
The notion of spread measure is key to all the results about measure transportation:

Definition 6.2. A probability measure $m$ on $(W, \mathcal{B}(W))$ is called a spread measure if there exists a sequence of finite dimensional regular projections $(\pi_n, n \ge 1)$ converging to $I_H$ such that the regular conditional probabilities $m(\,\cdot\,|\pi_n^\perp = x_n^\perp)$ (where $\pi_n^\perp = I_W - \pi_n$), which are concentrated in the $n$-dimensional spaces $\pi_n(W) + x_n^\perp$, vanish on the sets of Hausdorff dimension $n - 1$ for $\pi_n^\perp(m)$-almost all $x_n^\perp$ and for any $n \ge 1$.

Remark 6.2. Clearly any measure absolutely continuous with respect to $\mu$ is spread.

Theorem 6.4 (General Monge-Kantorovitch transportation). Suppose that $\rho$ and $\nu$ are two probability measures on $W$ such that $d_H(\rho, \nu) < \infty$ and that $\rho$ is spread. Then there exists a unique solution of the Monge-Kantorovitch problem, denoted by $\gamma \in \Sigma(\rho, \nu)$, and $\gamma$ is supported by the graph of a Borel map $T$ which is the solution of the Monge problem. $T : W \to W$ is of the form $T = I_W + \xi$, where $\xi \in H$ almost surely. Besides we have […] and, with the notations of Definition 6.2, for $\pi_n^\perp\rho$-almost all $x_n^\perp$, the map $u \mapsto \xi(u + x_n^\perp)$ is cyclically monotone on $(\pi_n^\perp)^{-1}\{x_n^\perp\}$, in the sense that […] $\pi_n^\perp\rho$-almost surely, for any cyclic sequence $\{u_1, \ldots, u_N, u_{N+1} = u_1\}$ from $\pi_n(W)$. Finally, if $\nu$ is also a spread measure then $T$ is invertible, i.e., there exists $S : W \to W$ of the form $S = I_W + \eta$ such that $\eta \in H$ satisfies a cyclic monotonicity property similar to that of $\xi$ and that […] In particular we have […]

The case where one of the measures is the Wiener measure and the other is absolutely continuous with respect to $\mu$ is the most important one for the applications. Consequently we give the related results separately in the following theorem, where the tools of the Malliavin calculus give more information about the maps $\xi$ and $\eta$ of Theorem 6.4:

Theorem 6.5 (Gaussian case). Let $\nu$ be the measure $d\nu = L\, d\mu$, where $L$ is a positive random variable with $E[L] = 1$.
Assume that d H (µ, ν) < ∞ (for instance L ∈ IL log IL). Then there exists a 1-convex function φ ∈ D 2,1 , unique up to a constant, such that the map T = I W + ∇φ is the unique solution of the