GLUING LEMMAS AND SKOROHOD REPRESENTATIONS

Let (X, E), (Y, F) and (Z, G) be measurable spaces. Suppose we are given two probability measures γ and τ, with γ defined on (X × Y, E ⊗ F) and τ on (X × Z, E ⊗ G). Conditions for the existence of random variables X, Y, Z, defined on the same probability space (Ω, A, P) and satisfying (X, Y) ∼ γ and (X, Z) ∼ τ, are given. The probability P may be finitely additive or σ-additive. As an application, a version of the Skorohod representation theorem is proved. Such a version does not require separability of the limit probability law, and answers (in a finitely additive setting) a question raised in [2] and [4].


Introduction and motivations
This paper is split into two parts. The first focuses on gluing lemmas, while the second deals with the Skorohod representation theorem. The second part is the natural continuation of some previous papers (see [1]-[4]) and the main reason for investigating gluing lemmas.
In the sequel, a gluing lemma is meant as follows. Let (X, E), (Y, F) and (Z, G) be measurable spaces. Suppose we are given two probability measures γ and τ, with γ defined on (X × Y, E ⊗ F) and τ on (X × Z, E ⊗ G). A gluing lemma gives conditions for the existence of three random variables X, Y, Z, defined on the same probability space, satisfying (X, Y) ∼ γ and (X, Z) ∼ τ.
It can be assumed that X, Y, Z are the coordinate projections X(x, y, z) = x, Y(x, y, z) = y, Z(x, y, z) = z, where (x, y, z) ∈ X × Y × Z. Under this convention, the question reduces to whether there is a probability measure P on the product σ-field E ⊗ F ⊗ G such that

(1)   P((X, Y) ∈ A) = γ(A) and P((X, Z) ∈ B) = τ(B)

whenever A ∈ E ⊗ F and B ∈ E ⊗ G.
Gluing lemmas occur in various frameworks, mainly in connection with optimal transport, coupling and related topics; see e.g. [19]. Another application of gluing lemmas, as discussed below, concerns the Skorohod representation theorem.
An obvious necessary condition for (1) is

(2)   γ(A × Y) = τ(A × Z) for all A ∈ E,

namely, γ and τ have the same marginal on E. In this paper, it is shown that condition (2) is not enough for (1), even if (X, E) = (Y, F) = (Z, G) with X a separable metric space and E the Borel σ-field. However, condition (2) suffices for (1) under some extra assumption. For instance, (2) implies (1) if one between γ and τ is disintegrable, or else if all one-dimensional marginals of γ and τ are perfect. See Example 1, Lemma 4 and Corollary 5.
In dealing with gluing lemmas, one naturally comes across finitely additive probabilities. We substantiate this claim with two results; see Lemma 2. Suppose the probability P involved in condition (1) is only requested to be finitely additive. Then, (1) admits a simple characterization. Indeed, (1) holds if and only if

γ_*(A × Y) ≤ τ^*(A × Z) for all A ⊂ X,

where γ_* and τ^* are the inner and outer measures induced by γ and τ. Next, suppose X and Z are topological spaces (equipped with the Borel σ-fields E and G). Then, (2) suffices for (1) provided B is restricted to be a continuity set for τ, in the sense that B ∈ E ⊗ G and τ^*(∂B) = 0.
We next turn to the Skorohod representation theorem (SRT). In addition to [1]-[4], related references are [10], [14], [16], [17].
Let S be a metric space, B the Borel σ-field on S, and (µ_n : n ≥ 0) a sequence of probability measures on B. Recall that a law µ on B is separable if µ(A) = 1 for some separable set A ∈ B. According to SRT, if µ_n → µ_0 weakly and µ_0 is separable, on some probability space there are S-valued random variables (X_n : n ≥ 0) such that X_n ∼ µ_n for all n ≥ 0 and X_n → X_0 almost uniformly. See [8], [15] and [20]; see also [9, page 130] and [18, page 77] for historical notes.
In [2] and [4], the separability assumption on µ_0 is investigated. Suppose

σ(d) ⊂ B ⊗ B,

where d is the distance on S. Also, say that the sequence (µ_n) admits a Skorohod representation if:

On some probability space, there are S-valued random variables X_n such that X_n ∼ µ_n for all n ≥ 0 and X_n → X_0 in probability.
If non-separable laws on B actually exist, then:
(i) it may be that µ_n → µ_0 weakly and yet (µ_n) fails to admit a Skorohod representation;
(ii) even when (µ_n) admits a Skorohod representation, the convergence X_n → X_0 in probability cannot, in general, be strengthened to almost uniform convergence;
(iii) positive results are available when weak convergence is replaced by convergence in a suitable distance.
Hence, separability of µ_0 cannot be dropped from SRT (by (i)) and almost uniform convergence is too much (by (ii)). On the other hand, because of (iii), a possible conjecture is:

(µ_n) admits a Skorohod representation if and only if lim_n ρ(µ_n, µ_0) = 0,

where ρ is some discrepancy measure between probability laws. If true for a reasonable ρ, such a conjecture would be a (nice) version of SRT not requesting separability of µ_0.
Two popular choices of ρ are ρ = L and ρ = W, where L is the bounded Lipschitz metric and W the Wasserstein distance. The definition of L is recalled in Subsection 3.1. As to W, if µ and ν are any probability measures on B, then

W(µ, ν) = inf_γ ∫ (1 ∧ d) dγ,

where inf is over those probability measures γ on B ⊗ B with marginals µ and ν.
It is not hard to prove that lim_n W(µ_n, µ_0) = 0 if (µ_n) has a Skorohod representation. Thus, ρ = W is an admissible choice. Also, since L ≤ 2W, if the conjecture works with ρ = L then it works with ρ = W as well. Accordingly, we let ρ = W.
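To make W concrete, here is a small numeric sketch (the function name is ours, and we assume the bounded cost 1 ∧ d together with uniform marginals on equally many points). For such marginals, the extreme points of the set of couplings are permutation couplings (Birkhoff-von Neumann), so the infimum can be computed by enumeration.

```python
from itertools import permutations

def wasserstein_uniform(xs, ys, d):
    """W(mu, nu) for uniform laws on xs and ys (equal length), cost 1 ∧ d.

    For uniform discrete marginals, the transportation polytope has the
    permutation matrices as extreme points, so the infimum over couplings
    is attained at a permutation coupling and can be found by enumeration.
    """
    n = len(xs)
    assert len(ys) == n
    return min(
        sum(min(1.0, d(xs[i], ys[p[i]])) for i in range(n)) / n
        for p in permutations(range(n))
    )

# Uniform laws on {0.0, 0.5} and {0.2, 0.9}, with d(x, y) = |x - y|:
w = wasserstein_uniform([0.0, 0.5], [0.2, 0.9], lambda x, y: abs(x - y))
# the optimal coupling matches 0.0 with 0.2 and 0.5 with 0.9: (0.2 + 0.4)/2 = 0.3
```

Enumeration is exponential in the number of points; it is meant only to illustrate the definition, not as a practical solver.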
Suppose lim_n W(µ_n, µ_0) = 0. By definition, there is a sequence (γ_n : n ≥ 1) of probability measures on B ⊗ B such that each γ_n has marginals µ_0 and µ_n and

lim_n γ_n{(x, y) : d(x, y) > ε} = 0 for all ε > 0.
Thus, one automatically obtains a Skorohod representation for (µ_n) if, on some probability space, there are S-valued random variables (X_n : n ≥ 0) such that

(3)   (X_0, X_n) ∼ γ_n for each n ≥ 1.

This is exactly the point where gluing lemmas come into play. Roughly speaking, they serve to glue the γ_n together in order to get condition (3). Unfortunately, Example 1 precludes obtaining (3) for an arbitrary sequence (γ_n) such that each γ_n has marginals µ_0 and µ_n. However, something can be said. Our main result is that lim_n W(µ_n, µ_0) = 0 if and only if, on a finitely additive probability space (Ω, A, P), there are S-valued random variables X_n such that

X_n → X_0 in probability,   P(X_0 ∈ A) = µ_0(A) for all A ∈ B,   and
P(X_n ∈ A) = µ_n(A) whenever A ∈ B and µ_n(∂A) = 0.

To sum up, in a finitely additive setting, the above conjecture is true with ρ = W provided X_n ∼ µ_n is meant as P(X_n ∈ A) = µ_n(A) if A ∈ B and µ_n(∂A) = 0, or equivalently as E_P[f(X_n)] = ∫ f dµ_n for each bounded continuous f : S → R with σ(f) ⊂ B. We refer to Theorem 8 for details.

Gluing lemmas
In the sequel, the abbreviation "f.a.p." stands for finitely additive probability. A σ-additive f.a.p. is referred to as a probability measure.
Let (X, E), (Y, F), (Z, G) be (arbitrary) measurable spaces, γ a f.a.p. on E ⊗ F and τ a f.a.p. on E ⊗ G. Recall that, if Q is a f.a.p. on a field U on some set Ω, the outer and inner measures are defined by

Q^*(A) = inf{Q(B) : B ∈ U, B ⊃ A} and Q_*(A) = sup{Q(B) : B ∈ U, B ⊂ A} for all A ⊂ Ω.

We begin with an example where condition (2) holds while condition (1) fails for any f.a.p. P, even though γ and τ are probability measures and (X, E) = (Y, F) = (Z, G) with X a separable metric space and E the Borel σ-field.
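On a finite Ω these definitions can be evaluated by direct enumeration over the field; a toy illustration (names ours, not from the paper):

```python
def outer(Q, field, A):
    # Q*(A) = inf{ Q(B) : B in field, B ⊇ A }
    return min(Q[B] for B in field if A <= B)

def inner(Q, field, A):
    # Q_*(A) = sup{ Q(B) : B in field, B ⊆ A }
    return max(Q[B] for B in field if B <= A)

# The field {∅, {0,1}, {2,3}, Ω} on Ω = {0,1,2,3}, with Q uniform:
field = [frozenset(), frozenset({0, 1}), frozenset({2, 3}), frozenset({0, 1, 2, 3})]
Q = {field[0]: 0.0, field[1]: 0.5, field[2]: 0.5, field[3]: 1.0}

A = frozenset({1, 2})                          # not in the field
o, i = outer(Q, field, A), inner(Q, field, A)  # only Ω contains A, only ∅ fits inside
```

Here o = 1.0 and i = 0.0: the gap between inner and outer measure is exactly what non-measurable sets like the set I of Example 1 exploit.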
Example 1. Let m be the Lebesgue measure on the Borel σ-field on [0, 1]. Take I ⊂ [0, 1] such that m^*(I) = 1 and m_*(I) = 0, and define J = [0, 1] \ I. Then, X is a separable metric space under the given distance, and condition (2) holds. However, condition (1) fails for any f.a.p. P. Define in fact h(x, r) = x for all (x, r) ∈ S. If (1) holds for some f.a.p. P, one obtains a contradiction.

Because of Example 1, some condition for (1), in addition to (2), looks potentially useful. The next lemma is actually fundamental for Theorem 8.
Lemma 2. Let γ be a f.a.p. on E ⊗ F and τ a f.a.p. on E ⊗ G. There is a f.a.p. P on E ⊗ F ⊗ G satisfying condition (1) if and only if

γ_*(A × Y) ≤ τ^*(A × Z) for all A ⊂ X.

Moreover, if condition (2) holds and X and Z are topological spaces (equipped with the Borel σ-fields E and G), there is a f.a.p. P on E ⊗ F ⊗ G such that

P((X, Y) ∈ A) = γ(A) and P((X, Z) ∈ B) = τ(B)

for all A ∈ E ⊗ F and all B ∈ E ⊗ G with τ^*(∂B) = 0.

Proof. Suppose that (1) holds for some f.a.p. P. Let Q be a f.a.p. on the power set of X × Y × Z such that Q = P on E ⊗ F ⊗ G. By definition of inner and outer measure, and since Q extends P, it follows that

γ_*(A × Y) ≤ Q(A × Y × Z) ≤ τ^*(A × Z) for all A ⊂ X.

Conversely, suppose γ_*(A × Y) ≤ τ^*(A × Z) for all A ⊂ X. We need the following result by Bhaskara Rao and Bhaskara Rao [5, Theorem 3.6.1].
(BR) For j = 1, 2, let U_j be a field on a set Ω and P_j a f.a.p. on U_j. There is a f.a.p. P on the power set of Ω such that P = P_1 on U_1 and P = P_2 on U_2 if and only if P_1(A) ≤ P_2(B) whenever A ∈ U_1, B ∈ U_2 and A ⊂ B.

Let Ω = X × Y × Z,

U_1 = {H × Z : H ∈ E ⊗ F} and U_2 = {{(x, y, z) : (x, z) ∈ K} : K ∈ E ⊗ G},

with P_1(H × Z) = γ(H) and P_2({(x, y, z) : (x, z) ∈ K}) = τ(K). Take A = H × Z ∈ U_1 and B = {(x, y, z) : (x, z) ∈ K} ∈ U_2 with A ⊂ B, and define A_0 to be the projection of A onto X. Then H ⊂ A_0 × Y, so that P_1(A) = γ(H) ≤ γ_*(A_0 × Y) ≤ τ^*(A_0 × Z). Since A_0 × Z ⊂ K, then τ^*(A_0 × Z) ≤ τ(K) = P_2(B). Therefore, in view of (BR), condition (1) holds for some f.a.p. P. Finally, suppose condition (2) holds and X and Z are topological spaces equipped with the Borel σ-fields E and G. Define the field

U = {B ∈ E ⊗ G : τ^*(∂B) = 0}.

An application of (BR) concludes the proof.
Remark 3. Other statements, similar to Lemma 2, can be proved by the same argument. As an example, under condition (2), there is a f.a.p. Q on the power set of X × Y × Z such that Q((X, Y) ∈ A) = γ(A) and Q(X ∈ B, Z ∈ C) = τ(B × C) for all A ∈ E ⊗ F, B ∈ E and C ∈ G.
Usually, γ and τ are probability measures (and not merely f.a.p.'s), and a natural question is whether condition (1) holds under a (σ-additive) probability measure P. To address this issue, we first recall some definitions.
Let µ be a probability measure on (X, E). Say that µ is perfect if, for each E-measurable function f : X → R, there is a Borel set B ⊂ R such that B ⊂ f(X) and µ(f ∈ B) = 1. If X is separable metric and E the Borel σ-field, then µ is perfect if and only if it is tight. In particular, µ is perfect if X is a universally measurable subset of a Polish space (in particular, a Borel subset) and E the Borel σ-field.
Let γ be a probability measure on E ⊗ F. Say that γ is disintegrable if γ admits a regular conditional distribution given the sub-σ-field {A × Y : A ∈ E}. Equivalently, there is a collection {α(x, ·) : x ∈ X} such that:

(i) α(x, ·) is a probability measure on F for each x ∈ X;
(ii) x ↦ α(x, H_x) is E-measurable and γ(H) = ∫ α(x, H_x) µ(dx) for each H ∈ E ⊗ F,

where µ is the marginal of γ on E and H_x = {y ∈ Y : (x, y) ∈ H}. The collection {α(x, ·) : x ∈ X} is said to be a disintegration for γ.
A disintegration can fail to exist. However, for γ to admit a disintegration, it suffices that F is countably generated and the marginal of γ on F is perfect.
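In the discrete case a disintegration always exists and is just the family of conditional laws of the second coordinate given the first. A minimal sketch (names and example measure ours):

```python
from fractions import Fraction as F

# A discrete gamma on {x1, x2} x {y1, y2}, given as a dict of point masses.
gamma = {("x1", "y1"): F(1, 4), ("x1", "y2"): F(1, 4),
         ("x2", "y1"): F(1, 2)}

mu = {}                                   # marginal of gamma on the first factor
for (x, _), p in gamma.items():
    mu[x] = mu.get(x, F(0)) + p

# Disintegration: alpha(x, .) is the conditional law of y given x.
alpha = {x: {y: p / mu[x] for (x2, y), p in gamma.items() if x2 == x} for x in mu}

# Check gamma(H) = sum_x alpha(x, H_x) mu(x) for an arbitrary H:
H = {("x1", "y2"), ("x2", "y1")}
lhs = sum(gamma.get(pt, F(0)) for pt in H)
rhs = sum(alpha[x].get(y, F(0)) * mu[x] for (x, y) in H)
```

Exact rationals are used so that the disintegration identity lhs = rhs holds with equality, not merely up to rounding.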
We refer to [6], [12] and references therein for more on disintegrations (meant in a larger sense). Here, exploiting ideas from [2], we prove that disintegrability works for getting condition (1) under a σ-additive P.

Lemma 4. Let γ be a probability measure on E ⊗ F and τ a probability measure on E ⊗ G. If condition (2) holds and one between γ and τ is disintegrable, then condition (1) holds with P a probability measure on E ⊗ F ⊗ G.
Proof. Suppose first that γ is disintegrable, take a disintegration {α(x, ·) : x ∈ X} for γ, and define

P(H) = ∫ α(x, H_{x,z}) τ(dx, dz) for H ∈ E ⊗ F ⊗ G, where H_{x,z} = {y ∈ Y : (x, y, z) ∈ H}.

Then, P is a probability measure on E ⊗ F ⊗ G and P((X, Z) ∈ B) = τ(B) for all B ∈ E ⊗ G. Because of (2), γ and τ have a common marginal on E, say µ. Fix A ∈ E ⊗ F and take H = {(X, Y) ∈ A}. Since H_{x,z} = {y ∈ Y : (x, y) ∈ A} = A_x for all (x, z) ∈ X × Z, it follows that

P((X, Y) ∈ A) = ∫ α(x, A_x) τ(dx, dz) = ∫ α(x, A_x) µ(dx) = γ(A).

This concludes the proof if γ is disintegrable. If τ is disintegrable, it suffices to take a disintegration {β(x, ·) : x ∈ X} for τ and to let P(H) = ∫ β(x, H_{x,y}) γ(dx, dy), where H_{x,y} = {z ∈ Z : (x, y, z) ∈ H}.
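For discrete laws, Lemma 4's gluing reduces to multiplying the conditional law of Y given X (the disintegration of γ) by τ. A self-contained sketch (names and example measures ours), checking that both marginal constraints of condition (1) hold:

```python
from fractions import Fraction as F

# gamma on X x Y and tau on X x Z with the same marginal on X (condition (2)).
gamma = {("x1", "y1"): F(1, 2), ("x2", "y1"): F(1, 4), ("x2", "y2"): F(1, 4)}
tau   = {("x1", "z1"): F(1, 4), ("x1", "z2"): F(1, 4), ("x2", "z1"): F(1, 2)}

mu = {}                                   # common marginal on X
for (x, _), p in gamma.items():
    mu[x] = mu.get(x, F(0)) + p

# Disintegration of gamma: alpha(x, .) = conditional law of y given x.
alpha = {x: {y: p / mu[x] for (x2, y), p in gamma.items() if x2 == x} for x in mu}

# The gluing of Lemma 4: P(x, y, z) = alpha(x, {y}) * tau(x, z).
P = {(x, y, z): alpha[x][y] * q for (x, z), q in tau.items() for y in alpha[x]}

# The (X, Y)- and (X, Z)-marginals of P reproduce gamma and tau:
P_XY, P_XZ = {}, {}
for (x, y, z), p in P.items():
    P_XY[(x, y)] = P_XY.get((x, y), F(0)) + p
    P_XZ[(x, z)] = P_XZ.get((x, z), F(0)) + p
```

Conditionally on X, the construction makes Y and Z independent; this is the discrete counterpart of integrating α(x, H_{x,z}) against τ.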
A quick consequence of Remark 3 and Lemma 4 is the following.
Corollary 5. Suppose condition (2) holds for the probability measures γ on E ⊗ F and τ on E ⊗ G. Then, condition (1) holds with a σ-additive P if at least one of the following conditions is satisfied:
(j) F is countably generated and the marginal of γ on F is perfect;
(jj) G is countably generated and the marginal of τ on G is perfect;
(jjj) the marginals of γ and τ on E, F and G are all perfect.
Proof. Under (j) or (jj), one between γ and τ is disintegrable, and the conclusion follows from Lemma 4. Suppose (jjj) holds, and let R be the field on X × Y × Z generated by the rectangles A × B × C with A ∈ E, B ∈ F and C ∈ G. By Remark 3, there is a f.a.p. P_0 on R such that

P_0(X ∈ A, Y ∈ B) = γ(A × B) and P_0(X ∈ A, Z ∈ C) = τ(A × C)

whenever A ∈ E, B ∈ F and C ∈ G. (Just take P_0 to be the restriction of Q to R, with Q as in Remark 3.) In view of (jjj), the marginals of P_0 on E, F and G are all perfect. Hence, P_0 is σ-additive by a result of Sazonov [13, Theorem 6]. Thus, it suffices to take P to be the σ-additive extension of P_0 to σ(R) = E ⊗ F ⊗ G.
We close this section with a last result related to disintegrability and perfectness. Assume condition (2) and let µ denote the (common) marginal of γ and τ on E. If

(4)   γ^*(A × Y) = µ^*(A) for all A ⊂ X,

then

γ_*(A × Y) = µ_*(A) ≤ τ_*(A × Z) ≤ τ^*(A × Z) for all A ⊂ X,

where the first inequality depends on the definition of inner measure while the second is trivial. By Lemma 2, condition (1) holds for some f.a.p. P. Similarly, condition (1) holds for some f.a.p. P whenever τ^*(A × Z) = µ^*(A) for all A ⊂ X. Thus, it may be useful to have conditions for (4).

Lemma 6. Let γ be a probability measure on E ⊗ F with marginals µ and ν on E and F, respectively. Condition (4) holds provided, for each H ∈ E ⊗ F, there are sub-σ-fields E_0 ⊂ E and F_0 ⊂ F such that H ∈ E_0 ⊗ F_0 and γ is disintegrable on E_0 ⊗ F_0. In particular, condition (4) holds if ν is perfect.
Proof. It suffices to prove that µ^*(A) ≤ γ^*(A × Y) (the opposite inequality follows from the definition of outer measure). Fix A ⊂ X and take H ∈ E ⊗ F such that H ⊃ A × Y and γ(H) = γ^*(A × Y). Take sub-σ-fields E_0 ⊂ E and F_0 ⊂ F such that H ∈ E_0 ⊗ F_0 and γ is disintegrable on E_0 ⊗ F_0, say with disintegration {α(x, ·) : x ∈ X}. If x ∈ A, then H_x = Y, so that A ⊂ B := {x : α(x, H_x) = 1} ∈ E_0. Hence,

µ^*(A) ≤ µ(B) ≤ ∫ α(x, H_x) µ(dx) = γ(H) = γ^*(A × Y).

Finally, suppose ν is perfect. Then, it suffices to note that each H ∈ E ⊗ F actually belongs to E ⊗ F_0, for some countably generated sub-σ-field F_0 ⊂ F, and γ is disintegrable on E ⊗ F_0 for ν is perfect and F_0 countably generated.

3.1. A Wasserstein-type "distance". In this Section, (S, d) is a metric space, B the Borel σ-field on S and P the set of probability measures on B. For each n ≥ 1, B^n = B ⊗ ... ⊗ B denotes the product σ-field on S^n = S × ... × S. Similarly, B^∞ is the product σ-field on S^∞, where S^∞ is the set of sequences ω = (ω_0, ω_1, ...) with ω_n ∈ S for each n ≥ 0. Also, for µ, ν ∈ P, we let F(µ, ν) be the collection of those probability measures γ on B^2 such that γ(A × S) = µ(A) and γ(S × A) = ν(A) for each A ∈ B.
If (S, d) is not separable, B^2 may be strictly smaller than the Borel σ-field on S^2, and this could be a problem for defining a Wasserstein-type "distance". Accordingly, we assume

(5)   σ(d) ⊂ B^2,

that is, the function d : S^2 → [0, ∞) is measurable with respect to B^2. Condition (5) is trivially true if (S, d) is separable, as well as in various non-separable situations. For instance, (5) holds if d is the uniform distance on some space S of cadlag functions, or if d is the 0-1 distance and card(S) = card(R). A necessary condition for (5) is that B ⊃ C for some countably generated σ-field C including the singletons. Hence, (5) yields card(S) ≤ card(R).

3.2. A finitely additive Skorohod representation. To state our main result, we let X_n denote the n-th coordinate projection on S^∞, namely X_n(ω) = ω_n for all n ≥ 0 and ω = (ω_0, ω_1, ...) ∈ S^∞.

Theorem 8. Suppose σ(d) ⊂ B^2 and (µ_n : n ≥ 0) is a sequence of probability measures on B. Then, lim_n W(µ_n, µ_0) = 0 if and only if there is a f.a.p. P on B^∞ such that

X_n → X_0 in probability,   P(X_0 ∈ A) = µ_0(A) for all A ∈ B,   and
P(X_n ∈ A) = µ_n(A) whenever A ∈ B and µ_n(∂A) = 0.

Moreover, for each n ≥ 1, one also obtains E_P[f(X_n)] = ∫ f dµ_n for each bounded continuous f : S → R with σ(f) ⊂ B.

Proof. We first recall a known fact. Let f : S^2 → R be a bounded continuous function such that σ(f) ⊂ B^2. Given a f.a.p. γ on B^2, define the field U = {A ∈ B^2 : γ^*(∂A) = 0}. Since ∂{f ≤ t} ⊂ {f = t} ∈ B^2 for all t ∈ R, then {f ≤ t} ∈ U except possibly for countably many values of t. Hence, f_k → f uniformly for some sequence (f_k) of U-simple functions.
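The uniform approximation step can be pictured concretely: for f with values in [0, 1], thresholding at the levels j/k (shifting the levels slightly, if needed, to avoid the countably many bad values of t) gives simple functions f_k built from the level sets {f ≤ j/k} with sup |f − f_k| ≤ 1/k. A toy numeric sketch of this thresholding, with a one-dimensional f for simplicity (ours, not from the paper):

```python
import math

# f bounded continuous with values in [0, 1]; f_k = floor(k f)/k is a simple
# function built from the level sets {f <= j/k}, and sup|f - f_k| <= 1/k.
f = lambda t: 1.0 / (1.0 + t * t)

def simple_approx(f, k):
    # Constant on each set {j/k <= f < (j+1)/k}, hence simple over the level sets.
    return lambda t: math.floor(k * f(t)) / k

k = 100
fk = simple_approx(f, k)
err = max(abs(f(t / 50.0) - fk(t / 50.0)) for t in range(-500, 501))
```

By construction f_k ≤ f < f_k + 1/k pointwise, so the sampled error err never exceeds 1/k.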