A Strong Duality Principle for Equivalence Couplings and Total Variation

We introduce and study a notion of strong duality for two classes of optimization problems commonly occurring in probability theory. That is, on an abstract measurable space $(\Omega,\mathcal{F})$, we say that an equivalence relation $E$ on $\Omega$ satisfies ``strong duality'' if $E$ is $(\mathcal{F}\otimes\mathcal{F})$-measurable and if there exists a sub-$\sigma$-algebra $\mathcal{G}$ of $\mathcal{F}$ such that for all probability measures $\mathbb{P},\mathbb{P}'$ on $(\Omega,\mathcal{F})$ we have \[ \sup_{A\in\mathcal{G}}\vert \mathbb{P}(A)-\mathbb{P}'(A)\vert = \min_{\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')}(1-\tilde{\mathbb{P}}(E)), \] where $\Gamma(\mathbb{P},\mathbb{P}')$ denotes the space of couplings of $\mathbb{P}$ and $\mathbb{P}'$, and where ``min'' asserts that the infimum is in fact achieved. The results herein, which can be seen as a surprising extension of Kantorovich duality to a class of irregular costs, give wide sufficient conditions for strong duality to hold. These conditions allow us to recover many classical results and to prove novel results in stochastic calculus, random sequence simulation, and point process theory.


Introduction
The objects of interest in this paper are two optimization problems commonly occurring in probability theory. To state them, we consider two probability measures $\mathbb{P}$ and $\mathbb{P}'$ on an abstract measurable space $(\Omega,\mathcal{F})$. For an equivalence relation $E$ on $\Omega$ (which, when viewed as a subset of $\Omega\times\Omega$, is $(\mathcal{F}\otimes\mathcal{F})$-measurable) we often aim to solve \[ \text{minimize } 1-\tilde{\mathbb{P}}(E) \text{ over all couplings } \tilde{\mathbb{P}} \text{ of } \mathbb{P},\mathbb{P}', \tag{P} \] which we refer to as the equivalence coupling problem for $E$ or simply the $E$-coupling problem. Alternatively, for a sub-$\sigma$-algebra $\mathcal{G}$ of $\mathcal{F}$, we might aim to solve \[ \text{maximize } \vert\mathbb{P}(A)-\mathbb{P}'(A)\vert \text{ over } A\in\mathcal{G}, \tag{D} \] which we refer to as the total variation problem for $\mathcal{G}$ or simply the $\mathcal{G}$-total variation problem. While one is typically interested in the primal problem (P), our goal in this paper is to show that, in great generality, it is in fact equivalent to the dual problem (D), which in many settings is much simpler to analyze.
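To state this concept more precisely, consider an abstract measurable space $(\Omega,\mathcal{F})$, and write $\mathcal{P}(\Omega,\mathcal{F})$ for the set of probability measures on $(\Omega,\mathcal{F})$. For $\mathbb{P},\mathbb{P}'\in\mathcal{P}(\Omega,\mathcal{F})$, write $\Gamma(\mathbb{P},\mathbb{P}')$ for the space of all couplings of $\mathbb{P},\mathbb{P}'$, that is, all probability measures on $(\Omega\times\Omega,\mathcal{F}\otimes\mathcal{F})$ with marginals given by $\mathbb{P}$ and $\mathbb{P}'$, respectively. Then, we consider pairs $(E,\mathcal{G})$, where $E$ is an equivalence relation on $\Omega$ and where $\mathcal{G}$ is a sub-$\sigma$-algebra of $\mathcal{F}$. We say that $E$ is measurable if $E\in\mathcal{F}\otimes\mathcal{F}$ when viewed as a subset of $\Omega\times\Omega$, and we say that a pair $(E,\mathcal{G})$ satisfies strong duality if $E$ is measurable and if we have \[ \sup_{A\in\mathcal{G}}\vert\mathbb{P}(A)-\mathbb{P}'(A)\vert = \min_{\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')}(1-\tilde{\mathbb{P}}(E)) \tag{1.1} \] for all $\mathbb{P},\mathbb{P}'\in\mathcal{P}(\Omega,\mathcal{F})$; the appearance of "min" implies that there indeed exists a minimizer on the right side. Our goal is to understand which pairs $(E,\mathcal{G})$ satisfy strong duality.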
1.1. Review of Prior Work. There already exist a few particular instances of strong duality which motivate the potential success of such a general approach; the most classical example will certainly be familiar to the reader. It states that, if $\Omega$ is a Polish space with Borel $\sigma$-algebra $\mathcal{B}(\Omega)$, and if $\Delta = \{(\omega,\omega)\in\Omega\times\Omega : \omega\in\Omega\}$ denotes the diagonal in $\Omega\times\Omega$, we have \[ \sup_{A\in\mathcal{B}(\Omega)}\vert\mathbb{P}(A)-\mathbb{P}'(A)\vert = \min_{\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')}(1-\tilde{\mathbb{P}}(\Delta)) \tag{1.2} \] for all Borel probability measures $\mathbb{P},\mathbb{P}'$ on $\Omega$. This result has certainly been known for a long time, at least for countable sets $\Omega$, so its exact source is difficult to track down; nonetheless, its broad usefulness is believed to have been promoted by Doeblin [26].
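For readers who wish to experiment, the classical identity between total variation and couplings concentrated on the diagonal can be checked by hand on a finite set. The following Python sketch is only an illustration: the function names `total_variation` and `maximal_coupling` and the two example laws are our own choices, not part of any library or of the works cited. It builds the usual maximal coupling (shared mass on the diagonal, leftover mass coupled independently) and compares the off-diagonal mass with the total variation distance.

```python
from fractions import Fraction


def total_variation(p, q):
    """Return sup_A |p(A) - q(A)| for two laws on a finite set (dicts point -> mass)."""
    points = set(p) | set(q)
    # The supremum is attained at A = {x : p(x) > q(x)}.
    return sum(max(p.get(x, 0) - q.get(x, 0), 0) for x in points)


def maximal_coupling(p, q):
    """Return a coupling (dict (x, y) -> mass) with as much mass as possible on the diagonal."""
    points = sorted(set(p) | set(q))
    coupling = {}
    extra_p, extra_q = {}, {}
    for x in points:
        shared = min(p.get(x, 0), q.get(x, 0))
        if shared > 0:
            coupling[(x, x)] = shared  # shared mass stays on the diagonal
        extra_p[x] = p.get(x, 0) - shared
        extra_q[x] = q.get(x, 0) - shared
    # The leftover masses have disjoint supports; couple them independently.
    leftover = sum(extra_p.values())
    for x, a in extra_p.items():
        for y, b in extra_q.items():
            if a > 0 and b > 0:
                coupling[(x, y)] = a * b / leftover
    return coupling


# Hypothetical example laws on the three-point set {a, b, c}.
p = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
q = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}
coupling = maximal_coupling(p, q)
off_diagonal = sum(m for (x, y), m in coupling.items() if x != y)
print(total_variation(p, q), off_diagonal)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to floating-point error.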
A second known example concerns the space of binary sequences $\Omega_0 := \{0,1\}^{\mathbb{N}}$ with the Borel $\sigma$-algebra of its product topology, and $E_0$ the equivalence relation of eventual equality. (Actually, $\{0,1\}$ can be replaced with any Polish space.) A coupling of two Borel probability measures $\mathbb{P},\mathbb{P}'$ on $\Omega_0$ is called successful if $E_0$ has probability one, that is, if the two random sequences are eventually equal almost surely. It was shown in a series of works [28,11,36,15], culminating in [14], that the existence of a successful coupling of two Borel probability measures $\mathbb{P},\mathbb{P}'$ on $\Omega_0$ is closely related to $\mathbb{P}$ and $\mathbb{P}'$ assigning the same probability to all elements of the tail $\sigma$-algebra $\mathcal{T}$. More precisely [14, Theorem 2.1], one has \[ \sup_{A\in\mathcal{T}}\vert\mathbb{P}(A)-\mathbb{P}'(A)\vert = 0 \quad\text{if and only if}\quad \min_{\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')}(1-\tilde{\mathbb{P}}(E_0)) = 0, \tag{1.3} \] for all Borel probability measures $\mathbb{P},\mathbb{P}'$ on $\Omega_0$. We of course regard (1.3) as a sort of qualitative version of duality.
A third example, from ergodic theory, shows that eventual equality and the tail $\sigma$-algebra in (1.3) can be replaced with the analogous objects for the notion of shift-invariance. Adopting the notation of the preceding paragraph, write $\mathcal{I}$ for the shift-invariant $\sigma$-algebra and $E_S$ for the equivalence relation of equality modulo shifts. A coupling of two probability measures $\mathbb{P},\mathbb{P}'$ on $\Omega_0$ is called a successful shift-coupling if $E_S$ has probability one, that is, if the random sequences are almost surely equivalent modulo a random shift. Then, it was shown [2] that one has \[ \sup_{A\in\mathcal{I}}\vert\mathbb{P}(A)-\mathbb{P}'(A)\vert = 0 \quad\text{if and only if}\quad \min_{\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')}(1-\tilde{\mathbb{P}}(E_S)) = 0, \tag{1.4} \] for $\mathbb{P},\mathbb{P}'$ any two Borel probability measures on $\Omega_0$. Again, we see that (1.4) is a sort of qualitative duality. Subsequent work also saw many generalizations of (1.4). For example, in [13] it is shown that a similar statement holds for measurable actions of very many groups and semigroups on Polish spaces. We also have [24, Theorem 1] which establishes an analogous statement for unimodular random networks, where the shift operation is replaced by a root-change operation; this example is particularly interesting since the underlying measurable space is not standard Borel [24, Proposition 1].
1.2. Relation to Kantorovich Duality. The theory of Monge-Kantorovich optimal transport and Kantorovich duality can also be seen as an important precedent for our work; see [39,38] for the standard background, and [30] for a more comprehensive treatment which is better suited to our setting. Since (P) is exactly a Monge-Kantorovich optimal transport problem with cost $c := 1-\mathbf{1}_E$, it is natural to try to prove strong duality (1.1) as a consequence of some sufficiently general theorem in the vast literature on Kantorovich duality [39,38,23,31,33,4,5]. In this subsection we address the limitations of such an approach.
First, we consider the case that $\Omega$ is Polish and $E$ is closed in $\Omega\times\Omega$. Then the cost function $c = 1-\mathbf{1}_E$ is lower semicontinuous, so a standard form of Kantorovich duality [39, Theorem 5.10] guarantees \[ \sup_{f,f'}\left(\int_\Omega f\,d\mathbb{P}+\int_\Omega f'\,d\mathbb{P}'\right) = \min_{\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')}(1-\tilde{\mathbb{P}}(E)), \] where the right side is achieved and the left side is taken over, say, all bounded measurable $f,f':\Omega\to\mathbb{R}$ with $f\oplus f'\le 1-\mathbf{1}_E$. In fact, it is standard that one can equivalently take the supremum over function classes which are assumed to have further structure, hence massaging the left side in the following ways: first, one can assume that $f$ is $\mathbb{P}$-integrable and $c$-convex; second, one can assume that $f'$ is equal to $f^c$, the $c$-transform of $f$; third, one can use that $c$ satisfies the triangle inequality (this follows from $E$ being transitive) to get that $f^c = -f$; fourth, one can use that $c$ satisfies the bounds $0\le c\le 1$ to assume that $f$ satisfies the bounds $0\le f\le 1$; fifth, one can use that $c$ is symmetric (this follows from $E$ being symmetric) to replace the parentheses with absolute values. At this point some straightforward additional analysis shows that one can take $f$ to be of the form $\mathbf{1}_A$ for $A$ ranging over a suitable sub-$\sigma$-algebra $\mathcal{G}$ of $\mathcal{F}$. (Alternatively, one can begin by directly applying a form of Kantorovich duality which is specifically tailored to cost functions which only take on values in $\{0,1\}$, like [38, Theorem 1.27].) Thus, Kantorovich duality implies that closed equivalence relations satisfy our strong duality.
However, for the applications most of interest in probability theory, one needs to consider the case that $\Omega$ is Polish and $E$ is $F_\sigma$ in $\Omega\times\Omega$. Since $c = 1-\mathbf{1}_E$ is no longer guaranteed to be lower semicontinuous, the standard form of Kantorovich duality does not apply. Nonetheless, one has [30, Corollary 2.3.9] for all measurable costs, which guarantees \[ \sup_{f,f'}\left(\int_\Omega f\,d\mathbb{P}+\int_\Omega f'\,d\mathbb{P}'\right) = \inf_{\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')}(1-\tilde{\mathbb{P}}(E)), \] where again the left side is taken over all bounded measurable $f,f':\Omega\to\mathbb{R}$ with $f\oplus f'\le 1-\mathbf{1}_E$. While this may give the impression that strong duality is within reach, there are two crucial ways in which we become stuck: for one, there are not sufficient results in the literature which guarantee that the left side can be massaged into the form of a total variation norm; for another, which we explain further in the next paragraph, the existence of primal minimizers is not guaranteed. Thus, existing forms of Kantorovich duality do not appear to imply that $F_\sigma$ equivalence relations satisfy our strong duality.
In our opinion, the question of the existence of primal minimizers has received surprisingly little attention. In the classical settings where $\Omega$ is a Polish space and the cost $c$ is lower semicontinuous, existence of course follows immediately from topological considerations. Yet, many authors do not attempt to expand the scope of existence results; for instance, in [5] it is stated: "If c fails to be lower semicontinuous, there is little reason why a primal optimizer should exist". Nonetheless, there are, as far as we know, exactly two known results outside the classical setting: The first is [30, Remark 2.12(a)] in which it is shown that primal minimizers always exist if one works in the world of finitely-additive probability measures, but this is not useful to us since we are interested in the more standard setting of countably-additive probability measures. The second is [30, Theorem 2.3.10] (originally proven in [23]) in which it is shown that primal minimizers exist for costs which are limits of regular costs, with respect to a certain metric; unfortunately, this condition is still not general enough to cover our setting of interest.
For all these reasons, the state-of-the-art technology on Kantorovich duality does not appear to imply our notion of strong duality in a sufficiently general setting. Nonetheless, our strong duality can certainly be seen as a form of Kantorovich duality, at least in a formal sense. In this way, our positive results can be seen as rather surprising in that they establish a form of Kantorovich duality for highly irregular costs.

1.3. Statement of Main Results.
Having established these precedents for duality between (P) and (D), we now state the main results of the paper. Throughout, let $(\Omega,\mathcal{F})$ be an abstract measurable space.
We begin with an important simplification. Recall that we work in the setting where equivalence coupling problems are primal and total variation problems are dual; that is, the equivalence relation $E$ is given, and we want to find a sub-$\sigma$-algebra $\mathcal{G}$ such that $(E,\mathcal{G})$ satisfies strong duality. In this case, there is actually a natural choice for $\mathcal{G}$: By taking $\mathbb{P}=\delta_\omega$ and $\mathbb{P}'=\delta_{\omega'}$ in (1.1) for all $\omega,\omega'\in\Omega$, we are motivated to introduce \[ E^* := \{A\in\mathcal{F} : \mathbf{1}_A(\omega)=\mathbf{1}_A(\omega') \text{ for all } (\omega,\omega')\in E\}, \tag{1.5} \] which is always a sub-$\sigma$-algebra of $\mathcal{F}$. Moreover, it can be shown (Proposition 3.7) that $(E,E^*)$ satisfies strong duality as soon as $(E,\mathcal{G})$ satisfies strong duality for some sub-$\sigma$-algebra $\mathcal{G}$ of $\mathcal{F}$. Since $E^*$ is thus the canonical choice, we can say that $E$ itself is strongly dualizable whenever the pair $(E,E^*)$ satisfies strong duality.
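On a finite set, strong duality for a pair $(E,E^*)$ can be verified directly, which may help build intuition. The Python sketch below makes several assumptions of our own: the equivalence relation is encoded as a list of classes, $E^*$ is taken to be the collection of unions of classes (which is what the Dirac-mass heuristic suggests), and the helper names `dual_value` and `equivalence_coupling` are hypothetical. It evaluates the dual problem by brute force and constructs one coupling whose primal value matches it; since the primal value of any coupling is at least the dual value, equality shows that this coupling is a minimizer.

```python
from fractions import Fraction
from itertools import combinations


def dual_value(p, q, classes):
    """Brute-force sup of |p(A) - q(A)| over unions A of equivalence classes."""
    best = Fraction(0)
    for r in range(len(classes) + 1):
        for chosen in combinations(classes, r):
            A = set().union(*chosen)
            gap = abs(sum(p.get(x, 0) for x in A) - sum(q.get(x, 0) for x in A))
            best = max(best, gap)
    return best


def equivalence_coupling(p, q, classes):
    """Build a coupling keeping as much mass as possible on pairs in a common class."""
    coupling, extra_p, extra_q = {}, {}, {}
    for C in classes:
        pc = sum(p.get(x, 0) for x in C)
        qc = sum(q.get(x, 0) for x in C)
        m = min(pc, qc)  # mass that can be matched inside this class
        for x in C:
            # Unmatched mass keeps the conditional law within the class.
            extra_p[x] = p.get(x, 0) * (1 - m / pc) if pc > 0 else 0
            extra_q[x] = q.get(x, 0) * (1 - m / qc) if qc > 0 else 0
            for y in C:
                if m > 0:
                    mass = m * p.get(x, 0) / pc * q.get(y, 0) / qc
                    if mass > 0:
                        coupling[(x, y)] = mass
    # Leftover mass necessarily pairs points from different classes.
    leftover = sum(extra_p.values())
    for x, a in extra_p.items():
        for y, b in extra_q.items():
            if a > 0 and b > 0:
                coupling[(x, y)] = a * b / leftover
    return coupling


def primal_value(coupling, classes):
    """Return 1 minus the mass the coupling assigns to E."""
    in_E = sum(m for (x, y), m in coupling.items()
               if any(x in C and y in C for C in classes))
    return 1 - in_E


# Hypothetical example: four points split into two classes.
classes = [frozenset({0, 1}), frozenset({2, 3})]
p = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
q = {0: Fraction(1, 8), 1: Fraction(1, 8), 2: Fraction(1, 4), 3: Fraction(1, 2)}
coupling = equivalence_coupling(p, q, classes)
print(dual_value(p, q, classes), primal_value(coupling, classes))
```

The brute-force dual is exponential in the number of classes and is meant only for toy examples; the point is the comparison of the two printed values.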
Our main results, as is the common practice in mathematical optimization, come in the form of sufficient conditions for strong duality to hold. To state the first of these, let us say that an equivalence relation [...]
Theorem 3.12. Every basic equivalence relation is strongly dualizable.
For example, on a Polish space, every equivalence relation which is "generated by equality of Polish-space-valued random variables" is basic (Lemma 3.13), and every closed equivalence relation is basic (Lemma 3.14). While these conditions appear to be rather general, there are important examples of strongly dualizable equivalence relations which are not basic (Example 3.15). Many such cases are instead covered by the following closure result:
Theorem 3.16. Every countable increasing union of strongly dualizable equivalence relations is strongly dualizable.
It is now useful to make a few remarks about these main theorems. First, we note that Theorem 3.12 and Theorem 3.16 are together powerful enough to establish strong dualizability for most "reasonable" equivalence relations appearing in probability theory; for instance, they recover almost all of the examples given in Subsection 1.1, and they cover several more complicated settings of interest which we explore in Section 2. For the remaining remarks, let $(\Omega,\mathcal{F})$ denote an abstract measurable space and $(E,\mathcal{G})$ a pair satisfying strong duality.
It is crucial to emphasize that the statements and proofs of our main results are purely measure-theoretic. Even though expanding the generality to this setting forces one to leave behind many useful tools (existence of regular conditional probabilities, Prokhorov's theorem, etc.), our proofs reveal that only a few non-elementary results (for example, the Hahn-Jordan decomposition for finite signed measures) are needed. Keeping this generality in mind, it is highly non-trivial that primal minimizers can even be guaranteed to exist. Nonetheless, it is exactly working in the measure-theoretic perspective that allows us to push beyond the reach of standard Kantorovich duality.
Next, we comment on the structure of primal solutions. First of all, we observe that the objective $\tilde{\mathbb{P}}\mapsto 1-\tilde{\mathbb{P}}(E)$ is affine and the feasible region $\Gamma(\mathbb{P},\mathbb{P}')$ is convex, so it follows that the solution set of (P) is convex and, by strong duality, non-empty. We remark that, while our results give a very wide guarantee of the existence of a primal minimizer, concrete settings of interest may still require further analysis in order to show that there exist primal minimizers with desirable properties. (See [17] for an example involving couplings of stochastic processes.) Second of all, we recall the main result of [3] in which it is shown that a coupling $\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')$ is optimal for (P) if and only if it is concentrated on a $(1-\mathbf{1}_E)$-cyclically monotone set. We believe that interesting future work would be to specifically study $(1-\mathbf{1}_E)$-cyclical monotonicity for equivalence relations $E$ to see if the structure of primal minimizers can be understood at some general level.
Finally, we note that there is a slight asymmetry between the primal problem and the dual problem considered herein, in that strong duality can fail if the roles are reversed. More specifically, we may want to work in the alternate setting where total variation problems are primal and equivalence coupling problems are dual; that is, we have a sub-$\sigma$-algebra $\mathcal{G}$ given, and we want to find an equivalence relation $E$ such that $(E,\mathcal{G})$ satisfies strong duality. We exhibit a simple example (Example 3.8) showing that this form of strong duality can fail.
With these main results as motivation, we formulate a rather strong conjecture. Recall that by a standard Borel space we mean a measurable space whose $\sigma$-algebra arises as the Borel $\sigma$-algebra of some Polish topology, and that by a Borel equivalence relation we mean a measurable equivalence relation on a standard Borel space.
Conjecture 3.17. On a standard Borel space, all Borel equivalence relations are strongly dualizable.
The results above establish, in the standard language of Borel equivalence relations as outlined in [16] or [21], that all hypersmooth equivalence relations are strongly dualizable. Towards the conjecture, we also prove the following:
Theorem 3.18. On a standard Borel space, all countable Borel equivalence relations are strongly dualizable.
While Conjecture 3.17 is certainly interesting in its own right, we believe that the sufficient conditions of this paper are already powerful enough to prove strong dualizability for most applications of interest to probabilists. As an illustration of this point, we explore three applications of the duality theory in the following section.

Applications
In this section we apply our main results to three different areas of probability theory, and we hope that our notion of strong duality and our sufficient conditions will inspire similar work in the future. Throughout this section, if $\Omega$ is a Polish space, we write $\mathcal{P}(\Omega)$ for the space of Borel probability measures on $\Omega$.
2.1. Stochastic Calculus. Our main application is a problem about Brownian motion, so for this subsection we assume that the reader has familiarity with the basics of stochastic calculus, as outlined in, say, [32]. For the setting, let us write $\Omega := C_0([0,\infty),\mathbb{R})$ for the usual Wiener space of continuous real-valued functions vanishing at zero, endowed with the topology of uniform convergence on compact sets. Also write $\mathbb{W}$ for the Wiener measure on $\Omega$, that is, the law of a standard Brownian motion. We introduce the following notion, which is our object of interest for the remainder of the subsection.
Definition 2.1. A probability measure $\mathbb{P}\in\mathcal{P}(\Omega)$ is said to have the Brownian germ coupling property (Brownian GCP) if one can construct a probability space $(\tilde{\Omega},\tilde{\mathcal{F}},\tilde{\mathbb{P}})$ on which are defined (i) a standard Brownian motion $B=\{B_t\}_{t\ge0}$, (ii) a stochastic process $X=\{X_t\}_{t\ge0}$ with law $\mathbb{P}$, and (iii) a random time $T$ with $\tilde{\mathbb{P}}(T>0)=1$, such that $\tilde{\mathbb{P}}(X_t=B_t \text{ for all } 0\le t\le T)=1$. A stochastic process is said to have the Brownian GCP if its law has the Brownian GCP.
In words, a stochastic process with the Brownian GCP is a process which can be coupled to almost surely travel alongside a Brownian motion for some positive amount of time. Our interest in this property is inspired by the recent paper [9] in which it is shown that all Brownian motions with drift have the Brownian GCP; as a natural follow-up to that work, we were motivated to understand how large this class of stochastic processes really is. Thus, the goal of this subsection is to show that in fact very many stochastic processes have the Brownian GCP.
Our contribution is to show that our duality theory presents a robust approach to this problem. In contrast, while the work in [9] is satisfyingly concrete (being based on Williams' excursion theory for Brownian motions with drift), its generalizability is quite limited. Towards this end, if we write $\omega=\{\omega_t\}_{t\ge0}$ for the canonical coordinate process in $\Omega$ and if we define the usual germ $\sigma$-algebra via \[ \mathcal{F}_{0+} := \bigcap_{t>0}\sigma(\omega_s : 0\le s\le t), \] then we are led to the following fundamental result:
Proposition 2.2. A probability measure $\mathbb{P}\in\mathcal{P}(\Omega)$ has the Brownian GCP if and only if we have $\mathbb{P}(A)=\mathbb{W}(A)$ for all $A\in\mathcal{F}_{0+}$.
Proof. Define for each $t>0$ the equivalence relation $E_t := \{(\omega,\omega')\in\Omega\times\Omega : \{\omega_s\}_{0\le s\le t}=\{\omega'_s\}_{0\le s\le t}\}$ and the $\sigma$-algebra $\mathcal{F}_t := \sigma(\omega_s : 0\le s\le t)$. For each $t>0$, the equivalence relation $E_t$ is closed in $\Omega\times\Omega$, so it follows from Theorem 3.12 and Lemma 3.14 that the pair $(E_t,\mathcal{F}_t)$ satisfies strong duality. Moreover, we can write $\mathcal{F}_{0+}=\bigcap_{n\in\mathbb{N}}\mathcal{F}_{2^{-n}}$, and it follows that $E_{0+} := \bigcup_{n\in\mathbb{N}}E_{2^{-n}}$ satisfies $E_{0+}^*=\mathcal{F}_{0+}$. (In fact, we can equivalently write $E_{0+}=\bigcup_{t>0}E_t$.) Thus, Theorem 3.16 guarantees that $E_{0+}$ is strongly dualizable, hence that $(E_{0+},\mathcal{F}_{0+})$ satisfies strong duality.
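Finally, we claim that a probability measure $\mathbb{P}\in\mathcal{P}(\Omega)$ has the Brownian GCP if and only if there exists some coupling $\tilde{\mathbb{P}}\in\Gamma(\mathbb{W},\mathbb{P})$ with $\tilde{\mathbb{P}}(E_{0+})=1$. For one direction, observe that, if $\mathbb{P}$ has the Brownian GCP, then the joint law $\tilde{\mathbb{P}}$ of $(B,X)$ is exactly a coupling $\tilde{\mathbb{P}}\in\Gamma(\mathbb{W},\mathbb{P})$ with $\tilde{\mathbb{P}}(E_{0+})=1$. For the other direction, suppose that $\tilde{\mathbb{P}}\in\Gamma(\mathbb{W},\mathbb{P})$ is some coupling with $\tilde{\mathbb{P}}(E_{0+})=1$. We claim that the desired space $(\tilde{\Omega},\tilde{\mathcal{F}},\tilde{\mathbb{P}})$ is given by $\tilde{\Omega} := \Omega\times\Omega$, and $\tilde{\mathcal{F}} := \mathcal{B}(\Omega)\otimes\mathcal{B}(\Omega)$, and $\tilde{\mathbb{P}}$ as given. Indeed, one can define $T := \sup\{t\ge0 : \omega_s=\omega'_s \text{ for all } 0\le s\le t\}$, which is clearly a measurable function of the pair $(\omega,\omega')$. Thus, the result holds by the definition of strong duality.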
On the one hand, this provides an authoritative answer to the question of which stochastic processes have the Brownian GCP. On the other hand, it remains to show that this equivalent condition is easy to verify in some concrete cases. In fact, the value of Proposition 2.2 is that the classical tools of stochastic calculus have very much to say about the germ $\sigma$-algebra $\mathcal{F}_{0+}$, and thus many soft arguments suddenly become available to us.
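For example, Proposition 2.2 allows us to study the Brownian GCP for Brownian motions with possibly time-inhomogeneous drifts. That is, for $h\in\Omega$, let us write $\mathbb{W}^h$ for the law of $\{B_t+h(t)\}_{t\ge0}$ where $B$ is a standard Brownian motion, and let us venture to understand which $h\in\Omega$ are such that $\mathbb{W}^h$ has the Brownian GCP. Towards answering this, let us write, for $t>0$: \[ H^1_t := \left\{h\in\Omega : h \text{ is absolutely continuous on } [0,t] \text{ with } \int_0^t (h'(s))^2\,ds<\infty\right\}. \] Then define $H^1_{0+} := \bigcup_{t>0}H^1_t$, which is a sort of "local" Cameron-Martin space near zero, and we have the following:
Theorem 2.3. If $h\in H^1_{0+}$, then $\mathbb{W}^h$ has the Brownian GCP.
Proof. Suppose that $h\in H^1_t$ for some $t>0$. By the Cameron-Martin theorem, we know that $\mathbb{W}$ and $\mathbb{W}^h$ are mutually absolutely continuous on $[0,t]$, with Radon-Nikodym derivative given by \[ \frac{d\mathbb{W}^h}{d\mathbb{W}}\bigg\vert_{\mathcal{F}_t}(\omega) = \exp\left(\int_0^t h'(s)\,d\omega_s-\frac{1}{2}\int_0^t (h'(s))^2\,ds\right). \] Moreover, we know by Blumenthal's zero-one law that both $\mathbb{W}$ and $\mathbb{W}^h$ assign only probabilities in $\{0,1\}$ to all events in $\mathcal{F}_{0+}$. In particular, we have $\mathbb{W}^h(A)=\mathbb{W}(A)$ for all $A\in\mathcal{F}_{0+}$. By Proposition 2.2, it follows that $\mathbb{W}^h$ has the Brownian GCP.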
We do not know whether the sufficient condition of the preceding result is in fact necessary. In contrast, we have a seemingly more complicated example in which our analysis is more complete. This is the setting of studying the Brownian GCP for time-homogeneous one-dimensional diffusions within the realm of classical conditions ensuring the existence of strong solutions to a given SDE [32, Chapter IX, Theorem 2.1]. Since the Brownian GCP is a local property, we lose no generality in restricting our attention to a finite time interval.
Theorem 2.4. [...] has the Brownian GCP if and only if $\sigma\equiv1$ on some neighborhood of $x_0$.
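Proof. For one direction, let $(\tilde{\Omega},\tilde{\mathcal{F}},\tilde{\mathbb{P}})$ be a probability space witnessing the fact that $X$ has the Brownian GCP. Then consider the event \[ A := \bigcup_{n\in\mathbb{N}}\{\omega\in\Omega : \langle\omega,\omega\rangle_t=t \text{ for all } 0\le t\le 2^{-n}\}, \] where $\langle\omega,\omega\rangle$ denotes the Itô quadratic variation of $\omega$. Since $A\in\mathcal{F}_{0+}$ and $\tilde{\mathbb{P}}(B\in A)=1$, Proposition 2.2 implies that we must have $\tilde{\mathbb{P}}(X\in A)=1$. Now note that the quadratic variation of $X$ is given by $\langle X,X\rangle_t=\int_0^t(\sigma(X_s))^2\,ds$, so we conclude that there almost surely exists some $n\in\mathbb{N}$ with $\sigma(X_s)=1$ for all $0\le s\le 2^{-n}$. In particular, we conclude $\sigma(x_0)=1$.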
Next we define the random times $\tau_- := \inf\{t>0 : X_t<x_0\}$ and $\tau_+ := \inf\{t>0 : X_t>x_0\}$ on $(\tilde{\Omega},\tilde{\mathcal{F}},\tilde{\mathbb{P}})$, which are stopping times with respect to the natural filtration of $X$. We recall that the standard small-time approximation for diffusions with Lipschitz coefficients guarantees that $(X_t-x_0)/\sqrt{t}$ converges in distribution as $t\to0$ to a Gaussian random variable with mean 0 and variance $\sigma(x_0)^2=1$, hence \[ \tilde{\mathbb{P}}(\tau_-=0) = \lim_{t\to0}\tilde{\mathbb{P}}(X_s<x_0 \text{ for some } 0\le s\le t) \ge \lim_{t\to0}\tilde{\mathbb{P}}(X_t<x_0) = \frac{1}{2}. \] By Blumenthal's zero-one law applied to the strong Markov process $X$, we see that $\tilde{\mathbb{P}}(\tau_-=0)>0$ implies $\tilde{\mathbb{P}}(\tau_-=0)=1$. The same argument applies to show $\tilde{\mathbb{P}}(\tau_+=0)=1$.
Finally, we put all the pieces together. By the almost sure continuity of $X$ and the fact that $\tilde{\mathbb{P}}(\tau_-=0)=\tilde{\mathbb{P}}(\tau_+=0)=1$, we conclude that $\{X_s : 0\le s\le T\}$ contains an open neighborhood of $x_0$ for any random variable $T$ which is almost surely strictly positive. Since we already know that there exists an $\mathbb{N}$-valued random variable $n$ with $\sigma(X_s)=1$ for all $0\le s\le 2^{-n}$, we conclude that we have $\sigma\equiv1$ on some neighborhood of $x_0$.
For the other direction, suppose that $\sigma\equiv1$ on some neighborhood $U$ of $x_0$. Then choose reals $a,b\in\mathbb{R}$ with $a<b$ and $x_0\in(a,b)\subseteq[a,b]\subseteq U$. Since (2.1) admits strong solutions, we can construct $X=\{X_t\}_{t\ge0}$ on the probability space $(\Omega,\mathcal{B}(\Omega),\mathbb{W})$ with its natural filtration $\{\mathcal{F}_t\}_{t\ge0}$ defined via $\mathcal{F}_t := \sigma(\omega_s : 0\le s\le t)$ for $t\ge0$. We write $\mathbb{E}$ for the expectation on this space.
To begin, we claim that the exit time $\tau^X_{a,b} := \inf\{t>0 : X_t\notin[a,b]\}$ has finite exponential moments of all orders. To see this, define $m := \max_{a\le x\le b}\mu(x)$, which is finite by the continuity of $\mu$ on $U$. Since we have $\mu(X_t)\le m$ for all $0\le t\le\tau^X_{a,b}$ almost surely, it follows that one can construct $B^m=\{B^m_t\}_{t\ge0}$ a Brownian motion with drift $m$ on the same probability space in such a way that we have $X_t\le B^m_t$ for all $0\le t\le\tau^X_{a,b}$ almost surely. Then define the stopping time $\tau^{B^m}_a := \inf\{t>0 : B^m_t\le a\}$, and note that we have $\tau^X_{a,b}\le\tau^{B^m}_a$ almost surely. It is known that $\tau^{B^m}_a$ has finite exponential moments of all orders, so the same must be true of $\tau^X_{a,b}$. In particular, we have shown [...] This stopping-time version of Novikov's condition implies a stopping-time version of Girsanov's theorem which states that the law of $X$ is mutually absolutely continuous with the law of [...] by the continuity of sample paths. Since Blumenthal's zero-one law implies that all events in $\mathcal{F}_{0+}$ have probability in $\{0,1\}$ under the laws $\mathbb{W}(X\in\cdot)$ and $\mathbb{W}(X'\in\cdot)$ [...] Finally, observe that $\sigma\equiv1$ on $U$ implies that $X'$ is a standard Brownian motion up to the stopping time $\tau_U := \inf\{t>0 : X'_t\notin U\}$, so we have by Proposition 2.2 that $\mathbb{W}(X'\in A)=\mathbb{W}(A)$ for all $A\in\mathcal{F}_{0+}$. Consequently, $\mathbb{W}(X\in A)=\mathbb{W}(A)$ for all $A\in\mathcal{F}_{0+}$. Therefore, one last application of Proposition 2.2 implies that $X$ has the Brownian GCP.
Observe that both Theorem 2.3 and Theorem 2.4 recover the result of [9], since Brownian motion with drift is included as a special case. Nonetheless, we would be interested in understanding whether there exist "concrete" couplings for these processes which have just been proven, via very abstract methods, to have the Brownian GCP.
2.2. Random Sequence Simulation. Our next application of interest concerns simulating random sequences with complicated dependency structures. For this section, let $X$ be a Polish space, and set $\Omega := X^{\mathbb{N}}$ with the Borel $\sigma$-algebra of the product topology on $\Omega$. We write $\omega=\{\omega_n\}_{n\in\mathbb{N}}$ for an arbitrary element of $\Omega$.
As motivation, let us make a simple observation. Suppose that $\mu$ is a fully supported Borel probability measure on a finite set $X$ and that $\mathbb{P} := \bigotimes_{n\in\mathbb{N}}\mu$ is the law of an i.i.d. sequence of $X$-valued data each with law $\mu$. We claim that if $C\subseteq\Omega$ is a cylinder set $C=\{\omega_{n_0}=a_0,\ldots,\omega_{n_k}=a_k\}$ for some $k\in\mathbb{N}$, some strictly increasing sequence $n_0,\ldots,n_k\in\mathbb{N}$, and some values $a_0,\ldots,a_k\in X$, then there exists an algorithm which, when applied to a random sequence $\omega$ from $\mathbb{P}$, outputs a random sequence $\omega'$ from $\mathbb{P}(\,\cdot\mid C)$. Indeed, it is easy to construct such an algorithm directly: First, swap the character at position $n_0$ with the character at position $\min\{m_0>n_k : \omega_{m_0}=a_0\}$. Then, recursively for $i\in\{1,\ldots,k\}$, swap the character at position $n_i$ with the character at position $\min\{m_i>m_{i-1} : \omega_{m_i}=a_i\}$. Note that this algorithm terminates in an almost surely finite number of steps (reads and swaps) and does not require any external randomness.
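The swap procedure just described is simple enough to run. The following Python sketch is one possible rendering under our own assumptions: positions are 0-based rather than indexed by $\mathbb{N}$, the i.i.d. sequence is generated lazily by a hypothetical callable `draw`, and the function name `condition_on_cylinder` is ours. It returns both the prefix that was read and the prefix after the swaps, so that one can check that the constraints of the cylinder set hold and that the characters read have merely been permuted.

```python
import random


def condition_on_cylinder(draw, positions, values):
    """Apply the swap procedure to a lazily generated sequence.

    draw      -- hypothetical callable returning the next i.i.d. symbol
    positions -- strictly increasing 0-based indices n_0 < ... < n_k
    values    -- required symbols a_0, ..., a_k at those indices
    Returns (original_prefix, swapped_prefix) covering every index read.
    """
    raw, seq = [], []

    def read(m):
        # Extend the sequence on demand, remembering the untouched draws.
        while len(seq) <= m:
            symbol = draw()
            raw.append(symbol)
            seq.append(symbol)
        return seq[m]

    m = positions[-1]  # the first search starts strictly after n_k
    for n_i, a_i in zip(positions, values):
        m += 1  # m_i must exceed m_{i-1} (or n_k when i = 0)
        while read(m) != a_i:
            m += 1
        seq[n_i], seq[m] = seq[m], seq[n_i]
    return raw, seq


# Hypothetical usage: uniform law on the alphabet {a, b}, fixed seed.
rng = random.Random(0)
original, swapped = condition_on_cylinder(
    lambda: rng.choice("ab"), positions=[0, 2, 3], values=["b", "b", "a"]
)
print("".join(original))
print("".join(swapped))
```

The length of the returned prefix is the number of reads, which is random but almost surely finite when every symbol has positive probability.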
Our goal is to understand which laws of random sequences can be simulated in this way; that is, which random sequences can arise by applying a random sorting algorithm to an i.i.d. sequence? Towards making this precise, let us define some objects of interest. First, write $S$ for the group of all permutations of $\mathbb{N}$ (that is, all bijections $\sigma:\mathbb{N}\to\mathbb{N}$), and let us endow $S$ with the topology of pointwise convergence. Then for $n\in\mathbb{N}$, write $\Pi_n$ for the subgroup of all permutations $\sigma\in S$ with $\sigma(i)=i$ for $i>n$; these are the permutations that fix all elements except possibly those in $\{1,\ldots,n\}$. We also define $\Pi := \bigcup_{n\in\mathbb{N}}\Pi_n\subseteq S$; it is known that the topology inherited from $S$ makes $\Pi$ into a Polish group [22, Theorem 6.26]. We also let $\Pi$ act on $\Omega$ in the natural way, by setting $(\sigma\cdot\omega)_n := \omega_{\sigma^{-1}(n)}$ for $n\in\mathbb{N}$. In particular, observe that the action is jointly continuous when viewed as a map $a:\Pi\times\Omega\to\Omega$. When we wish to emphasize that one argument is fixed, we will equivalently write $a(\sigma,\omega) =: a_\omega(\sigma) =: a_\sigma(\omega)$ for $\omega\in\Omega$ and $\sigma\in\Pi$. Now we introduce our main object of interest.
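Definition 2.5. For $\mathbb{P},\mathbb{P}'\in\mathcal{P}(\Omega)$, an algorithmic reassortment from $\mathbb{P}$ to $\mathbb{P}'$ is a transition kernel $K:\Omega\times\mathcal{B}(\Pi)\to[0,1]$ such that we have \[ \mathbb{P}'(A) = \int_\Omega K(\omega,a_\omega^{-1}(A))\,d\mathbb{P}(\omega) \] for all $A\in\mathcal{B}(\Omega)$; we say that $\mathbb{P}'$ is an algorithmic reassortment of $\mathbb{P}$ if there exists an algorithmic reassortment from $\mathbb{P}$ to $\mathbb{P}'$.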
An algorithmic reassortment from $\mathbb{P}$ to $\mathbb{P}'$ exactly corresponds to an algorithm of interest to us: For each $\omega\in\Omega$, we have $K(\omega,\cdot)$ which represents the required (possibly-random) permutation bringing $\omega$ to $\omega'$. Since a random element from $K(\omega,\cdot)$ fixes all but finitely many elements of $\mathbb{N}$ almost surely, it can be regarded as (in fact, decomposed into) a finite sequence of transpositions on $\mathbb{N}$, and this means that an algorithm represented by an algorithmic reassortment $K$ must terminate in a finite number of swaps. However, such algorithms are not required to terminate in a finite number of reads; they may need to read the entire infinite sequence $\omega$ before deciding which swaps to apply.
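The decomposition of a finitary permutation into finitely many transpositions can be made explicit through its cycle structure. The Python sketch below is an illustration under our own conventions: a permutation is stored as a dictionary listing only the points it moves, the helper names `to_transpositions` and `apply_swaps` are hypothetical, and the transpositions are meant to be applied from left to right.

```python
def to_transpositions(perm):
    """Decompose a finitary permutation into transpositions.

    perm -- dict mapping each moved point i to sigma(i); all other points are fixed.
    Returns a list of pairs; applying them left to right to a point x yields sigma(x).
    """
    swaps, seen = [], set()
    for start in sorted(perm):
        if start in seen or perm[start] == start:
            continue
        # Trace the cycle start -> sigma(start) -> ... back to start.
        cycle, x = [start], perm[start]
        while x != start:
            cycle.append(x)
            x = perm[x]
        seen.update(cycle)
        # The cycle (c0 c1 ... cr) equals (c0 c1), then (c0 c2), ..., then (c0 cr).
        swaps.extend((cycle[0], c) for c in cycle[1:])
    return swaps


def apply_swaps(swaps, x):
    """Follow a point x through the transpositions, applied left to right."""
    for i, j in swaps:
        if x == i:
            x = j
        elif x == j:
            x = i
    return x


# Hypothetical example: a 3-cycle on {1, 2, 3} together with a swap of 5 and 6.
sigma = {1: 3, 3: 2, 2: 1, 5: 6, 6: 5}
swaps = to_transpositions(sigma)
print(swaps)
```

A cycle of length $r+1$ contributes $r$ transpositions, so the number of swaps is finite whenever only finitely many points are moved, in line with the termination claim above.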
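Of course, algorithmic reassortment can be exactly characterized via our duality theory. To do this, let us define the usual exchangeable $\sigma$-algebra via \[ \mathcal{I}_\Pi := \{A\in\mathcal{B}(\Omega) : a_\sigma^{-1}(A)=A \text{ for all } \sigma\in\Pi\}. \] Then we have the following:
Theorem 2.6. For probability measures $\mathbb{P},\mathbb{P}'\in\mathcal{P}(\Omega)$, we have that $\mathbb{P}'$ is an algorithmic reassortment of $\mathbb{P}$ if and only if $\mathbb{P}(A)=\mathbb{P}'(A)$ for all $A\in\mathcal{I}_\Pi$.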
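Proof. For each $n\in\mathbb{N}$, define \[ E_{\Pi_n} := \{(\omega,\omega')\in\Omega\times\Omega : \omega'=\sigma\cdot\omega \text{ for some } \sigma\in\Pi_n\}, \] and observe that $E_{\Pi_n}$ is closed in $\Omega\times\Omega$. Thus, by Theorem 3.12 and Lemma 3.14, we have that $E_{\Pi_n}$ is strongly dualizable for each $n\in\mathbb{N}$. Therefore, Theorem 3.16 implies that \[ E_\Pi := \bigcup_{n\in\mathbb{N}}E_{\Pi_n} \] is strongly dualizable. (Alternatively, one can directly apply Theorem 3.18.) Finally, observe that $E_\Pi^*=\mathcal{I}_\Pi$, so it follows that $(E_\Pi,\mathcal{I}_\Pi)$ satisfies strong duality. By the definition of strong duality, it therefore suffices to show that $\mathbb{P}'$ being an algorithmic reassortment of $\mathbb{P}$ is equivalent to the existence of some $\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')$ with $\tilde{\mathbb{P}}(E_\Pi)=1$.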
For the first direction, suppose that $\mathbb{P}'$ is an algorithmic reassortment of $\mathbb{P}$, and let $K$ be the guaranteed kernel. Now construct a probability space on which are defined $\omega$ a random element of $\Omega$ with law $\mathbb{P}$ and $\sigma$ a random element of $\Pi$ whose law conditional on $\omega$ is $K(\omega,\cdot)$. Then, setting $\omega' := \sigma\cdot\omega$, it is easy to see that the joint law $\tilde{\mathbb{P}}$ of $(\omega,\omega')$ satisfies $\tilde{\mathbb{P}}(E_\Pi)=1$.
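For the second direction, we claim that there exists a measurable map $\psi:E_\Pi\to\Pi$ with $\psi(\omega,\omega')\cdot\omega=\omega'$ for all $(\omega,\omega')\in E_\Pi$, where $E_\Pi$ is given the Borel $\sigma$-algebra of the topology inherited from $\Omega\times\Omega$. (Note that $E_\Pi$ is not Polish and that $(E_\Pi,\mathcal{B}(E_\Pi))$ is not standard Borel.) Indeed, let us first consider the set-valued map $\Psi:E_\Pi\to2^\Pi$ defined via \[ \Psi(\omega,\omega') := \{\sigma\in\Pi : \sigma\cdot\omega=\omega'\}. \] Then, the definition of the topology of pointwise convergence on $\Pi$ implies that $\Psi(\omega,\omega')$ is a closed subset of $\Pi$ for all $(\omega,\omega')\in E_\Pi$. Moreover, since all open sets $U\subseteq\Pi$ are countable, we have \[ \{(\omega,\omega')\in E_\Pi : \Psi(\omega,\omega')\cap U\neq\emptyset\} = \bigcup_{\sigma\in U}\{(\omega,\omega')\in E_\Pi : \sigma\cdot\omega=\omega'\}\in\mathcal{B}(E_\Pi). \] Therefore, the Kuratowski and Ryll-Nardzewski measurable selection theorem guarantees the existence of our desired map. Finally, we suppose that there exists $\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')$ with $\tilde{\mathbb{P}}(E_\Pi)=1$. Then, on the probability space $(\Omega\times\Omega,\mathcal{B}(\Omega)\otimes\mathcal{B}(\Omega),\tilde{\mathbb{P}})$, we define a random element $\sigma$ of $\Pi$ by setting $\sigma := \psi(\omega,\omega')$ if $(\omega,\omega')\in E_\Pi$ and $\sigma := \sigma_0$ otherwise, where $\sigma_0\in\Pi$ is any arbitrary fixed element. Since $\Omega$ and $\Pi$ are both Polish, there exists a regular conditional distribution $K:\Omega\times\mathcal{B}(\Pi)\to[0,1]$ of $\sigma$ with respect to $\omega$, and this completes the proof.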
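Our main application of this characterization is as follows:
Proposition 2.7. If $\mathbb{P}=\bigotimes_{n\in\mathbb{N}}\mu$ for some $\mu\in\mathcal{P}(X)$ and $V:\Omega\to\mathbb{R}$ is measurable, non-negative, and has $0<\int_\Omega V\,d\mathbb{P}<\infty$, then $\mathbb{P}'\in\mathcal{P}(\Omega)$ defined via \[ \frac{d\mathbb{P}'}{d\mathbb{P}} := \frac{V}{\int_\Omega V\,d\mathbb{P}} \] is an algorithmic reassortment of $\mathbb{P}$.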
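Proof. By the Hewitt-Savage zero-one law, we have $\mathbb{P}(A)\in\{0,1\}$ for all $A\in\mathcal{I}_\Pi$. Moreover, $\mathbb{P}'$ is absolutely continuous with respect to $\mathbb{P}$ by construction, so $\mathbb{P}(A)=0$ implies $\mathbb{P}'(A)=0$, and $\mathbb{P}(A)=1$ implies $\mathbb{P}'(\Omega\setminus A)=0$. Therefore, $\mathbb{P}'(A)=\mathbb{P}(A)$ for all $A\in\mathcal{I}_\Pi$, so the result follows from Theorem 2.6.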
When $V=\mathbf{1}_C$ for a cylinder set $C\subseteq\Omega$, we see that Proposition 2.7 recovers our initial observation that algorithmic reassortment can encode conditioning on cylinder sets. However, it significantly generalizes this, since $X$ is not required to be finite, and $V$ is neither required to be an indicator function nor to depend on finitely many coordinates. Thus, algorithmic reassortment can encode conditioning on arbitrary non-null measurable sets, and it can encode tilting by arbitrary non-null integrable non-negative functions.
There are still a number of interesting questions in this direction. For instance, while we have given a rather abstract existence proof of such algorithms, we would be interested in understanding whether they can be constructed, in some generality, in a more concrete way. Moreover, we would be interested in understanding when such algorithms can be guaranteed to require only finitely many reads, in addition to the existing guarantee of only finitely many writes.
2.3.Point Process Theory.Our last application concerns point processes, for which [19] is the standard reference.To set things up, let X denote a locally compact Polish space.We write Ω for the space of non-negative integer-valued simple Borel measures (here, a non-negative integer-valued Borel measure µ on X is called simple if µ({x}) ≤ 1 for all x ∈ X), which is known to be a standard Borel space when endowed with the Borel σ-algebra of the weak topology [29,Proposition 1.11].Since every element ω ∈ Ω is a purely atomic measure on X, it can also be useful to regard ω as a random countable subset of X.Each P ∈ P(Ω) is called (the law of) a point process.
Our interest is in understanding the large-scale structure of point processes.To make this precise, we have the following notion.Definition 2.8.Two point processes P, P ′ ∈ P(Ω) are said to be identical at infinity if there exists P ∈ Γ(P, P ′ ) with P(ω| X\K = ω ′ | X\K for some compact K ⊆ X) = 1.
In words, two point processes are identical at infinity when they become equal upon removing all of the points in a sufficiently large (possibly random) compact set. To understand the name, note that, if one considers the point processes in the one-point compactification X† := X ∪ {†} of X, where † ∉ X represents the state at infinity, then they are identical at infinity if and only if they can be coupled to agree on some neighborhood of †.
As the reader can probably anticipate, our next step is to use the duality theory to prove a more concrete characterization of when two point processes are identical at infinity. Indeed, for every compact set K ⊆ X we define the σ-algebra \[ \mathcal{G}_K := \sigma\big(\omega \mapsto \omega|_{X\setminus K}\big). \] Now recall that a locally compact Polish space is σ-compact, so we can get a sequence {U_n}_{n∈N} of open sets with U_n ↑ X as n → ∞ and such that U_n has compact closure K_n := \overline{U_n} for all n ∈ N. Finally, we define the tail σ-algebra via \[ \mathcal{T} := \bigcap_{n\in\mathbb{N}} \mathcal{G}_{K_n}. \] (In fact, it is straightforward to show that T does not depend on the choice of sequence {U_n}_{n∈N}.) Then we have the following: Proposition 2.9. Two point processes P, P′ ∈ P(Ω) are identical at infinity if and only if P(A) = P′(A) for all A ∈ T.
Proof. For each n ∈ N, define the equivalence relation \[ E_n := \big\{(\omega,\omega')\in\Omega\times\Omega : \omega|_{X\setminus K_n} = \omega'|_{X\setminus K_n}\big\}, \] and observe that E_n is smooth. Thus, E_n is strongly dualizable for all n ∈ N by Theorem 3.12 and Lemma 3.13, and therefore Theorem 3.16 implies that E := ⋃_{n∈N} E_n is strongly dualizable. Also, notice that we have E_n* = G_{K_n} for all n ∈ N, and E* = T; this means (E, T) satisfies strong duality. Finally, we claim that we have \[ E = \big\{(\omega,\omega')\in\Omega\times\Omega : \omega|_{X\setminus K} = \omega'|_{X\setminus K} \text{ for some compact } K\subseteq X\big\}, \] which completes the proof and also establishes the measurability of the event appearing in the definition of identity at infinity. Indeed, one direction is obvious, and the other follows from the fact that, by construction, for every compact set K ⊆ X there exists some n ∈ N with K ⊆ K_n.
It remains to interpret Proposition 2.9 in a concrete way for a more specific class of point processes, and Poisson point processes of course provide the simplest setting. Recall that every non-negative σ-finite Borel measure µ on X gives rise to a Poisson point process on X with intensity µ; we regard this as a random element of Ω and write P_µ for its law. We also need to consider the Hellinger distance between two non-negative σ-finite Borel measures µ, µ′, which is defined to be \[ d_H(\mu,\mu') := \Big(\int_X \Big(\sqrt{\frac{d\mu}{d\lambda}} - \sqrt{\frac{d\mu'}{d\lambda}}\Big)^{2}\, d\lambda\Big)^{1/2}, \] where λ is any non-negative σ-finite Borel measure which dominates both µ and µ′; this value does not depend on the choice of λ, so one can, for instance, take λ = µ + µ′. Then we have the following: Theorem 2.10. If µ and µ′ are non-negative σ-finite Borel measures on X and there exists a compact set K ⊆ X such that d_H(µ|_{X∖K}, µ′|_{X∖K}) < ∞, then the Poisson point processes P_µ, P_µ′ ∈ P(Ω) are identical at infinity.
Proof.If K ⊆ X is a compact set as given, then a classical result [35] shows that the point processes P µ| X\K and P µ ′ | X\K are mutually absolutely continuous.Moreover, it is immediate from Kolmogorov's zero-one law that any Poisson point process P ν assigns trivial probabilities to all events in T .Therefore, we have P µ| X\K (A) = P µ ′ | X\K (A) for all A ∈ T .Since we also have P µ| X\K (A) = P µ (A) and P µ ′ | X\K (A) = P µ ′ (A) for all A ∈ T , this implies P µ (A) = P µ ′ (A) for all A ∈ T .Therefore, the result follows from Proposition 2.9.
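As a worked illustration of the hypothesis of Theorem 2.10, consider the following example, which is our own assumption and is not taken from the text: X = R^d, K the closed unit ball, µ the Lebesgue measure, and µ′ the measure that agrees with µ on K and has Lebesgue density (1 + |x|^{-α})^2 on X∖K for some α > 0. Taking λ to be the Lebesgue measure, and taking the squared Hellinger distance to be the integral of the squared difference of root densities (a normalizing constant, if any, does not affect finiteness), one computes:

```latex
\[
\int_{X\setminus K}\Big(\sqrt{\frac{d\mu'}{d\lambda}}-\sqrt{\frac{d\mu}{d\lambda}}\Big)^{2}\,d\lambda
  \;=\; \int_{|x|>1} |x|^{-2\alpha}\,dx
  \;=\; c_d \int_{1}^{\infty} r^{\,d-1-2\alpha}\,dr
  \;=\;
  \begin{cases}
    \dfrac{c_d}{2\alpha-d}, & 2\alpha > d,\\[2ex]
    \infty, & 2\alpha \le d,
  \end{cases}
\]
% c_d denotes the surface area of the unit sphere in R^d.
```

Under these assumptions the hypothesis of Theorem 2.10 holds exactly when α > d/2, in which case the theorem yields that the two Poisson point processes are identical at infinity.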
Another setting of interest, where we believe that Proposition 2.9 may find a different concrete interpretation, is the world of determinantal point processes.Since it is known that such processes also have trivial tail σ-algebra [27], it suffices to understand when two such laws are mutually absolutely continuous.We could not find this stated in the literature and we believe it would be an interesting avenue for future work.

Proofs of Main Results
In this section we develop the abstract duality theory as outlined in the introduction.Throughout this section, (Ω, F ) denotes a fixed measurable space.
3.1.Preliminaries.In this subsection we review some notation, definitions, and results that will be needed throughout the paper.
The first concepts concern the elementary notions of relations. By a relation R on Ω we mean any subset R ⊆ Ω × Ω. By an equivalence relation E on Ω we mean a relation on Ω which is reflexive, symmetric, and transitive. There is a well-known correspondence between equivalence relations on Ω and partitions of Ω, where an equivalence relation gives rise to a partition by dividing the space into equivalence classes, and where a partition gives rise to an equivalence relation by declaring two points to be equivalent if and only if they occur in the same element of the partition.
Next we consider basic measure theory, as found in, say, [18] or [8]. For this part, we let (S, S) denote an abstract measurable space; the results herein will typically be applied when (S, S) is taken to be (Ω, F) or (Ω × Ω, F ⊗ F), but some other cases will also be used. For example, write R for the set of real numbers and B(R) for the Borel σ-algebra of its standard topology. We write bS for the space of bounded measurable functions from (S, S) to (R, B(R)). The Cartesian product of S with itself is denoted S × S, and the product σ-algebra of S with itself, that is, the σ-algebra generated by the measurable rectangles A × A′ with A, A′ ∈ S, is denoted S ⊗ S. We write π, π′ : S × S → S for the projection maps onto the first and second coordinates, respectively. Now, we write M(S, S) for the space of finite signed measures on (S, S). We endow M(S, S) with the partial order ≤ where µ, µ′ ∈ M(S, S) have µ ≤ µ′ if and only if we have µ(A) ≤ µ′(A) for all A ∈ S. For µ, µ′ ∈ M(S, S) there exists [18, Corollary 2.9] a unique element of M(S, S) which is ≤-maximal among all elements which are ≤-bounded above by both µ and µ′, and we denote this by µ ∧ µ′. In fact, if P, N ∈ S are respectively the positive set and negative set of a Hahn decomposition of the signed measure µ − µ′, then we have µ ∧ µ′ = µ(· ∩ N) + µ′(· ∩ P).
Finally, we review some concepts related to Polish spaces; these ideas can be found in standard sources on measure theory like [8], or in more specialized treatments like [20]. By a Polish space we mean a separable topological space (X, τ) such that there exists a complete metric d on X which generates the topology τ. We write B(τ) for the Borel σ-algebra of a Polish space, that is, the σ-algebra generated by the open sets. By a standard Borel space we mean a measurable space (X, X) such that there exists a topology τ on X making (X, τ) into a Polish space and such that we have B(τ) = X.
Let us also review our central definition; let (Ω, F) denote an abstract measurable space, and consider any pair (E, G) where E is a relation on Ω and G is a subset of F. We say that E is measurable if E ∈ F ⊗ F. We say that (E, G) satisfies strong duality if E is measurable and if we have \[ \sup_{A\in\mathcal{G}} |\mathbb{P}(A)-\mathbb{P}'(A)| = \min_{\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')} \big(1-\tilde{\mathbb{P}}(E)\big) \] for all P, P′ ∈ P(Ω, F). A coupling P̃ ∈ Γ(P, P′) achieving the minimum on the right side is called (E, G)-optimal, or simply optimal if (E, G) is clear from context. A coupling P̃ ∈ Γ(P, P′) with P̃(E) = 1 is called E-successful, or simply successful if E is clear from context. It will also be useful along the way to consider two further notions of duality; we briefly introduce them now, and in the following subsections we study them more carefully.
First, let us say that a pair (E, G) satisfies weak duality if E is measurable and if we have \[ \sup_{A\in\mathcal{G}} |\mathbb{P}(A)-\mathbb{P}'(A)| \le \inf_{\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')} \big(1-\tilde{\mathbb{P}}(E)\big) \] for all P, P′ ∈ P(Ω, F). Importantly, observe that the infimum on the right side need not be achieved, in that (E, G)-optimal couplings are not guaranteed to exist. Moreover, observe that strong duality obviously implies weak duality. Second, let us say that a pair (E, G) satisfies pseudo-strong duality if, for all P, P′ ∈ P(Ω, F), the following are equivalent: (i) For all A ∈ G, we have P(A) = P′(A).
(ii) There exists some P ∈ Γ(P, P ′ ) and some N ∈ F ⊗ F with P(N ) = 0 and (Ω × Ω) \ E ⊆ N .Crucially, observe that the measurability of E is not required in order for (E, G) to satisfy pseudo-strong duality; if E happens to be measurable, then (ii) above is simply equivalent to the existence of an E-successful coupling.The primary desire to generalize beyond measurable relations is to be able to say something about analytic equivalence relations on standard Borel spaces, which are a common object of study in ergodic theory [13].

3.2. Weak Duality.
In this subsection we study weak duality as a stepping stone to strong duality.The results herein provide some reductions which simplify our later work.
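To begin, we state a fundamental set-theoretic correspondence between equivalence relations and sub-σ-algebras. Definition 3.1. For any relation E on Ω, define \[ E^{*} := \big\{A\in\mathcal{F} : \text{for all } (\omega,\omega')\in E,\ \omega\in A \text{ if and only if } \omega'\in A\big\}, \] the sub-σ-algebra of F, and, for any subset G of F, define \[ \mathcal{G}^{*} := \big\{(\omega,\omega')\in\Omega\times\Omega : \text{for all } A\in\mathcal{G},\ \omega\in A \text{ if and only if } \omega'\in A\big\}, \] the equivalence relation on Ω.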
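To make the two maps of Definition 3.1 concrete, the following minimal Python sketch computes them by brute force on a four-point space, with F taken to be the full power set. It uses the characterization that recurs in the proofs below: E* collects the sets that no pair of E splits, and G* collects the pairs that no set of G separates. The helper names (`powerset`, `star_of_relation`, `star_of_sets`) and the toy relation are illustrative assumptions, not notation from the paper.

```python
from itertools import combinations

def powerset(omega):
    """All subsets of a finite set, as frozensets (here F is the full power set)."""
    items = sorted(omega)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def star_of_relation(E, F):
    """E*: the sets A in F that no pair of E splits (w in A iff v in A)."""
    return {A for A in F if all((w in A) == (v in A) for (w, v) in E)}

def star_of_sets(G, omega):
    """G*: the pairs (w, v) that no set of G separates."""
    return {(w, v) for w in omega for v in omega
            if all((w in A) == (v in A) for A in G)}

omega = {0, 1, 2, 3}
F = powerset(omega)
E = {(0, 1)}                       # a relation that is not an equivalence relation
E_star = star_of_relation(E, F)    # sets containing both or neither of 0 and 1
E_star_star = star_of_sets(E_star, omega)
```

In this toy example, `E_star_star` is the equivalence relation whose only non-trivial class is {0, 1}, so it strictly contains E, and applying the first map once more returns `E_star` unchanged.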
It is now useful to make a few remarks. First, we note that E* is a σ-algebra even when E is not an equivalence relation, and likewise that G* is an equivalence relation even when G is not a σ-algebra. Although these operations will typically be applied when E and G are, respectively, an equivalence relation and a σ-algebra, the added generality will be useful in some cases. Second, we note that the map E → E* depends on the ambient σ-algebra F with which Ω is endowed. By contrast, the map G → G* does not depend on F. This is a first indication (there will be many more) that E and G do not play totally symmetric roles in the duality theory.
Another important collection of remarks concerns the algebraic structure of this correspondence.While such algebraic terminology does not immediately pay dividends for our work, we believe that it is nonetheless useful to highlight the perspective; we direct the reader to [7, Section 1.6] for further information on abstract order theory.
as needed.Second, note that E * * = E implies that E is an equivalence relation, since E * * is always an equivalence relation.
As we see in the following, it is not true that G being a σ-algebra implies G * * = G.
Example 3.4. Let Ω be an uncountable set, let F be the σ-algebra of all subsets of Ω, and let G be the σ-algebra of all sets which are countable or whose complements are countable. Then G* = ∆ and G** = F ⊋ G.
Having clarified some aspects of the Galois correspondence, our next goal is to show that it is indeed useful in studying weak and strong duality.As a first indication of this, let us show how the Galois correspondence leads to a necessary and sufficient condition for weak duality.Proposition 3.5.For E a relation on Ω and G a subset of F with Ω ∈ G, the pair (E, G) satisfies weak duality if and only if E is measurable and E ⊆ G * and (equivalently, or) G ⊆ E * .
Proof. First suppose that (E, G) satisfies weak duality. By definition E is measurable, so we only need to show G ⊆ E* and E ⊆ G*. These are equivalent by part (ii) of Lemma 3.2, so it suffices to show G ⊆ E*. Indeed, take any (ω, ω′) ∈ E, and note that by setting P := δ_ω and P′ := δ_ω′, we have \[ \sup_{A\in\mathcal{G}} |\delta_{\omega}(A)-\delta_{\omega'}(A)| \le 1-(\delta_{\omega}\otimes\delta_{\omega'})(E) = 0. \] This says that for any A ∈ G we have ω ∈ A if and only if ω′ ∈ A. In other words, G ⊆ E*.
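Conversely, suppose that E is measurable and that G ⊆ E* and E ⊆ G*. Note that E ⊆ G* implies that for any A ∈ G we have \[ E\cap(A\times\Omega) = E\cap(\Omega\times A). \] Thus, for any P, P′ ∈ P(Ω, F) and P̃ ∈ Γ(P, P′), we can bound: \[ |\mathbb{P}(A)-\mathbb{P}'(A)| = \big|\tilde{\mathbb{P}}(A\times\Omega)-\tilde{\mathbb{P}}(\Omega\times A)\big| = \big|\tilde{\mathbb{P}}((A\times\Omega)\setminus E)-\tilde{\mathbb{P}}((\Omega\times A)\setminus E)\big| \le \tilde{\mathbb{P}}((\Omega\times\Omega)\setminus E) = 1-\tilde{\mathbb{P}}(E). \]
Now take the supremum over A ∈ G to get: \[ \sup_{A\in\mathcal{G}} |\mathbb{P}(A)-\mathbb{P}'(A)| \le 1-\tilde{\mathbb{P}}(E). \]
Finally take the infimum over P̃ ∈ Γ(P, P′) to get: \[ \sup_{A\in\mathcal{G}} |\mathbb{P}(A)-\mathbb{P}'(A)| \le \inf_{\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')} \big(1-\tilde{\mathbb{P}}(E)\big). \] This says that (E, G) satisfies weak duality, as desired.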
Corollary 3.6.For any measurable relation E on Ω, the pair (E, E * ) satisfies weak duality.Proof.By part (i) of Lemma 3.2, we have E ⊆ E * * , and we of course also have Ω ∈ E * .Thus, the result follows from Proposition 3.5.
For strong duality, we have the following necessary condition: Proposition 3.7. For E a relation on Ω and G a subset of F with Ω ∈ G, if the pair (E, G) satisfies strong duality, then E is an equivalence relation and (E, E*) satisfies strong duality.
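Proof. For arbitrary ω, ω′ ∈ Ω we consider strong duality applied to P := δ_ω and P′ := δ_ω′. Since δ_ω ⊗ δ_ω′ is the only coupling of these two measures, this gives \[ \sup_{A\in\mathcal{G}} |1_A(\omega)-1_A(\omega')| = 1-1_E(\omega,\omega'). \] From here we see that (ω, ω′) ∈ E if and only if 1_A(ω) = 1_A(ω′) for all A ∈ G, that is, if and only if (ω, ω′) ∈ G*. Therefore, E = G*. Since G* is always an equivalence relation, this immediately implies that E must be an equivalence relation. Moreover, note by part (ii) of Lemma 3.2 that E = G* implies G ⊆ E*. To see that this implies that (E, E*) satisfies strong duality, let P, P′ ∈ P(Ω, F) be arbitrary, and use the strong duality of (E, G) to bound: \[ \sup_{A\in E^{*}} |\mathbb{P}(A)-\mathbb{P}'(A)| \ge \sup_{A\in\mathcal{G}} |\mathbb{P}(A)-\mathbb{P}'(A)| = \min_{\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')} \big(1-\tilde{\mathbb{P}}(E)\big). \] At the same time, Corollary 3.6 gives the bound: \[ \sup_{A\in E^{*}} |\mathbb{P}(A)-\mathbb{P}'(A)| \le \inf_{\tilde{\mathbb{P}}\in\Gamma(\mathbb{P},\mathbb{P}')} \big(1-\tilde{\mathbb{P}}(E)\big). \] Combining these completes the proof.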
Motivated by the previous result, it is reasonable to focus our attention on equivalence relations E and sub-σ-algebras G := E * .More precisely, let us say that a relation E on Ω is strongly dualizable if (E, E * ) satisfies strong duality; further, let us say that E is weakly dualizable if (E, E * ) satisfies weak duality, and that it is pseudo-strongly dualizable if (E, E * ) satisfies pseudo-strong duality.In this terminology, Corollary 3.6 guarantees that every measurable relation is weakly dualizable, and Proposition 3.7 guarantees that a relation is strongly dualizable only if it is a measurable equivalence relation.
To conclude this subsection, let us give an example to highlight the inherent asymmetry between equivalence relations E and sub-σ-algebras G in our duality theory. That is, while we typically think of E-coupling problems as primal and we wish to find a dual G-total variation problem, one may alternatively be interested in treating G-total variation problems as primal and finding a dual E-coupling problem. As we see in the following, this change in perspective is not always possible: Example 3.8. Take Ω := [0, 1] with F its Borel σ-algebra, and let G be the σ-algebra of all subsets of Ω which are countable or whose complements are countable. Now suppose that E is an equivalence relation on Ω such that the pair (E, G) satisfies strong duality. By Proposition 3.7, we have G = E*. Since G separates points, we also have G* = ∆. Thus, by Lemma 3. Since this contradicts the assumption that (E, G) satisfies strong duality, there can be no equivalence relation E on Ω for which the pair (E, G) satisfies strong duality.

3.3. Pseudo-Strong Duality.
In this brief subsection we study pseudo-strong duality as it relates to strong duality.We begin with a simple but useful auxiliary result.
Lemma 3.9.For any P, P ′ ∈ P(Ω, F ), if M ∈ Γ s (P, P ′ ) is a sub-coupling of P and P ′ , then there exists a coupling P ∈ Γ(P, P ′ ) with M ≤ P.
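Proof. Write γ := 1 − M̃(Ω × Ω) ∈ [0, 1] for the mass missing from the sub-coupling M̃, and note that the non-negative measures P − M̃ ∘ π^{-1} and P′ − M̃ ∘ (π′)^{-1} both have total mass γ.
If γ = 0, we take P̃ := M̃. To see that it is a probability measure, compute \[ \tilde{\mathbb{P}}(\Omega\times\Omega) = \tilde{\mathbb{M}}(\Omega\times\Omega) = 1-\gamma = 1. \] To see that it has the correct marginals, observe that P̃ ∘ π^{-1} ≤ P with P̃ a probability measure implies that P̃ ∘ π^{-1} = P; likewise for P̃ ∘ (π′)^{-1}.
If γ > 0, we take \[ \tilde{\mathbb{P}} := \tilde{\mathbb{M}} + \frac{1}{\gamma}\big(\mathbb{P}-\tilde{\mathbb{M}}\circ\pi^{-1}\big)\otimes\big(\mathbb{P}'-\tilde{\mathbb{M}}\circ(\pi')^{-1}\big). \] To see that it is a probability measure, compute \[ \tilde{\mathbb{P}}(\Omega\times\Omega) = (1-\gamma) + \frac{1}{\gamma}\cdot\gamma\cdot\gamma = 1. \] To see that it has the correct marginals, compute \[ \tilde{\mathbb{P}}\circ\pi^{-1} = \tilde{\mathbb{M}}\circ\pi^{-1} + \frac{1}{\gamma}\big(\mathbb{P}-\tilde{\mathbb{M}}\circ\pi^{-1}\big)\cdot\gamma = \mathbb{P}, \] and likewise for P̃ ∘ (π′)^{-1}.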
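The completion step of Lemma 3.9 can be sketched on a finite space. The sketch below assumes the standard recipe of adding to the sub-coupling the product of the two marginal deficits, normalized by the missing mass γ; the function name `complete_subcoupling`, the dictionary encoding of measures, and the toy data are illustrative assumptions. Exact rational arithmetic is used so that the marginal checks are not affected by rounding.

```python
from fractions import Fraction as Fr

def complete_subcoupling(M, P, Q):
    """Extend a sub-coupling M of (P, Q) to a coupling that dominates M.

    M is a dict {(w, v): mass}; P and Q are dicts {point: mass} summing to 1.
    """
    # Marginals of M and the mass gamma that M is missing.
    M1 = {w: sum(m for (a, _), m in M.items() if a == w) for w in P}
    M2 = {v: sum(m for (_, b), m in M.items() if b == v) for v in Q}
    gamma = 1 - sum(M.values())
    if gamma == 0:
        return dict(M)  # M is already a coupling
    # Spread the missing mass as a normalized product of the marginal deficits.
    return {(w, v): M.get((w, v), 0) + (P[w] - M1[w]) * (Q[v] - M2[v]) / gamma
            for w in P for v in Q}

P = {"a": Fr(1, 2), "b": Fr(1, 2)}
Q = {"a": Fr(1, 4), "b": Fr(3, 4)}
M = {("a", "a"): Fr(1, 4), ("b", "b"): Fr(1, 2)}   # diagonal sub-coupling
coupling = complete_subcoupling(M, P, Q)
```

Here the sub-coupling is missing mass γ = 1/4, all of which ends up on the off-diagonal pair ("a", "b"), while the mass that M already placed on the diagonal is untouched.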
From this we get the following.
Proposition 3.10.An equivalence relation is strongly dualizable if and only if it is measurable and pseudo-strongly dualizable.
Proof. Let E be an equivalence relation on Ω and write G := E*. It is clear that (E, G) satisfying strong duality implies that it satisfies pseudo-strong duality. Conversely, suppose that E is measurable and that (E, G) satisfies pseudo-strong duality. To see that (E, G) satisfies strong duality, let us show that an (E, G)-optimal coupling exists for all P, P′ ∈ P(Ω, F). To do this, consider µ := P|_{E*} and µ′ := P′|_{E*} as probability measures on the measurable space (Ω, E*), and set ν := µ ∧ µ′. (Note that ν is not in general equal to (P ∧ P′)|_{E*}.) Next, write K : bF → bE* for the conditional expectation operator of P with respect to E*, and K′ : bF → bE* for the conditional expectation operator of P′ with respect to E*. Finally, define sub-probability measures M, M′ ∈ P_s(Ω, F) via M(A) := ∫_Ω K(1_A) dν and M′(A) := ∫_Ω K′(1_A) dν for all A ∈ F. By construction we have M(Ω) = M′(Ω) = ν(Ω) and M(A) = M′(A) for all A ∈ E*. Thus, by applying pseudo-strong duality (to a suitable rescaling of M and M′), we know that there must exist some M̃ ∈ Γ(M, M′) and some N ∈ F ⊗ F with M̃(N) = 0 and (Ω × Ω) ∖ E ⊆ N. Since E is measurable, this implies \[ \tilde{\mathbb{M}}(E) = \tilde{\mathbb{M}}(\Omega\times\Omega) = \nu(\Omega). \] Moreover, since ν ≤ µ, for all A ∈ F we have \[ \tilde{\mathbb{M}}\circ\pi^{-1}(A) = \mathbb{M}(A) = \int_{\Omega} K(1_A)\,d\nu \le \int_{\Omega} K(1_A)\,d\mu = \mathbb{P}(A), \] hence M̃ ∘ π^{-1} ≤ P. Likewise we have M̃ ∘ (π′)^{-1} ≤ P′. This gives M̃ ∈ Γ_s(P, P′), so we can apply Lemma 3.9 to get some P̃ ∈ Γ(P, P′) with M̃ ≤ P̃.
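Finally, let P, N ∈ E* denote the positive and negative sets of a Hahn decomposition of the signed measure µ − µ′ on (Ω, E*). It is well known [18, Corollary 2.9] that we have \[ \nu(A) = \mu(A\cap N) + \mu'(A\cap P) \] for all A ∈ E*, and thus \[ \nu(\Omega) = 1-\big(\mu(P)-\mu'(P)\big) = 1-\sup_{A\in E^{*}} |\mathbb{P}(A)-\mathbb{P}'(A)|. \] Combining this all, we have shown \[ 1-\tilde{\mathbb{P}}(E) \le 1-\tilde{\mathbb{M}}(E) = 1-\nu(\Omega) = \sup_{A\in E^{*}} |\mathbb{P}(A)-\mathbb{P}'(A)|. \] Since Corollary 3.6 establishes the reverse inequality, we have shown that P̃ is (E, G)-optimal, hence that (E, G) satisfies strong duality.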
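The identity relating the total mass of ν = µ ∧ µ′ to the dual value can be checked by hand on a finite space, where E* consists of all unions of equivalence classes and restricting a law to E* amounts to recording the mass of each class. The following sketch does this for a hypothetical four-point example; the names `classes`, `class_masses`, `dual_value`, and `nu_mass`, as well as the two laws, are illustrative assumptions.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Hypothetical toy data: four points split into two equivalence classes.
classes = [("a", "b"), ("c", "d")]
P = {"a": Fr(1, 2), "b": Fr(1, 4), "c": Fr(1, 8), "d": Fr(1, 8)}
Q = {"a": Fr(1, 4), "b": Fr(1, 4), "c": Fr(1, 4), "d": Fr(1, 4)}

def class_masses(law):
    """Restriction of a law to E*: on a finite space, the mass of each class."""
    return [sum(law[w] for w in c) for c in classes]

mu, mu_prime = class_masses(P), class_masses(Q)

# Dual value: brute-force supremum over E*, i.e. over all unions of classes.
index_sets = [s for r in range(len(classes) + 1)
              for s in combinations(range(len(classes)), r)]
dual_value = max(abs(sum(mu[i] - mu_prime[i] for i in s)) for s in index_sets)

# Total mass of nu = mu ^ mu', computed atom by atom on (Omega, E*).
nu_mass = sum(min(m, m_prime) for m, m_prime in zip(mu, mu_prime))
```

In this example the class masses are (3/4, 1/4) under the first law and (1/2, 1/2) under the second, so the dual value is 1/4 and ν has total mass 3/4, which is the mass that an optimal coupling places on E.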
The value of Proposition 3.10 is twofold. First, it allows us to simplify the proofs of some later results by demonstrating pseudo-strong duality in place of strong duality. Second, it allows us to immediately upgrade many results from the existing literature, which primarily come in the form of pseudo-strong duality, into statements of strong duality. For example, it is known (see [13, Proposition 3.1] and the remarks thereafter) that if G is a countable group acting measurably on a standard Borel space (Ω, F), then the orbit equivalence relation is measurable, the invariant σ-algebra The result of [13] in fact guarantees pseudo-strong duality for a more general collection of semigroups acting measurably on a standard Borel space, and thus we get an analogous statement of strong duality for free. However, the precise sufficient conditions of [13] are rather cumbersome to state, so we omit them for brevity's sake.

3.4. Strong Duality.
In this subsection, we finally devote our attention to strong duality.In particular, we state and prove our main sufficient conditions for strong dualizability for measurable equivalence relations.
The first is a simple condition that covers many cases of interest.Recall that an equivalence relation E is called basic if E ∈ E * ⊗ E * , and note that every basic equivalence relation is measurable.Theorem 3.12.A basic equivalence relation is strongly dualizable.Proof.By Proposition 3.10, it suffices to show that every basic equivalence relation E is pseudo-strongly dualizable.Further, note that Corollary 3.6 guarantees that (ii) implies (i) in the definition of pseudo-strong duality, so it only remains to show that (i) implies (ii).That is, for any P, P ′ ∈ P(Ω, F ) satisfying P(A) = P ′ (A) for all A ∈ E * , we must construct an E-successful coupling.
To do this, define ν := P| E * as a probability measure on (Ω, E * ), and note by assumption that we also have ν = P ′ | E * .Next define K : bF → bE * to be the conditional expectation operator of P with respect to E * , and K ′ : bF → bE * to be the conditional expectation operator of P ′ with respect to E * .From these define the map K : bF , and note that this can be extended by the monotone class theorem to a map K : First, we claim that P is a coupling of P and P ′ .Indeed, for any A ∈ F : This finishes the proof.
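The construction in this proof can be illustrated on a finite space, where conditional expectation with respect to E* is averaging over equivalence classes. The sketch below assumes one natural reading of the operator built from K and K′: the coupling draws the two coordinates independently within a common class, so that a pair (ω, ω′) in a class c receives mass ν(c) P(ω | c) P′(ω′ | c). The function name `conditional_product` and the toy laws are illustrative assumptions.

```python
from fractions import Fraction as Fr

# Hypothetical toy data: P and Q give the same mass to each equivalence class.
classes = [("a", "b"), ("c",)]
P = {"a": Fr(1, 2), "b": Fr(1, 4), "c": Fr(1, 4)}
Q = {"a": Fr(1, 4), "b": Fr(1, 2), "c": Fr(1, 4)}

def conditional_product(P, Q, classes):
    """Couple P and Q by sampling independently within a shared class."""
    coupling = {}
    for c in classes:
        nu_c = sum(P[w] for w in c)   # common mass of the class under P and Q
        assert nu_c == sum(Q[w] for w in c), "laws must agree on E*"
        if nu_c == 0:
            continue
        for w in c:
            for v in c:
                # nu(c) * P(w | c) * Q(v | c)
                coupling[(w, v)] = P[w] * Q[v] / nu_c
    return coupling

coupling = conditional_product(P, Q, classes)
```

Every pair that receives mass lies in a single class, so the coupling is E-successful by construction, and summing out either coordinate recovers the corresponding law.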
In light of the previous result, it is useful to develop some concrete sufficient conditions for basicity. Adopting terminology from descriptive set theory, let us say that an equivalence relation E on a Polish space Ω with F its Borel σ-algebra is smooth if there exists a standard Borel space (X, X) and some measurable map φ : (Ω, F) → (X, X) such that ω, ω′ ∈ Ω have (ω, ω′) ∈ E if and only if φ(ω) = φ(ω′). Roughly speaking, a smooth equivalence relation is one that is "generated by" equality of Polish-space-valued random variables. Note by [34, Exercise 5.1.10] that an equivalence relation E on a Polish space is smooth if and only if the σ-algebra E* is countably generated. Then we have the following. Lemma 3.13. On a Polish space, every smooth equivalence relation is basic.
One may be tempted to think that Theorem 3.12, combined with Lemma 3.13 and Lemma 3.14, is powerful enough to establish strong dualizability in most cases of interest in probability theory. However, many equivalence relations of central importance to probabilists are not basic. For example, we have the following: Example 3.15. Consider the space of binary sequences Ω_0 := {0, 1}^N with F := B(Ω_0) the Borel σ-algebra of its product topology, and write an arbitrary element of Ω_0 as ω = {ω_n}_{n∈N}. If we define the equivalence relation of eventual equality via \[ E_0 := \big\{(\omega,\omega')\in\Omega_0\times\Omega_0 : \omega_n = \omega'_n \text{ for all sufficiently large } n\in\mathbb{N}\big\} \] and the tail σ-algebra via \[ \mathcal{T} := \bigcap_{n\in\mathbb{N}} \sigma(\omega_n,\omega_{n+1},\ldots), \] then it is straightforward to check that we have E_0* = T. Since E_0 is measurable and it is known that (E_0, T) satisfies pseudo-strong duality [14], we conclude by Proposition 3.10 that (E_0, T) satisfies strong duality. In particular, E_0 is strongly dualizable. However, we will show that E_0 is not basic. To show this, we will construct a Borel probability measure P̃ on Ω_0 × Ω_0 such that we have both P̃(Ã) ∈ {0, 1} for all Ã ∈ T ⊗ T and P̃(E_0) ∈ (0, 1).
The preceding example shows that basicity is not necessary for strong duality, and that basicity does not provide a wide enough sufficient condition to guarantee strong duality for important equivalence relations like E 0 .Thus, our goal is to obtain even wider sufficient conditions for strong duality to hold.Towards filling this gap, we establish our next main result: Theorem 3.16.A countable increasing union of strongly dualizable equivalence relations is strongly dualizable.
Proof. Suppose that (Ω, F) is an abstract measurable space, that {E_n}_{n∈N} is a nondecreasing sequence of strongly dualizable equivalence relations on Ω, and write E := ⋃_{n∈N} E_n. Now take any P, P′ ∈ P(Ω, F) and let us construct an E-optimal coupling.
To begin, set P 0 := P and P ′ 0 := P ′ .Then, inductively for n ∈ N, use the strong dualizability of E n to let Pn be an E n -optimal coupling of P n and P ′ n , and set To ensure that this construction is well-defined, we must verify that P n and P ′ n are, for each n ∈ N, sub-probability measures with the same total mass.Indeed, we claim by induction on n ∈ N that Pm (E m ).
The base case n = 0 holds because P_0 and P′_0 are probability measures, and the inductive step for n ∈ N follows from combining We claim that the sum in the last line above is equal to zero. In fact, we claim that all summands are equal to zero, in that for all m = 0, 1, . . ., n we have Since by Tychonoff we know that {0, 1}^Ω is compact in its product topology, we can get some subsequence {n_j}_{j∈N} and some set A_∞ ⊆ Ω such that 1_{A_{n_j}} → 1_{A_∞} pointwise. Of course, A_{n_j} ∈ F for each j ∈ N implies A_∞ ∈ F. Moreover, suppose that (ω, ω′) ∈ E have ω ∈ A_∞. Then we have both (ω, ω′) ∈ E_{n_j} and 1_{A_{n_j}}(ω) = 1 for sufficiently large j ∈ N. For such j ∈ N, it follows from A_{n_j} ∈ E*_{n_j} that ω′ ∈ A_{n_j}. In particular we get 1_{A_∞}(ω′) = lim_{j→∞} 1_{A_{n_j}}(ω′) = 1. This gives ω′ ∈ A_∞, hence A_∞ ∈ E*. Combining this all, we have shown which is evidently a sub-probability measure on (Ω × Ω, F ⊗ F).
Next, we claim that M̃ is a sub-coupling of P and P′. Indeed, for any A ∈ F and n ∈ N: Thus, taking n → ∞ we get as claimed. We also get M • π −1 ≤ P by the same calculation. Lastly, we apply Lemma 3.9 to get some P̃ ∈ Γ(P, P′) with M̃ ≤ P̃. It only remains to show that P̃ is optimal. To see this, note that for all n ∈ N: P̃_m(E_m ∩ E_{n+1}) = (1 − P̃(E)), it follows that P̃ is E-optimal. Thus, the result is proved.
3.5. Future Work. The main question remaining at the end of this work concerns the generality of strong duality. Quite embarrassingly, we were unable to come up with a single example of a measurable equivalence relation which is not strongly dualizable. However, we believe that it should be possible to cook one up by adapting the example given in [31, Section 4]. Since that example is constructed in a rather pathological setting, we are motivated to form the following rather strong conjecture: Conjecture 3.17. On a standard Borel space, all Borel equivalence relations are strongly dualizable.
While we are unable to verify the conjecture in its full generality, we believe that the results so far provide some evidence towards the affirmative.For instance, Theorem 3.12, Lemma 3.13, and Theorem 3.16 together establish that all hypersmooth Borel equivalence relations are strongly dualizable.Additionally, we have the following: Theorem 3.18.On a standard Borel space, all countable Borel equivalence relations are strongly dualizable.
Proof.The Feldman-Moore theorem [21,Theorem 2.3] states that, for any countable Borel equivalence relation E on a standard Borel space (Ω, F ), there exists a countable group G acting measurably on Ω such that E = E G .Therefore, Theorem 3.11 guarantees that E is strongly dualizable.

Lemma 3.2. For E a relation on Ω and G a subset of F, we have (i) E** ⊇ E and G** ⊇ G, and (ii) E ⊆ G* if and only if G ⊆ E*.

In other words, the correspondence given by * is an antitone Galois correspondence. As is the case in many antitone Galois correspondences, it is instructive to understand the operation arising by applying the correspondence twice. (In general, the resulting operation is not the identity.) In what follows, we write E** := (E*)* and G** := (G*)* for relations E on Ω and subsets G of F.

Lemma 3.3. For E a relation on Ω we have E** = E if and only if E is an equivalence relation.

Proof. First, suppose that E is an equivalence relation, and let us show that E** = E. Since we have E ⊆ E** by Lemma 3.2, it suffices to show E** ⊆ E. Indeed, take any (ω, ω′) ∈ E** and recall that for any A ∈ E* we have ω ∈ A if and only if ω′ ∈ A. Now combine E ∈ F ⊗ F with Fubini's theorem to get [ω]_E := {ω′′ ∈ Ω : (ω, ω′′) ∈ E} ∈ F, then use the symmetry and transitivity of E to get [ω]_E ∈ E*. By the reflexivity of E we have

3, we have E = E** = G* = ∆. Then, let λ denote the Lebesgue measure on (Ω, F), and define probability measures P and P′ via dP/dλ(x) = 2x and dP′/dλ(x) = 2(1 − x) for x ∈ Ω. Finally, some easy calculations give sup A∈G |P(A) − P′(A)

and the pair (E_G, I_G) satisfies pseudo-strong duality. Thus we conclude: Theorem 3.11. If G is a countable group acting measurably on a standard Borel space, then (E_G, I_G) satisfies strong duality.