On the characterization of exchangeable sequences through reverse-martingale empirical distributions

It is a well-known fact that an exchangeable sequence has empirical distributions that form a reverse martingale. This paper is devoted to the proof of the converse statement. As a byproduct of the proof for the binary case, we introduce and discuss the notion of two-coloring exchangeability.


Introduction
The basic idea behind exchangeability is to remove the independence assumption from an independent and identically distributed (iid) sample, while preserving the same marginals and the symmetry properties that make these objects so convenient to deal with. The symmetry property is easily seen to be equivalent to invariance under permutation of indices. Exchangeable distributions have been extensively studied. Bruno de Finetti first showed in 1930, in a special instance [Fin30], that they are equivalent to mixtures of iid samples, and in 1937 extended this to a more general case [Fin37]. Refer to [Ald85] for a good probabilistic introduction to the subject of exchangeability. This kind of symmetry also plays a role in the philosophical foundation of the Bayesian paradigm.
Let $\xi = (\xi_n)_{n \ge 1}$ be a finite or infinite random sequence defined on a probability space $(\Omega, \mathcal{F}, P)$ and taking values in a standard Borel space $(S, \mathcal{S})$. The sequence $\xi$ is said to be exchangeable if for every positive integer $n$, $(\xi_{\pi(1)}, \dots, \xi_{\pi(n)}) \overset{d}{=} (\xi_1, \dots, \xi_n)$ for all $\pi \in S_n$, where $S_n$ denotes the set of permutations of $\{1, \dots, n\}$. The seminal de Finetti Theorem asserts that any infinite exchangeable sequence can be decomposed into a convex combination of iid sequences. Formally speaking, let us denote by $\mathcal{P}(S)$ the set of probability measures on $(S, \mathcal{S})$. Equip $\mathcal{P}(S)$ with the σ-field $\mathcal{A}$ generated by the mappings $\kappa \mapsto \kappa(B)$ from $\mathcal{P}(S)$ to $[0, 1]$, where $B \in \mathcal{S}$. Then de Finetti's Theorem ensures, for any infinite exchangeable sequence $\xi = (\xi_n)_{n=1}^{\infty}$, the existence of a unique probability measure $\nu$ on $(\mathcal{P}(S), \mathcal{A})$ such that $P(\xi \in \cdot) = \int_{\mathcal{P}(S)} \kappa^{\infty} \, \nu(d\kappa)$, where $\kappa^{\infty}$ denotes the infinite product measure with marginal $\kappa$. The probability measure $\nu$ is called the de Finetti measure of $\xi$.
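For a finite binary sequence, the definition of exchangeability can be checked by brute force. The following is a minimal sketch, assuming the law of $(\xi_1, \dots, \xi_n)$ is given as a dictionary from outcomes to exact probabilities. The helper names `is_exchangeable` and `mixture_of_iid` and the two-point mixing measure `nu` are hypothetical choices made for illustration only.

```python
from fractions import Fraction
from itertools import permutations, product


def is_exchangeable(pmf, n):
    """True if the law `pmf` on n-tuples is invariant under every permutation of indices."""
    for x, p in pmf.items():
        for pi in permutations(range(n)):
            if pmf.get(tuple(x[i] for i in pi), 0) != p:
                return False
    return True


def mixture_of_iid(n, nu):
    """Law of n binary variables that are iid Bernoulli(p) given p, where p has law `nu`."""
    return {
        x: sum(w * p ** sum(x) * (1 - p) ** (n - sum(x)) for p, w in nu.items())
        for x in product((0, 1), repeat=n)
    }


# Hypothetical mixing measure: p equals 1/4 or 3/4 with equal probability.
nu = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}
mixed = mixture_of_iid(3, nu)

# A law that is not exchangeable: the first coordinate is always 1, the others always 0.
point_mass = {(1, 0, 0): Fraction(1)}

print(is_exchangeable(mixed, 3))       # True
print(is_exchangeable(point_mass, 3))  # False
```

The mixture is exchangeable because its probabilities depend on an outcome only through the number of ones, which is the finite-dimensional shadow of the mixing representation.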
To simplify matters, let us first consider a finite or infinite real-valued exchangeable sequence $\xi$. Denote by $\mathcal{F}_n$ the σ-field generated by $S_k = \sum_{i=1}^{k} \xi_i$, $k = n, n+1, \dots$. We have by symmetry that $E[\xi_\ell \mid \mathcal{F}_{n+1}]$ is almost surely the same random variable for all $\ell \le n+1$. In particular, we get $E[S_n / n \mid \mathcal{F}_{n+1}] = S_{n+1} / (n+1)$ almost surely, that is, $(S_n / n)_{n \ge 1}$ is a reverse martingale. In fact, one can associate a stronger reverse-martingale property with exchangeable sequences which take values in a standard Borel space $(S, \mathcal{S})$. For any positive integer $n$ denote by $\eta_n = n^{-1} \sum_{i=1}^{n} \delta_{\xi_i}$ the $n$'th empirical distribution of a random (finite or infinite) sequence $\xi$, and let $\mathcal{T}_n$ be the σ-field generated by $\eta_n, \eta_{n+1}, \dots$.
Definition 1.1 (Reverse, measure-valued martingale). The sequence of empirical distributions $\eta := (\eta_n)_{n \ge 1}$ is said to be a reverse, measure-valued martingale if for every bounded and measurable $f : S \to \mathbb{R}$ it holds that $E[\eta_n f \mid \mathcal{T}_{n+1}] = \eta_{n+1} f$ almost surely for every $n$ for which $\eta_{n+1}$ is defined, where $\eta_n f := \int_S f(s) \, \eta_n(ds) = n^{-1} \sum_{i=1}^{n} f(\xi_i)$.
Because for exchangeable sequences $\xi$ the prefix $(\xi_1, \dots, \xi_n)$ remains exchangeable over $\mathcal{T}_n$ for every $n$, similarly to the real-valued case, one can easily derive the following result.
Theorem 1.2. Let $\xi$ be a finite or infinite exchangeable random sequence with empirical distributions $\eta$. Then $\eta$ forms a reverse, measure-valued martingale.
In Theorem 2.4 of [Kal05], Kallenberg states that the condition in Theorem 1.2 is not only necessary, but also sufficient for exchangeability to hold. Formally, Theorem 2.4 in [Kal05] states the following:
Theorem 1.3. Let $\xi$ be a finite or infinite random sequence with empirical distributions $\eta$. Then $\xi$ is exchangeable if and only if the $\eta$ form a reverse, measure-valued martingale.
Kallenberg's proposed proof of sufficiency turned out to be incomplete, and not mendable along the original lines. Thus, the principal aim of the present contribution is to provide the first complete proof of Theorem 1.3, using techniques different from those of Kallenberg. The paper is organized as follows. In Section 2 we give a short proof of the case where $\xi$ is a binary sequence, using a Markov chain approach. This proof gives rise to the notion of a two-coloring exchangeable process, which we introduce and discuss in Subsection 2.2. Section 3 is devoted to the proof of Theorem 1.3 in the general case and builds on a discretization argument coupled with combinatorial arguments. We end the paper with Section 4, in which we give a wider perspective on the main result of the paper. This section, which can be read independently right after this introduction, surveys recent developments in the theory of exchangeable sequences which are closely connected to the current work. In Appendix A we discuss the original proof of Kallenberg and point to its incompleteness.
2 Binary Case and Two-Coloring Exchangeability
In preparation for the proof of Theorem 2.1, we first establish some preliminary lemmas. Let $Y_n := \sum_{i=1}^{n} \xi_i$, $n \ge 1$, be the sequence of partial sums of $\xi$.
Lemma 2.2. $\xi$ is exchangeable if and only if for all $n$ and $y \in \{0, \dots, n\}$, $P(\xi_1 = x_1, \dots, \xi_n = x_n) = \binom{n}{y}^{-1} P(Y_n = y)$ for every $x \in \{0, 1\}^n$ with $x_1 + \dots + x_n = y$. (2.1)
Said differently, Lemma 2.3 asserts that $(\xi_1, \dots, \xi_n)$ is exchangeable if and only if $Y = (Y_1, \dots, Y_n)$ forms a reverse Markov chain with transition probabilities given by (2.2).
Proof of Lemma 2.3. If $\xi$ is exchangeable, this is a simple consequence of (2.1). If $Y$ is a reverse Markov chain with these transition probabilities, multiplying them together and using that $\xi_\ell = Y_\ell - Y_{\ell-1}$ (with $Y_0 := 0$) shows that, given $Y_n = y_n$, the law of $(\xi_1, \dots, \xi_n)$ exactly describes sampling without replacement from an urn with $n$ balls, $y_n$ of which are white.
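The urn argument can be checked numerically. The sketch below assumes that the transition probabilities referred to as (2.2) are those of sampling without replacement, namely that given $Y_\ell = y$ the last draw $\xi_\ell$ equals 1 with probability $y/\ell$. The function name `reverse_path_probability` is hypothetical. Multiplying the transitions along any binary word with $y$ ones then gives $1/\binom{n}{y}$, regardless of the order of the letters.

```python
from fractions import Fraction
from itertools import product
from math import comb


def reverse_path_probability(x):
    """Probability of the binary word x given its total, under the assumed urn transitions."""
    prob = Fraction(1)
    y = sum(x)                          # current value of Y_l, starting at l = n
    for l in range(len(x), 0, -1):
        white = Fraction(y, l)          # assumed P(xi_l = 1 | Y_l = y) = y / l
        if x[l - 1] == 1:
            prob *= white
            y -= 1                      # Y_{l-1} = Y_l - 1
        else:
            prob *= 1 - white           # Y_{l-1} = Y_l
    return prob


n = 5
uniform_given_total = all(
    reverse_path_probability(x) == Fraction(1, comb(n, sum(x)))
    for x in product((0, 1), repeat=n)
)
print(uniform_given_total)  # True
```

Because the product depends on the word only through its number of ones, every rearrangement of the word has the same conditional probability, which is the exchangeability statement.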
Proof of Theorem 2.1. As already discussed in the introduction, we only need to show sufficiency. For this purpose, fix some $n \in \mathbb{N}$, and introduce for each By the definition of $Y$ we have that (2.3) On the other hand, since the $\xi_\ell$'s are binary, the empirical distributions $\eta_\ell$ can be identified with $Y_\ell / \ell$ for each $1 \le \ell \le n$. Hence, by the reverse martingale assumption, for every such $\ell$ we have (2.4) Combining Eqs. (2.3) and (2.4) yields Hence, in view of Lemma 2.3, we obtain that $(\xi_1, \dots, \xi_n)$ is exchangeable, as desired.
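The chain of identities behind this proof can be sketched as follows. The σ-field name $\mathcal{G}_\ell$ and the exact form of each line are assumptions introduced here for illustration. They are consistent with the surrounding text, but they are not guaranteed to coincide with the displays labelled (2.3) and (2.4).

```latex
% Sketch under assumptions: \mathcal{G}_\ell := \sigma(Y_\ell, \dots, Y_n) is a hypothetical
% name for the sigma-field generated by the "future" partial sums, for 1 \le \ell < n.
\begin{align*}
% Y_\ell = Y_{\ell+1} - \xi_{\ell+1} with \xi_{\ell+1} \in \{0, 1\}:
E[Y_\ell \mid \mathcal{G}_{\ell+1}]
  &= Y_{\ell+1} - P(Y_\ell = Y_{\ell+1} - 1 \mid \mathcal{G}_{\ell+1}), \\
% reverse martingale property of \eta_\ell, identified with Y_\ell / \ell; the tower
% property passes from \mathcal{T}_{\ell+1} to the smaller sigma-field \mathcal{G}_{\ell+1}:
E[Y_\ell \mid \mathcal{G}_{\ell+1}]
  &= \frac{\ell}{\ell + 1} \, Y_{\ell+1}, \\
% subtracting the two lines:
P(Y_\ell = Y_{\ell+1} - 1 \mid \mathcal{G}_{\ell+1})
  &= \frac{Y_{\ell+1}}{\ell + 1}.
\end{align*}
```

The last line depends on the future only through $Y_{\ell+1}$, so $Y$ is a reverse Markov chain, and the transition probability is the one obtained when drawing without replacement from an urn.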

Two-Coloring Exchangeability
Let us now introduce the notion of two-coloring exchangeable sequences.
Definition 2.4. A finite or infinite random sequence $\xi = (\xi_n)_{n \ge 1}$ taking values in a standard Borel space $(S, \mathcal{S})$ is said to be two-coloring exchangeable if for every measurable $f : S \to \{0, 1\}$ the binary sequence $(f(\xi_n))_{n \ge 1}$ is exchangeable.
In words, a sequence is two-coloring exchangeable if, for every coloring of $S$ with two colors, the colored sequence is exchangeable.
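On a finite state space the definition can be tested exhaustively, since there are only $2^{|S|}$ colorings. The following is a minimal sketch, assuming the law of a finite sequence is given as a dictionary of exact probabilities. The helper names (`color`, `is_two_coloring_exchangeable`) and both example laws are hypothetical. The sketch only illustrates the definition and says nothing about Question 2.6.

```python
from fractions import Fraction
from itertools import permutations, product


def is_exchangeable(pmf, n):
    """True if the law `pmf` on n-tuples is invariant under every permutation of indices."""
    for x, p in pmf.items():
        for pi in permutations(range(n)):
            if pmf.get(tuple(x[i] for i in pi), 0) != p:
                return False
    return True


def color(pmf, f):
    """Law of (f(xi_1), ..., f(xi_n)) when (xi_1, ..., xi_n) has law `pmf`."""
    out = {}
    for x, p in pmf.items():
        key = tuple(f[s] for s in x)
        out[key] = out.get(key, 0) + p
    return out


def is_two_coloring_exchangeable(pmf, n, states):
    """Check exchangeability of the colored sequence for every f: states -> {0, 1}."""
    for bits in product((0, 1), repeat=len(states)):
        f = dict(zip(states, bits))
        if not is_exchangeable(color(pmf, f), n):
            return False
    return True


states = (1, 2, 3)
# Two draws without replacement from {1, 2, 3}: an exchangeable law.
draws = {(a, b): Fraction(1, 6) for a in states for b in states if a != b}
# A deterministic, non-exchangeable law.
fixed = {(1, 2): Fraction(1)}

print(is_two_coloring_exchangeable(draws, 2, states))  # True
print(is_two_coloring_exchangeable(fixed, 2, states))  # False
```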
Proposition 2.5. If the sequence of empirical distributions $(\eta_n)_{n \ge 1}$ of an underlying random sequence $\xi = (\xi_n)_{n \ge 1}$ taking values in a standard Borel space $S$ is a reverse, measure-valued martingale, then $\xi$ is two-coloring exchangeable.
Proof of Proposition 2.5. Take some measurable $f : S \to \{0, 1\}$ and denote for each $n$ the σ-field Since the empirical distributions $(\eta_n)$ of $(\xi_n)$ determine the sequence of empirical distributions of the sequence $(f(\xi_n))$, we have for every $n \in$ Since $f$ is binary-valued, the above relation establishes that the reverse martingale property holds for the empirical distributions of $(f(\xi_n))_{n \ge 1}$. Thus $(f(\xi_n))$ must be exchangeable by Theorem 2.1.

Question 2.6. Is every finite or infinite two-coloring exchangeable sequence exchangeable?
To the best of our knowledge, Question 2.6 has not yet been introduced in the literature. Although the answer to this question remains an open problem, let us now describe an interesting connection with the well-known marginal problem (e.g., Strassen [Str65]).
Assume that $(\xi_n)_{n=1}^{\infty}$ is an infinite two-coloring exchangeable sequence taking values in the finite set $\{1, \dots, d\}$ for some integer $d \ge 3$. Consider for each $i \in \{1, \dots, d\}$ the binary function $f_i$ which assigns 1 to $i$, and 0 otherwise. For each such $i$, denote by $\mu_i$ the de Finetti measure of the exchangeable sequence $(f_i(\xi_n))_{n=1}^{\infty}$. The measure $\mu_i$ is a probability measure on the unit interval $[0, 1]$. For the sake of the discussion, assume that $(\xi_n)_{n=1}^{\infty}$ is also exchangeable. Then its de Finetti measure, denoted $\mu$, is a probability measure on the $d$-dimensional simplex, denoted $\Delta_d$ and defined by $\Delta_d := \{(p_1, \dots, p_d) \in [0, 1]^d : \sum_{i=1}^{d} p_i = 1\}$. The marginals of the measure $\mu$ (i.e., the projected measures to each coordinate of $\mathbb{R}^d$) are exactly the measures $\mu_i$, $i = 1, \dots, d$. Thus, the measure $\mu$ is a solution to the marginal problem on $\mathbb{R}^d$ with marginals $\mu_i$ and domain $\Delta_d$. Therefore, we obtain that for finite-valued infinite sequences $(\xi_n)_{n=1}^{\infty}$ a necessary condition for a positive answer to Question 2.6 is that the marginal problem on $\mathbb{R}^d$ with marginals $\mu_i$ and domain $\Delta_d$ admits a solution. An equivalent formulation of this necessary condition can be written as follows (e.g., Theorem 3.4 in [Edw78]): where $C([0, 1])$ denotes the space of continuous functions on the unit interval.

General Case
Notice that exchangeability is defined in terms of finite-dimensional distributions.
Proof of Proposition 3.1. We begin with the introduction of notation and definitions. Let $\Gamma$ be a countable alphabet so that $\xi_n \in \Gamma$ for every $n$. For every $n$, the set $\Gamma^n$ corresponds to all words of length $n$ over the alphabet $\Gamma$. Denote by $H := \bigcup_{n \le m} \Gamma^n$ the set of all words of length at most $m$. Let $\mu$ be the probability measure induced on $\Gamma^m$ (equipped with the discrete σ-algebra) by the process $(\xi_1, \dots, \xi_m)$. That is, $\mu(w) = P(\xi_1 = w_1, \dots, \xi_m = w_m)$ for every $w \in \Gamma^m$. For every two words $h \in \Gamma^n$ and $w \in \Gamma^\ell$, $n + \ell \le m$, denote by $hw$ the word obtained from the concatenation of $h$ and $w$, i.e., $hw = (h_1, \dots, h_n, w_1, \dots, w_\ell) \in \Gamma^{n+\ell}$. Consider the set $X \subset (\mathbb{N} \cup \{0\})^{\Gamma}$, defined by $X := \{x = (x_\gamma)_{\gamma \in \Gamma} : \sum_{\gamma \in \Gamma} x_\gamma \le m\}$. The set $X$ can be thought of as the set of counting measures on $\Gamma$ having finite support, whose total measure is at most $m$. For each $x \in X$ let $s(x) = \{\gamma \in \Gamma : x_\gamma > 0\}$ be the support of $x$. Denote for every $\gamma \in \Gamma$ by $e_\gamma$ the sequence in $X$ that assigns 1 to $\gamma$ and 0 to all other letters. For each $x = (x_\gamma)_{\gamma \in \Gamma} \in X$ consider $H[x] = \{h \in H : \gamma \text{ appears } x_\gamma \text{ times in } h, \ \forall \gamma \in \Gamma\}$.
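For a small alphabet the set $H[x]$ can be enumerated directly. The sketch below uses a hypothetical three-letter alphabet and hypothetical helper names. It lists the words with prescribed letter counts and compares their number with the multinomial coefficient $(\sum_\gamma x_\gamma)! / \prod_\gamma x_\gamma!$, the standard count of words with given letter frequencies.

```python
from collections import Counter
from itertools import product
from math import factorial, prod


def words_with_counts(alphabet, x):
    """All words in which each letter g of `alphabet` appears exactly x[g] times."""
    n = sum(x.values())
    target = Counter({g: c for g, c in x.items() if c > 0})
    return [w for w in product(alphabet, repeat=n) if Counter(w) == target]


def multinomial(x):
    """Multinomial coefficient (sum of counts)! / product of (count)!."""
    return factorial(sum(x.values())) // prod(factorial(c) for c in x.values())


alphabet = ("a", "b", "c")
x = {"a": 2, "b": 1, "c": 0}            # a counting measure with support {a, b}
H_x = words_with_counts(alphabet, x)

print(sorted("".join(w) for w in H_x))  # ['aab', 'aba', 'baa']
print(len(H_x) == multinomial(x))       # True
```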
A simple combinatorial argument implies that for every $x = (x_\gamma)_{\gamma \in \Gamma} \in X$, Lastly, for all positive integers $n < \ell < m$ let $\xi_n^\ell := (\xi_n, \xi_{n+1}, \dots, \xi_\ell)$. Our strategy to prove Proposition 3.1 will be to prove the following statement by induction. For every $n \le m$ and permutation $\pi \in S_n$ it holds The base case $n = 1$ follows immediately. Let us show the induction step $n - 1 \Rightarrow n$.
We may now proceed with the proof of Theorem 1.3.
Proof of Theorem 1.3. From the introductory remarks, we may concentrate on proving sufficiency. Thus assume that $\eta$ forms a reverse, measure-valued martingale. By the Borel Isomorphism Theorem (cf. pp. 49-50 in Aldous [Ald85]) it suffices to consider the case $S = \mathbb{R}$, as the extension of the result to any standard Borel space follows immediately. Let $\mu$ be the probability measure induced by $(\xi_1, \dots, \xi_m)$ on $(\mathbb{R}^m, \mathcal{B}(\mathbb{R}^m))$.
For each $d \in \mathbb{N}$, consider the step function $g_d : \mathbb{R} \to \mathbb{R}$ given by be the sequence of empirical measures of the sequence $Y^d$, and denote and the latter is a reverse, measure-valued martingale, we have for every bounded and measurable $f : \mathbb{R} \to \mathbb{R}$ and any positive integer $n < m$ that Thus, the sequence $(\eta_n^d)_{n \ge 1}$ also forms a reverse, measure-valued martingale. Hence, by Proposition 3.1 we obtain that $Y^d$ is exchangeable for any $d \in \mathbb{N}$.
The latter yields that the sequence $(\xi_1, \dots, \xi_m)$ is exchangeable on dyadic cubes. Indeed, for every $d \in \mathbb{N}$ and any permutation $\pi \in S_m$ we get Since $\mu$ is a regular measure, and as any open set in $\mathbb{R}^m$ can be decomposed into a disjoint union of dyadic cubes, we have the following identity: (3.5) Take some $\pi \in S_m$ and define the map $\pi : \mathbb{R}^m \to \mathbb{R}^m$ by $\pi(x_1, \dots, x_m) := (x_{\pi(1)}, \dots, x_{\pi(m)})$. By Eq. (3.4) we have that for each dyadic cube (3.6) Take some events $A_1, \dots, A_m \in \mathcal{B}(\mathbb{R})$ and consider $A_1 \times$ ...
We end the paper with a perspective which, in our view, places the main result of the current paper in a broader context involving random probability measures, martingales, and exchangeability.

Exchangeability, Random Probability Measures and Martingales
Consider a probability space $(\Omega, \mathcal{F}, P)$ and a standard Borel space $(S, \mathcal{S})$. Denote by $\mathcal{P}(S)$ the space of all Borel probability measures on $(S, \mathcal{S})$. Equip $\mathcal{P}(S)$ with the σ-field $\mathcal{A}$ generated by the functions $\kappa \mapsto \kappa(B)$, $\kappa \in \mathcal{P}(S)$, where $B$ ranges over $\mathcal{S}$. A random probability measure (r.p.m.) $\nu$ is a measurable mapping from $(\Omega, \mathcal{F})$ to $(\mathcal{P}(S), \mathcal{A})$. Thus, for each $\omega \in \Omega$, $\nu(\omega)$ is a Borel probability measure on $(S, \mathcal{S})$. Throughout the paper we studied the r.p.m.'s $(\eta_n)_{n \ge 1}$ corresponding to the empirical measures induced by a sequence of random variables $(\xi_n)_{n \ge 1}$ defined on $(\Omega, \mathcal{F}, P)$ and taking values in $S$.
The martingale property of $(p_n)_{n=1}^{\infty}$ turned out to be important in its own right. Berti, Pratelli, and Rigo [BPR04] showed that infinite random sequences $(\xi_n)_{n=1}^{\infty}$ for which $(p_n)$ forms a measure-valued martingale (also known as the c.i.d. condition) possess a rich probabilistic structure, admitting strong laws and central limit type theorems. In particular, they generalized such known results on infinite exchangeable sequences to a wider class, that of c.i.d. sequences. A recent part of the literature on exchangeable sequences is devoted to the proximity of the r.p.m.'s $\eta_n$ and $p_n$ for large values of $n$ (see [BRP17] for a survey of methodologies and new results). To give an example of such a proximity result, assume that $S = \mathbb{R}^d$ for some $d \in \mathbb{N}$, and denote by $\mathcal{B}_d$ the set of closed Euclidean balls in $\mathbb{R}^d$. Berti, Pratelli, and Rigo [BRP18] showed (e.g., Corollary 3 in [BRP18]) that if $(\xi_n)_{n \ge 1}$ is exchangeable then $r_n \sup_{B \in \mathcal{B}_d} |\eta_n(B) - p_n(B)| \to 0$ a.s. whenever $r_n \sqrt{\log \log n / n} \to 0$.

2.1 The Binary Case
Theorem 2.1. Let $\xi$ be a finite or infinite sequence of binary random variables with empirical distributions $\eta_1, \eta_2, \dots$. Then $\xi$ is exchangeable if and only if the $\eta_n$ form a reverse, measure-valued martingale.
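For a finite binary sequence the reverse martingale condition of Theorem 2.1 can be verified exactly. In the binary case $\eta_k$ is determined by $Y_k / k$, so the sketch below checks $E[Y_k / k \mid Y_{k+1}, \dots, Y_n] = Y_{k+1} / (k+1)$ for a law given as a dictionary of exact probabilities. The function name `is_reverse_martingale` and the two example laws are hypothetical choices for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from math import comb


def is_reverse_martingale(pmf, n):
    """Check E[Y_k / k | Y_{k+1}, ..., Y_n] = Y_{k+1} / (k + 1) for all 1 <= k < n."""
    for k in range(1, n):
        mass = defaultdict(Fraction)      # P((Y_{k+1}, ..., Y_n) = future)
        weighted = defaultdict(Fraction)  # E[Y_k ; (Y_{k+1}, ..., Y_n) = future]
        for x, p in pmf.items():
            sums = [sum(x[:j]) for j in range(1, n + 1)]  # Y_1, ..., Y_n
            future = tuple(sums[k:])
            mass[future] += p
            weighted[future] += p * sums[k - 1]
        for future, m in mass.items():
            if m > 0 and weighted[future] / (m * k) != Fraction(future[0], k + 1):
                return False
    return True


n = 4
# Exchangeable law: the total Y_n is uniform on {0, ..., n} and, given the total,
# all arrangements are equally likely.
exchangeable = {
    x: Fraction(1, (n + 1) * comb(n, sum(x))) for x in product((0, 1), repeat=n)
}
# Non-exchangeable law: the sequence is always (1, 0, 0, 0).
fixed = {(1, 0, 0, 0): Fraction(1)}

print(is_reverse_martingale(exchangeable, n))  # True
print(is_reverse_martingale(fixed, n))         # False
```

The second law fails already at $k = 1$: given $Y_2 = 1$, the value of $Y_1$ is 1 rather than $1/2$.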