Characterization of exchangeable measure-valued Pólya urn sequences

Measure-valued Pólya urn sequences (MVPS) are a generalization of the observation processes generated by $k$-color Pólya urn models, where the space of colors $\mathbb{X}$ is a complete separable metric space and the urn composition is a finite measure on $\mathbb{X}$, in which case reinforcement reduces to a summation of measures. In this paper, we prove a representation theorem for the reinforcement measures $R$ of all exchangeable MVPSs, which leads to a characterization result for their directing random measures $\tilde{P}$. In particular, when $\mathbb{X}$ is countable or $R$ is dominated by the initial distribution $\nu$, any exchangeable MVPS is a Dirichlet process mixture model over a family of probability distributions with disjoint supports. Furthermore, for all exchangeable MVPSs, the predictive distributions converge on a set of probability one in total variation to $\tilde{P}$. A final result shows that $\tilde{P}$ can be decomposed into an absolutely continuous and a mutually singular measure with respect to $\nu$, whose support is universal and does not depend on the particular instance of $\tilde{P}$.


Introduction
The classical two-color Pólya urn model, which describes the evolution of an urn reinforced with one ball of the observed color, has a fundamental role in the predictive construction of prior distributions for Bayesian inference [7]. Suppose we have an urn that initially contains $w_0, w_1 > 0$ balls of colors 0 and 1, respectively. Let us denote by $X_n \in \{0, 1\}$ the color of the sampled ball at step $n$. Then $X_1 = 1$ with probability $\frac{w_1}{w_0 + w_1}$ and, for every $n = 1, 2, \ldots$,
$$\mathbb{P}(X_{n+1} = 1 \mid X_1, \ldots, X_n) = \frac{w_1 + \sum_{i=1}^{n} X_i}{w_0 + w_1 + n} \quad \text{a.s.}, \tag{1.1}$$
which is the proportion of 1-balls in the urn at time $n$. It is well known that the process $(X_n)_{n\geq 1}$ is exchangeable, that is, its law is invariant under finite permutations of the indices, and satisfies, as $n \to \infty$,
$$\frac{1}{n}\sum_{i=1}^{n} X_i \xrightarrow{\text{a.s.}} \tilde{p},$$
where $\tilde{p}$ has a Beta distribution with parameters $(w_1, w_0)$, see, e.g., [16, p.8]; therefore, by de Finetti's representation theorem for exchangeable sequences [1, Theorem 3.1], given $\tilde{p}$, the $X_n$ are conditionally independent and identically distributed with probability of "success" $\tilde{p}$. The converse is also true: any exchangeable sequence of Bernoulli random variables with a Beta prior distribution has the predictive structure (1.1).
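The exchangeability claim for the two-color urn can be checked by direct computation. The following is a minimal sketch, assuming integer initial counts and the predictive rule (1.1); the function names `sequence_probability` and `is_permutation_invariant` are illustrative and not part of the paper. It multiplies the one-step predictive probabilities exactly and compares the result across all reorderings of a color sequence.

```python
from fractions import Fraction
from itertools import permutations


def sequence_probability(colors, w0, w1):
    """Exact probability of a 0/1 color sequence under the predictive rule (1.1).

    w0, w1 are the (integer) initial numbers of 0-balls and 1-balls.
    """
    prob = Fraction(1)
    ones = 0  # number of 1-balls observed so far
    for n, x in enumerate(colors):
        # Proportion of 1-balls in the urn after n draws.
        p_one = Fraction(w1 + ones, w0 + w1 + n)
        prob *= p_one if x == 1 else 1 - p_one
        ones += x
    return prob


def is_permutation_invariant(colors, w0, w1):
    """Check that every reordering of `colors` has the same probability."""
    base = sequence_probability(colors, w0, w1)
    return all(
        sequence_probability(perm, w0, w1) == base
        for perm in permutations(colors)
    )
```

Because the computation uses exact rational arithmetic, equality of the probabilities is tested without rounding error; the check covers only the finite sequences supplied and is not a proof.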
There has been much interest in generalizing the construction (1.1) to obtain greater model flexibility while retaining tractability of the process dynamics; see [7,13,16] and references therein. Various extensions of the Pólya urn scheme consider time-dependent, randomized, or generalized reinforcement mechanisms where colors other than those observed are added to the urn. More recently, [2,10,14] have proposed an extended class of measure-valued Pólya urn processes that are very general in nature, yet retain characteristic features of urn processes and contain, as a special case, most $k$-color urn models. The idea is to consider the urn composition as a finite measure $\mu$ on the space of colors $\mathbb{X}$, in the sense that, for any measurable set $B \subseteq \mathbb{X}$, the quantity $\mu(B)$ records the total mass of balls in the urn whose colors lie in $B$. Reinforcement is then reduced to a summation of measures, so that the updated urn composition, given that a ball with color $x$ has been observed, becomes $\mu + R_x$, where $R$ is a (random) transition kernel on $\mathbb{X}$, i.e., a map $x \mapsto R_x$ from $\mathbb{X}$ to the space of (random) measures on $\mathbb{X}$. Assuming $\mu + R_x$ as the new urn composition, the next draw proceeds in the same way as before, independent of all previous draws. Thus, a sequence of finite measures $(\mu_n)_{n\geq 0}$ is a measure-valued Pólya urn process (MVPP) if it is a Markov process with the aforementioned additive structure for a given finite reinforcement kernel $R$. In this case, Theorem 1 in [9] implies the existence of a companion observation process $(X_n)_{n\geq 1}$ such that $X_1 \sim \frac{\mu_0}{\mu_0(\mathbb{X})} =: \mu'_0$ and, for each $n = 1, 2, \ldots$,
$$\mathbb{P}(X_{n+1} \in \cdot \mid X_1, \mu_1, \ldots, X_n, \mu_n) = \frac{\mu_n(\cdot)}{\mu_n(\mathbb{X})} =: \mu'_n(\cdot).$$
In particular, when the reinforcement $R$ is non-random, the above becomes
$$\mathbb{P}(X_{n+1} \in \cdot \mid X_1, \ldots, X_n) = \frac{\mu_0(\cdot) + \sum_{i=1}^{n} R_{X_i}(\cdot)}{\mu_0(\mathbb{X}) + \sum_{i=1}^{n} R_{X_i}(\mathbb{X})}, \tag{1.2}$$
which corresponds to the normalized urn composition at time $n$ in the urn analogy. We call such a sequence $(X_n)_{n\geq 1}$ a measure-valued Pólya urn sequence (MVPS) to distinguish it from the process $(\mu_n)_{n\geq 0}$.
Analysis of MVPPs has been centered around two very distinct cases. In [9,17], the authors study the so-called "diagonal" model, where only the observed color is reinforced, i.e., $R_x = w(x) \cdot \delta_x$, with $w(x) > 0$ for $x \in \mathbb{X}$, where $\delta_x$ is the unit mass at $x$. It follows under certain conditions that there exists a random probability measure $\tilde{P}$ on $\mathbb{X}$ such that, as $n \to \infty$, (i) the normalized urn composition $\mu'_n$ converges almost surely (a.s.) in total variation to $\tilde{P}$; and (ii) $\tilde{P}$ is concentrated on a subset of so-called "dominant" colors. On the other hand, [2,14,15] consider MVPPs for which there exists an underlying Markov chain with kernel $R$ that satisfies some "irreducibility"-type conditions (see [15, Section 1.2] for more details). Then, under additional assumptions, they prove that $\mu'_n$ converges a.s. weakly to a (deterministic) probability measure on $\mathbb{X}$.
In this work, we consider MVPPs that generate an exchangeable observation sequence $(X_n)_{n\geq 1}$ via (1.2). Standard results in exchangeability theory then guarantee that the predictive distributions of $(X_n)_{n\geq 1}$ converge a.s. weakly to a random probability measure $\tilde{P}$ on $\mathbb{X}$, called the directing random measure, whose distribution determines the law of the exchangeable process; see, e.g., [1, Section 3]. Since the limit in the "irreducible" case is deterministic, the only exchangeable "irreducible" MVPSs are sequences of i.i.d. random variables. Our goal is to characterize all $R$ for which $(X_n)_{n\geq 1}$ is exchangeable, and in the process prove some new facts about $\tilde{P}$ and hence about the distribution of $(X_n)_{n\geq 1}$. This will help us assess the limitations of the whole class of exchangeable MVPSs when used as models for Bayesian analysis.
Important to our understanding are the kernel based Dirichlet sequences studied by Berti et al. [4], which are exchangeable MVPSs whose reinforcement $R$ is a regular version of the conditional distribution for $\mu'_0$ given some sub-σ-algebra $\mathcal{G}$,
$$R_x(\cdot) = \mu'_0(\cdot \mid \mathcal{G})(x) \quad \text{for } \mu'_0\text{-a.e. } x; \tag{1.3}$$
see Section 2 for a rigorous definition. This assumption greatly simplifies the subsequent analysis and allows [4] to obtain a complete characterization of the directing random measure of any kernel based Dirichlet sequence. In [4, p.18], the authors tentatively raise the conjecture that (1.3) holds true for every exchangeable MVPS such that $R_x(\mathbb{X}) = 1$, not excluding the possibility of counterexamples. Our main theorem states that a normalized version of condition (1.3) is indeed true for every member of the class of exchangeable MVPSs, regardless of whether $R_x(\mathbb{X}) = 1$. That such a representation exists is not obvious, and its proof requires the use of techniques that go beyond those typical of the area and involve intensive use of abstract measure-valued objects.
We also give additional results that do not make use of (1.3). In particular, we prove that when $(X_n)_{n\geq 1}$ is not i.i.d., then $R_x(\mathbb{X})$ is constant for almost every $x$, which implies that a "diagonal" MVPS will be exchangeable only if $w(x)$ is constant; see also Example 2.2. In addition, we show that, for fixed $x$, the reinforcement $R_x$ is either absolutely continuous or mutually singular with respect to $\nu$. On the other hand, using a refined version of (1.3), we provide a complete description of all possible exchangeable $k$-color urn models with positive time-homogeneous reinforcement, which to our knowledge has not been done before.
The rest of the paper is structured as follows. In Section 2, we provide notation, state some general facts about exchangeable sequences, and formally define our model. Results and examples are given in Section 3, with proofs postponed to Section 4. We state our representation theorem for general $R_x$ in Section 3.1, while in Section 3.2 we study the case when $R_x$ is dominated by $\mu'_0$, which includes $k$-color urns, and in Section 3.3 the case when $R_x$ and $\mu'_0$ are mutually singular. A final section concludes the paper.
2 The model

Preliminaries
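Let $(\Omega, \mathcal{H}, \mathbb{P})$ be a probability space, $\mathbb{X}$ a complete separable metric space, and $\mathcal{X}$ the associated Borel σ-algebra on $\mathbb{X}$. Standard results imply that $\mathcal{X}$ is countably generated. A transition kernel on $\mathbb{X}$ is a function $R : \mathbb{X} \times \mathcal{X} \to \mathbb{R}_+$ such that (i) the map $x \mapsto R(x, B) \equiv R_x(B)$ is $\mathcal{X}$-measurable, for all $B \in \mathcal{X}$; and (ii) $R_x(\cdot)$ is a measure on $\mathbb{X}$, for all $x \in \mathbb{X}$. Equivalently, $R$ can be represented as a measurable function from $\mathbb{X}$ to the space of measures on $\mathbb{X}$. In addition, a transition kernel $R$ is said to be finite if $R_x(\mathbb{X}) < \infty$, for all $x \in \mathbb{X}$, non-null if $R_x(\mathbb{X}) > 0$, for all $x \in \mathbb{X}$, and is called a probability kernel if $R_x(\mathbb{X}) = 1$, for all $x \in \mathbb{X}$. A random (probability) measure is a transition (probability) kernel $\tilde{P} : \Omega \times \mathcal{X} \to \mathbb{R}_+$ from $\Omega$ to $\mathbb{X}$.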
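Let $\nu$ be a probability measure on $\mathbb{X}$, $R$ a transition probability kernel on $\mathbb{X}$, and $\mathcal{G} \subseteq \mathcal{X}$ a sub-σ-algebra on $\mathbb{X}$. Then $R$ is said to be a regular version of the conditional distribution (r.c.d.) for $\nu$ given $\mathcal{G}$, which for emphasis we will denote by
$$R_x(\cdot) = \nu(\cdot \mid \mathcal{G})(x) \quad \text{for } \nu\text{-a.e. } x,$$
if (i) $x \mapsto R_x(B)$ is $\mathcal{G}$-measurable, for all $B \in \mathcal{X}$; and (ii) $\int_A R_x(B)\,\nu(dx) = \nu(A \cap B)$, for all $A \in \mathcal{G}$ and $B \in \mathcal{X}$. It follows from the assumptions on $(\mathbb{X}, \mathcal{X})$ that an r.c.d. for $\nu$ given $\mathcal{G}$ exists and is unique up to a $\nu$-null set. Let $X$ be an $\mathbb{X}$-valued random variable with marginal distribution $X \sim P_X$. Depending on the context, we work with conditional distributions of the type $\mathbb{P}(\cdot \mid X)$ or $\mathbb{P}(\cdot \mid X = x)$, which are related by $\int_{\{X \in A\}} \mathbb{P}(B \mid X)(\omega)\,\mathbb{P}(d\omega) = \int_A \mathbb{P}(B \mid X = x)\,P_X(dx)$, for all $A \in \mathcal{X}$ and $B \in \mathcal{H}$.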
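Recall that an $\mathbb{X}$-valued sequence of random variables $(X_n)_{n\geq 1}$ is exchangeable if
$$(X_1, \ldots, X_n) \overset{d}{=} (X_{\sigma(1)}, \ldots, X_{\sigma(n)}),$$
for every $n \geq 2$ and all permutations $\sigma$ of $\{1, \ldots, n\}$. In that case (see [1, Section 3]), there exists a random probability measure $\tilde{P}$ on $\mathbb{X}$, called the directing random measure of the sequence, such that, given $\tilde{P}$, the random variables $X_1, X_2, \ldots$ are conditionally independent and identically distributed (i.i.d.) with marginal distribution $\tilde{P}$,
$$X_n \mid \tilde{P} \overset{\text{i.i.d.}}{\sim} \tilde{P}, \qquad \tilde{P} \sim Q,$$
where $Q$ is the (nonparametric) prior distribution of $\tilde{P}$. Moreover, for every $A \in \mathcal{X}$,
$$\mathbb{P}(X_{n+1} \in A \mid X_1, \ldots, X_n) \xrightarrow{\text{a.s.}} \tilde{P}(A). \tag{2.1}$$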
Since we take a predictive approach to model building, the following result from [8], which provides necessary and sufficient conditions for the system of predictive distributions of any stochastic process to be consistent with exchangeability, becomes our starting point of analysis.

Exchangeable MVPS
We will call any sequence $(X_n)_{n\geq 1}$ of $\mathbb{X}$-valued random variables on $(\Omega, \mathcal{H}, \mathbb{P})$ a measure-valued Pólya urn sequence with parameters $\theta$, $\nu$ and $R$, denoted MVPS$(\theta, \nu, R)$, if $X_1 \sim \nu$ and, for each $n = 1, 2, \ldots$,
$$\mathbb{P}(X_{n+1} \in \cdot \mid X_1, \ldots, X_n) = \frac{\theta\nu(\cdot) + \sum_{i=1}^{n} R_{X_i}(\cdot)}{\theta + \sum_{i=1}^{n} R_{X_i}(\mathbb{X})}, \tag{2.4}$$
where $R$ is a non-null finite transition kernel on $\mathbb{X}$, called the replacement/reinforcement kernel of the process, $\nu$ is a probability measure on $\mathbb{X}$, known as the base measure, and $\theta > 0$ is a positive constant. We will say that the MVPS is balanced if there is, in addition, some $m > 0$ such that $R_x(\mathbb{X}) = m$ for $\nu$-a.e. $x$, which in the urn analogy means that we add the same total number of balls each time. Moreover, unlike (1.2), we decompose the initial urn composition into two parameters, $\theta$ and $\nu$, each one having a separate effect on the process (see, e.g., the construction in (2.7)).
Consider an exchangeable MVPS$(\theta, \nu, R)$ with directing random measure $\tilde{P}$. It follows from (2.1) and (2.4) that the reinforcement $R$ directly influences the form of $\tilde{P}$. On the other hand, by Theorem 2.1, $R$ itself must satisfy certain conditions that make it admissible under exchangeability. Note, however, that equation (2.3) is always true for MVPSs. Hence, information about $R$ can only be retrieved from the invariance property of the two-step-ahead predictive distributions in (2.2). In fact, [4, Theorem 7] show in their study of kernel based Dirichlet sequences that, for a balanced MVPS to be exchangeable, it is sufficient that equation (2.2) holds for $n = 0, 1$. Remarkably, when the MVPS is unbalanced, then (2.2) for $n = 2$ is also needed (see the proof of Theorem 3.2). Using this information, we show that, when normalized, $R$ must be an r.c.d. for $\nu$ given some sub-σ-algebra $\mathcal{G}$ of $\mathcal{X}$,
$$\frac{R_x(\cdot)}{R_x(\mathbb{X})} = \nu(\cdot \mid \mathcal{G})(x) \quad \text{for } \nu\text{-a.e. } x, \tag{2.5}$$
and in the process we prove that exchangeable MVPSs are necessarily balanced unless i.i.d. It turns out that the representation (2.5) has major consequences for the shape and distribution of $\tilde{P}$, including the fact that the convergence in (2.1) is in total variation. We give one important example of an exchangeable MVPS next.
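Example 2.2 (Pólya sequence). Let $(X_n)_{n\geq 1}$ be an MVPS with reinforcement kernel
$$R_x = \delta_x, \tag{2.6}$$
i.e., we only reinforce the observed color with one additional ball. This process, also known as a Pólya sequence, was first studied by Blackwell and MacQueen [6] as an extension of the Pólya urn scheme (1.1) to general Polish spaces $\mathbb{X}$. In [6], the authors prove that $(X_n)_{n\geq 1}$ is exchangeable and its directing random measure $\tilde{P}$ has a Dirichlet process distribution with parameters $(\theta, \nu)$, denoted DP$(\theta, \nu)$. We recall that $\tilde{P} \sim$ DP$(\theta, \nu)$ if, for every finite partition $B_1, \ldots, B_n \in \mathcal{X}$ of $\mathbb{X}$, the random vector $(\tilde{P}(B_1), \ldots, \tilde{P}(B_n))$ has a Dirichlet distribution with parameters $(\theta\nu(B_1), \ldots, \theta\nu(B_n))$. An equivalent statement (see, e.g., [12, p.112]) is that $\tilde{P}$ is equal in law to
$$\sum_{j\geq 1} V_j\,\delta_{Z_j}(\cdot), \tag{2.7}$$
where $(V_j)_{j\geq 1}$ and $(Z_j)_{j\geq 1}$ are independent, $V_1 = W_1$ and $V_j = W_j \prod_{i=1}^{j-1}(1 - W_i)$ for $j \geq 2$, with $W_1, W_2, \ldots$ i.i.d. Beta$(1, \theta)$, and $Z_1, Z_2, \ldots$ i.i.d. with distribution $\nu$. In this case, the reinforcement (2.6) has the anticipated structure (2.5), since $\delta_x(\cdot) = \nu(\cdot \mid \mathcal{X})(x)$.

3 Main results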

General case
We begin this section with a simple characterization result for i.i.d. MVPSs.
The next theorem states that MVPSs that are exchangeable, but not i.i.d., are necessarily balanced. This fact is essential since it implies that the predictive distributions (2.4) are a linear combination of reinforcement kernels and leads via (2.2) to certain identities for $R$.
x and some constant $m > 0$. It then follows from the particular form of the predictive distributions (2.4) that we can equivalently reformulate the process as an MVPS with parameters $(\tilde{\theta}, \nu, \tilde{R})$, where $\tilde{\theta} = \frac{\theta}{m}$ and $\tilde{R}_x = \frac{R_x}{m}$ with $\tilde{R}_x(\mathbb{X}) = 1$, for $\nu$-a.e. $x$. On the other hand, it is not hard to see that any i.i.d. MVPS$(\theta, \nu, R)$ with unbalanced $R$ is also an i.i.d. MVPS with parameters $(\theta, \nu, \nu)$. Thus, when we consider i.i.d. MVPSs below, we will implicitly assume that they are balanced. The above remarks are combined in the next corollary.
We now consider the issue of representing $R$ as in (2.5). For completeness, we first state the converse result, which is found in [4].
Theorem 3.5 (Theorem 7 in [4]). Any balanced MVPS$(\theta, \nu, R)$ whose reinforcement kernel $R$ is, up to normalization, an r.c.d. for $\nu$ given some sub-σ-algebra $\mathcal{G}$ of $\mathcal{X}$, is exchangeable.
Remark 3.6. Although Berti et al. [4] focus only on the balanced case, it follows from Theorem 3.2 that no unbalanced MVPS, having replacement kernel as in Theorem 3.5, can be exchangeable unless $\mathcal{G} = \{\emptyset, \mathbb{X}\}$, in which case the observation sequence is i.i.d.
Our main result studies the necessity of such a representation under exchangeability. In addition, it states that we may take $\mathcal{G}$ to be countably generated (c.g.) under $\nu$, in the sense that there exists $C \in \mathcal{G}$ such that $\nu(C) = 1$ and $\mathcal{G} \cap C$ is c.g. In fact (see [3, p.649] and [5]), $\mathcal{G}$ is c.g. under $\nu$ if and only if there exists a regular version Moreover, we can find a sub-σ-algebra $\mathcal{G}$ of $\mathcal{X}$ such that (3.2) holds and $\mathcal{G}$ is c.g. under $\nu$.
and thus exchangeable by Theorem 3.5, then there exists by Theorem 3.7 a sub-σ-algebra $\mathcal{G}'$ which is c.g. under $\nu$ and ν(
A major consequence of Theorems 3.2 and 3.7 is that one can now characterize the directing random measure $\tilde{P}$ of any exchangeable MVPS using results developed in [4] under the stronger assumptions that $R$ is balanced and of the form (3.2). In that case, Berti et al. [4, Theorem 10] show that the predictive distributions (2.4) converge almost surely in total variation to $\tilde{P}$. By virtue of Theorems 3.2 and 3.7, the following result, which is based on Theorems 9 and 10 in [4], is now true for the entire class of exchangeable MVPSs.
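Theorem 3.9. Let $(X_n)_{n\geq 1}$ be an exchangeable MVPS$(\theta, \nu, R)$ with directing random measure $\tilde{P}$. Then, as $n \to \infty$,
$$\sup_{A \in \mathcal{X}} \big|\mathbb{P}(X_{n+1} \in A \mid X_1, \ldots, X_n) - \tilde{P}(A)\big| \xrightarrow{\text{a.s.}} 0.$$
Moreover, $\tilde{P}$ is equal in law to
$$\sum_{j\geq 1} V_j\,\frac{R_{Z_j}(\cdot)}{m},$$
where $(V_j)_{j\geq 1}$ and $(Z_j)_{j\geq 1}$ are as in (2.7), using as parameters $(\frac{\theta}{m}, \nu)$, and $m > 0$ is some constant such that $m = R_x(\mathbb{X})$ for $\nu$-a.e. $x$.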
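The weights $(V_j)_{j\geq 1}$ in Theorem 3.9 can be simulated numerically. The following is a minimal sketch, assuming that (2.7) is the standard stick-breaking construction of a Dirichlet process, in which each stick fraction is an independent Beta$(1, \theta)$ draw; the function name `stick_breaking_weights`, the truncation level `num_sticks`, and the seed are illustrative choices, not taken from the paper. For Theorem 3.9 one would pass the concentration parameter $\theta/m$.

```python
import random


def stick_breaking_weights(theta, num_sticks, seed=0):
    """Truncated stick-breaking weights V_1, ..., V_num_sticks.

    Assumes the standard construction: W_j ~ Beta(1, theta) i.i.d. and
    V_j = W_j * prod_{i<j} (1 - W_i). Returns the weights together with
    the mass of the stick that is still unbroken after truncation.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(num_sticks):
        w = rng.betavariate(1.0, theta)
        weights.append(remaining * w)
        remaining *= 1.0 - w
    return weights, remaining
```

The returned weights and the leftover mass sum to one up to floating-point error, so the leftover mass gives a direct measure of the truncation error.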

Absolutely continuous case
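In this section, we consider MVPSs such that, for $\nu$-a.e. $x$, the replacement $R_x$ is absolutely continuous with respect to the base measure $\nu$, denoted $R_x \ll \nu$. Special cases include $k$-color urn models and MVPSs with discrete $\nu$. The next theorem states that, under the assumption $R_x \ll \nu$, the σ-algebra $\mathcal{G}$ in (3.2) is c.g. under $\nu$ w.r.t. a countable partition.
Theorem 3.10. Let $(X_n)_{n\geq 1}$ be an exchangeable MVPS$(\theta, \nu, R)$. Then $R_x \ll \nu$ for $\nu$-a.e. $x$ if and only if there exists a countable partition $D_1, D_2, \ldots \in \mathcal{X}$ of $\mathbb{X}$ such that
$$\frac{R_x(\cdot)}{R_x(\mathbb{X})} = \sum_{k} \nu(\cdot \mid D_k)\,\mathbb{1}_{D_k}(x), \tag{3.3}$$
for $\nu$-a.e. $x$.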
Note that the existence of the countable partition in (3.3) does not depend on the enumerability of the color space X or on the form of the base measure ν.The implications of Theorem 3.10 are illustrated in the next Example 3.11, which describes k-color urn models with positive time-homogeneous reinforcement.

Example 3.11 (k-color urn). When the space of colors is finite, $\mathbb{X} = \{x_1, \ldots, x_k\}$, the base measure takes the form $\nu(\cdot) = \sum_{i=1}^{k} p_i\,\delta_{x_i}(\cdot)$, where $p_i \geq 0$ represents the initial fraction of balls with color $x_i$ in the urn. Furthermore, it is customary to state $R$ in terms of a reinforcement matrix $[a_{ij}]_{1\leq i,j\leq k}$, where $a_{ij} \geq 0$ denotes the number of balls of color $x_j$ that will be added to the urn after color $x_i$ has been observed.
If the observation process $(X_n)_{n\geq 1}$ is exchangeable, then we have from (2.4) that
$$p_j\,\frac{\theta p_i + a_{ji}}{\theta + \sum_{l=1}^{k} a_{jl}} = \mathbb{P}(X_1 = x_j, X_2 = x_i) = \mathbb{P}(X_1 = x_i, X_2 = x_j) = p_i\,\frac{\theta p_j + a_{ij}}{\theta + \sum_{l=1}^{k} a_{il}};$$
thus, if $p_i = 0$ (i.e., there are initially no $x_i$ balls in the urn), then $a_{ji} = 0$ for all $j$ with $p_j > 0$, so color $x_i$ never appears in the urn. The same argument can be repeated for any color that is not initially present in the urn, so we may assume $p_i > 0$ for every $i$, without loss of generality. In that case, $R_x \ll \nu$ and $(X_n)_{n\geq 1}$ satisfies the conditions of Theorem 3.10. Assuming $R_x(\mathbb{X}) = 1$, it follows from the particular form (3.3) of $R$ that the reinforcement matrix of an exchangeable urn scheme with a finite number of colors has a block-diagonal design, i.e., $\mathbb{X}$ can be partitioned into subsets of colors $D_1, \ldots, D_l$, where a color reinforces and is reinforced only by the colors belonging to the same subset. Furthermore, the rows in each block are identical and equal to the conditional probability of $\nu$ given that a color from the same block has been observed. In other words, each $D_j$ can be viewed as representing different nuances of the same color, so that the reinforcement (i) will be identical for, say, light or dark red; and (ii) will be equal to the initial number of balls for each nuance of red, normalized by the total number of red balls.
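The block-diagonal design described above can be explored numerically for small urns. The following is a minimal sketch, assuming a finite color space indexed $0, \ldots, k-1$, integer initial weights, and the predictive rule in which the next color is drawn from the normalized urn composition $\theta\nu + \sum_i R_{X_i}$; the function names `block_kernel`, `path_probability` and `is_exchangeable` are illustrative and not part of the paper. It builds the normalized reinforcement matrix from a partition of the colors and checks, in exact arithmetic, whether the probability of every short color path is invariant under permutations.

```python
from fractions import Fraction
from itertools import permutations, product


def block_kernel(weights, blocks):
    """Normalized reinforcement matrix with a block-diagonal design.

    Row i equals the base measure conditioned on the block containing
    color i; `weights` are integer initial ball counts and `blocks` is a
    partition of the color indices.
    """
    k = len(weights)
    matrix = [[Fraction(0)] * k for _ in range(k)]
    for block in blocks:
        total = sum(weights[j] for j in block)
        for i in block:
            for j in block:
                matrix[i][j] = Fraction(weights[j], total)
    return matrix


def path_probability(path, theta, nu, matrix):
    """Exact probability of a color path under the assumed predictive rule."""
    urn = [theta * p for p in nu]  # unnormalized composition theta * nu
    prob = Fraction(1)
    for x in path:
        prob *= urn[x] / sum(urn)
        # Reinforce the urn with the row of the observed color.
        urn = [u + r for u, r in zip(urn, matrix[x])]
    return prob


def is_exchangeable(theta, nu, matrix, length=3):
    """Check permutation invariance of all paths of the given length."""
    k = len(nu)
    return all(
        path_probability(perm, theta, nu, matrix)
        == path_probability(path, theta, nu, matrix)
        for path in product(range(k), repeat=length)
        for perm in permutations(path)
    )
```

For instance, with weights `[1, 2, 3]` and blocks `[[0, 1], [2]]` the check passes for paths of length three, whereas a "diagonal" matrix with non-constant diagonal entries fails it. The check is only a finite-horizon test of the necessary symmetry and does not replace the theorems above.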
To illustrate the above points, let $\mathbb{X} = \{x_1, x_2, x_3\}$ and suppose that the urn has initial composition $(w_1, w_2, w_3)$, with $w_1, w_2, w_3 > 0$, so that $\nu(\cdot) = \sum_{i=1}^{3} \frac{w_i}{w}\,\delta_{x_i}(\cdot)$, where $w = \sum_{i=1}^{3} w_i$. Put $\bar{w}_i = \sum_{j \neq i} w_j$, $i = 1, 2, 3$. Then any MVPS $(X_n)_{n\geq 1}$ on $\mathbb{X}$ with initial composition $(w_1, w_2, w_3)$ is exchangeable if and only if its associated reinforcement matrix $R$ has one of the following forms, up to some multiplicative constants $m, m_1, m_2, m_3 > 0$,
Example 3.12 (Discrete base measure). The previous example can be extended to MVPSs on more general spaces $\mathbb{X}$ when the base measure $\nu$ is discrete, which is the case, in particular, when $\mathbb{X}$ is countable. In this case, $\nu(\cdot) = \sum_{x \in \mathbb{X}_0} p(x)\,\delta_x(\cdot)$ for some countable subset $\mathbb{X}_0 \subseteq \mathbb{X}$ and positive weights $p(x) > 0$ such that $\sum_{x \in \mathbb{X}_0} p(x) = 1$. If $(X_n)_{n\geq 1}$ is one such process, then it satisfies, by virtue of (2.4), thus, $R_x(\mathbb{X}_0^c) = 0$, and so $R_x \ll \nu$ for every $x \in \mathbb{X}$. Theorem 3.10 then implies the existence of a countable partition on $\mathbb{X}$ such that $\frac{R_x}{R_x(\mathbb{X})}$ is an r.c.d. for $\nu$ given the σ-algebra generated by said partition.
The form of the replacement kernel in (3.3) has further implications on the directing random measure $\tilde{P}$ of the exchangeable process $(X_n)_{n\geq 1}$. In particular, it follows from (2.1) that, for every $A \in \mathcal{X}$, on a set of probability one, thus, as measures, $\tilde{P}(\cdot)$ as $\mathcal{X}$ is countably generated. The next theorem is a direct consequence of this fact and the distribution results in Theorem 3.9 (see also Example 15 in [4] and Section 4.1.2 in [2]).
The directing random measure (3.4) implies that the observations $X_1, X_2, \ldots$ will form clusters on the level of the sets $D_1, D_2, \ldots$. By Theorem 3.13, any exchangeable MVPS with absolutely continuous reinforcement is a Dirichlet process mixture model.

Singular case
In general, we have the decomposition of the reinforcement
$$R_x = R_x^{\perp} + R_x^{a},$$
for some finite measures $R_x^{\perp}$ and $R_x^{a}$ on $\mathbb{X}$ such that $R_x^{\perp}$ and $\nu$ are mutually singular, denoted by $R_x^{\perp} \perp \nu$, and $R_x^{a} \ll \nu$. The next theorem shows that, for fixed $x$, the replacement $R_x$ of any exchangeable MVPS is either mutually singular or absolutely continuous with respect to $\nu$. In addition, the support of any mutually singular component, $R_x^{\perp}$, does not intersect the support of any absolutely continuous one, $R_y^{a}$, implying a complete separation of the two regimes.
Theorem 3.14. Let $(X_n)_{n\geq 1}$ be an exchangeable MVPS$(\theta, \nu, R)$. Then there exists a set $S \in \mathcal{X}$ such that $R_x = R_x^{\perp}$ and $R_x^{\perp}(S^c) = 0$, for $\nu$-a.e. $x$ in $S$, and $R_x = R_x^{a}$ and $R_x^{a}(S) = 0$, for $\nu$-a.e. $x$ in $S^c$.
Remark 3.15. It follows from Theorem 3.14 that whenever $(X_n)_{n\geq 1}$ is not i.i.d., then for $\nu$-a.e. $x$, the support of $R_x$, say $S_x \in \mathcal{X}$ such that $R_x(S_x) = 1$, has a $\nu$-measure strictly smaller than one, $\nu(S_x) < 1$. Indeed, this is obvious if we have an $S \in \mathcal{X}$ as in Theorem 3.14 such that $0 < \nu(S) < 1$. If instead $\nu(S) = 0$, then $R_x \ll \nu$ for $\nu$-a.e. $x$ and $R_x$ has a "block-diagonal" design by Theorem 3.10 with at least two disjoint blocks. On the other hand, if $\nu(S) = 1$, then $R_x \perp \nu$ and there exists $S_x \in \mathcal{X}$ such that $R_x(S_x) = 1$ and $\nu(S_x) = 0$, for $\nu$-a.e. $x$.
The following simple example demonstrates one such case.
Let $A, B \in \mathcal{X}$ and $x \in \mathbb{X}$. It is an easy exercise to check that Therefore, by [4, Theorem 7], the balanced MVPS $(X_n)_{n\geq 1}$ with parameters $(\theta, \nu, R)$ is exchangeable.
Then G is a σ-algebra.
) and ν(A c ∩ S c ) = 0, from where ν(A c ∩ B ∩ S c ) = 0, and so On the other hand, for all t ∈ [0, 1] and B ∈ X , the set {x ∈ X : R x (B) ≤ t} is equal to one of B c ∩ S, for ν-a.e.x.
The decomposition implied in Theorem 3.14 can be used to analyze separately the singular and the absolutely continuous parts of $R$. Concentrating on $R_x^{\perp}$, let us assume that $R_x \perp \nu$ for all $x \in \mathbb{X}$. By Theorem 3.2, such an MVPS is necessarily balanced, say $R_x(\mathbb{X}) = 1$ for $\nu$-a.e. $x$. Moreover, $\nu$ has to be diffuse; else, if $\nu(\{z\}) > 0$ for some $z \in \mathbb{X}$, then $R_x(\{z\}) = 0$, and using (2.4) together with $X_2 \sim \nu$,
$$\nu(\{z\}) = \int_{\mathbb{X}} \frac{\theta\nu(\{z\}) + R_x(\{z\})}{\theta + 1}\,\nu(dx) = \frac{\theta}{\theta + 1}\,\nu(\{z\}) < \nu(\{z\}),$$
absurd. We show in Proposition 3.17 below that if $R_x$ is further discrete, then $x$ is an atom of $R_x$ for $\nu$-a.e. $x$. Examples include the Pólya sequence from Example 2.2 with diffuse $\nu$, and Example 2 in [4] where the authors consider an MVPS with $R_x = \frac{1}{2}(\delta_x + \delta_{-x})$. In addition, Proposition 3.17 states for discrete $R$ that either $R_x = R_y$ or $R_x \perp R_y$, which is reminiscent of the "block-diagonal" design in Section 3.2.
Proposition 3.17. Let $(X_n)_{n\geq 1}$ be an exchangeable MVPS$(\theta, \nu, R)$. If $R_x \perp \nu$ and $R_x$ is discrete, then $R_x(\{x\}) > 0$ for $\nu$-a.e. $x$. Moreover, for $\nu$-a.e. $x$ and $y$, either $R_x = R_y$ or $R_x \perp R_y$.
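The singular example $R_x = \frac{1}{2}(\delta_x + \delta_{-x})$ mentioned above is easy to simulate. The following is a minimal sketch, assuming $\mathbb{X} = \mathbb{R}$, a standard normal base measure $\nu$ (an illustrative choice of a diffuse symmetric law, not taken from the paper), and the predictive rule in which, after $n$ observations, a fresh value is drawn from $\nu$ with probability $\theta/(\theta + n)$ and otherwise a past observation is picked uniformly and returned with a random sign; the function name `sample_symmetric_mvps` is hypothetical.

```python
import random


def sample_symmetric_mvps(theta, n, seed=0):
    """Simulate n draws from an MVPS with R_x = (delta_x + delta_{-x}) / 2.

    Returns the observations and the list of fresh values drawn from the
    base measure (assumed here to be standard normal).
    """
    rng = random.Random(seed)
    draws, fresh = [], []
    for i in range(n):
        # After i observations the urn has total mass theta + i, of which
        # theta belongs to the base measure.
        if rng.random() < theta / (theta + i):
            x = rng.gauss(0.0, 1.0)
            fresh.append(x)
        else:
            past = rng.choice(draws)
            # R_past puts mass 1/2 on past and 1/2 on -past.
            x = past if rng.random() < 0.5 else -past
        draws.append(x)
    return draws, fresh
```

Every observation equals a fresh value up to sign, so the draws cluster on the two-point sets $\{z, -z\}$, in line with the dichotomy $R_x = R_y$ or $R_x \perp R_y$ in Proposition 3.17.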
Proof of Theorem 3.2. Let $(X_n)_{n\geq 1}$ be an exchangeable MVPS$(\theta, \nu, R)$. Define $f(x) := R_x(\mathbb{X})$ and $H := \{(x, y) \in \mathbb{X}^2 : f(x) = f(y)\}$, and let $H_x$ be the $x$-section of $H$. It follows from exchangeability and the form of the predictive distributions (2.4) that, for every $A, B \in \mathcal{X}$,
$$\int_A \frac{\theta\nu(B) + R_x(B)}{\theta + f(x)}\,\nu(dx) = \int_B \frac{\theta\nu(A) + R_x(A)}{\theta + f(x)}\,\nu(dx). \tag{4.1}$$
We proceed by making some important observations from (4.1). Let $C \in \mathcal{X}$ be such that $\nu(C) = 0$. Applying (4.1) to $C \times \mathbb{X}$, we get $\nu(C) = \int_{\mathbb{X}} \frac{R_x(C)}{\theta + f(x)}\,\nu(dx)$, so $\frac{R_x(C)}{\theta + f(x)} = 0$ for $\nu$-a.e. $x$, and hence On the other hand, (4.1) implies the equivalence of the measures In particular, restricting the above measures to $H$ where $f(x) = f(y)$, we obtain
$$\mathbb{1}_H(x, y)\,R_x(dy)\,\nu(dx) = \mathbb{1}_H(x, y)\,R_y(dx)\,\nu(dy). \tag{4.5}$$
We now consider the implications that the invariance of the two-step-ahead predictive distributions has on $R$. Let $A, B \in \mathcal{X}$. Since $X_1 \sim \nu$, it follows from (2.2) in Theorem 2.4 that, for $\nu$-a.e. $x$, Since $\mathcal{X}$ is countably generated, we may combine the non-null sets and get, for $\nu$-a.e. $x$, the equivalence of the measures θν(dz Arguing as in (4.4), we get for $\nu$-a.e. $x$ θ which after simplification becomes θ 2 f (z)ν(dy)ν(dz) + R y (dz)ν(dy) Using (4.4) w.r.t. $(y, z)$, we may collect $\theta f(z)\,R_y(dz)\,\nu(dy)$ and the two terms with $\theta^2$ on the left-hand side and cancel them with the sum of $\theta f(y)\,R_z(dy)\,\nu(dy)$ and the two terms with $\theta^2$ on the right-hand side to obtain thus, restricting the above measures to $H$ where $f(y) = f(z)$,
$$\mathbb{1}_H(y, z)\,\theta f(x)\,R_y(dz)\,\nu(dy) + \mathbb{1}_H(y, z)\big(\theta + f(x) + f(y)\big)R_y(dz)\,R_x(dy) = \mathbb{1}_H(y, z)\,\theta f(x)\,R_z(dy)\,\nu(dz) + \mathbb{1}_H(y, z)\big(\theta + f(x) + f(y)\big)R_z(dy)\,R_x(dz).$$
Summing this with (4.11), we get that, for $\nu$-a.e. $x$, $R_x(dy) = f(x)\,\nu(dy)$, which implies from Proposition 3.1 that $(X_n)_{n\geq 1}$ is i.i.d. If instead $\nu(D^*) = 0$, then $\nu(H_x^c) = 0$ for $\nu$-a.e. $x$. It follows for some fixed $x'$ with $\nu(H_{x'}) = 1$ that implying that the model is balanced. This concludes the proof of the theorem.
Remark 4.1. The proof of Theorem 3.2 largely remains the same even if we assume $R_x(\mathbb{X}) = 0$ for all $x$ in some $Z \in \mathcal{X}$ such that $0 < \nu(Z) < 1$. The only place of concern is in "dividing" (4.13) by $R_x(\mathbb{X})$, so by restricting (4.13) on $Z^c$ we get ) > 0}, we obtain under $\nu(D^* \cap Z^c) > 0$ that $(X_n)_{n\geq 1}$ is i.i.d., using the same logic as above. If, on the other hand, $\nu(D^* \cap Z^c) = 0$, then $\nu((D^*)^c) = \nu(Z^c)$, since $(D^*)^c \subseteq Z^c$, and so thus showing that $R_x(\mathbb{X}) = m > 0$ for $\nu$-a.e. $x$ in $Z^c$ and some $m > 0$, when $(X_n)_{n\geq 1}$ is not i.i.d.
As $\mathcal{X}$ is countably generated, there exists $C_0 \in \mathcal{X}$ such that $\nu(C_0) = 1$ and, for all It is not hard to see that $\mathcal{G}$ is a σ-algebra. Next, fix $t \in [0, 1]$ and $B \in \mathcal{X}$. Denote e. y by (4.18), and so R Let $A \in \mathcal{G}$ and $B \in \mathcal{X}$. It follows from (4.15) and ν( Therefore, $R$ is a regular version of $\nu(\cdot \mid \mathcal{G})$. Moreover, by construction, thus, $R$ is a.e. proper and $\mathcal{G}$ is c.g. under $\nu$ (see [3, p.649] and [5, Theorem 1]).
Proof of Theorem 3.14. If $(X_n)_{n\geq 1}$ is i.i.d., then Proposition 3.1 implies that $R_x = R_x^{a}$, for $\nu$-a.e. $x$. Suppose that $(X_n)_{n\geq 1}$ is exchangeable, but not i.i.d. Assume, without loss of generality, that $R_x(\mathbb{X}) = 1$ for $x \in \mathbb{X}$. By Lebesgue decomposition (Theorem 1 in [11]), $R_x = R_x^{\perp} + R_x^{a}$ for some finite transition kernels $R^{\perp}$ and $R^{a}$ on $\mathbb{X}$ such that $R_x^{\perp} \perp \nu$ for some $S_x \in \mathcal{X}$ with $\nu(S_x) = 0 = R_x^{\perp}(S_x^c)$, and $R_x^{a} \ll \nu$, for $x \in \mathbb{X}$. Moreover, Fix one such $x$. Since $\nu(S_x) = 0$, then $R_y^{a}(S_x) = 0$ for all $y \in \mathbb{X}$, and integrating (4.32) on $z \in S_x$ and $y \in \mathbb{X}$, we get where we have used in the last equality that $R_x^{\perp}(S_x^c) = 0$. On the other hand, from (4.30), Moreover, $1 = R_y^{\perp}(S_x) \leq R_y^{\perp}(\mathbb{X}) = R_y(S_y)$, so $R_y^{a}(\mathbb{X}) = 0$, by (4.29), and $R_y^{a}(dz)\,R_x^{\perp}(dy) = 0$ for $\nu$-a.e. $x$, which implies the splitting of the measures
$$R_y(dz)\,R_x(dy)\,\nu(dx) = R_y^{\perp}(dz)\,R_x^{\perp}(dy)\,\nu(dx) + R_y(dz)\,R_x^{a}(dy)\,\nu(dx).$$
Let us define $S := \{x \in \mathbb{X} : R_x^{\perp}(\mathbb{X}) = 1\}$, which is $\mathcal{X}$-measurable since $R^{\perp}$ is a transition kernel. By (4.29), we have $R_x(S_x) = R_x^{\perp}(\mathbb{X}) = 1$, for $x \in S$. Therefore, using that $R_x(S_x) \in \{0, 1\}$ for $\nu$-a.e. $x$, we get for $\nu$-a.e. $x$ that: if $x \in S$, then $R_x = R_x^{\perp}$; if $x \in S^c$, then $R_x = R_x^{a}$. Moreover, (4.29) and (4.34) imply for $\nu$-a.e. $x$ that $y \in S$ for $R_x^{\perp}$-a.e. $y$; in other words, $R_x^{\perp}(S^c) = 0$ for $\nu$-a.e. $x$. Finally, using (4.15) and the fact that $R_x^{a} = 0$ for $\nu$-a.e. $x \in S$ and $R_x^{\perp} = 0$ for $\nu$-a.e. $x \in S^c$, we get
Proof of Proposition 3.17. Since $R_x \perp \nu$, then $(X_n)_{n\geq 1}$ is not i.i.d., and hence balanced by Theorem 3.2, so from Corollary 3.4 we may assume that $R_x(\mathbb{X}) = 1$, without loss of generality. Let us define, for $x \in \mathbb{X}$,

Discussion
The results in this paper allow us to state some universal facts about exchangeable MVPSs that were previously unknown: 1) all models that are not i.i.d. are necessarily balanced; 2) the normalized reinforcement kernels are a.e. proper regular conditional distributions; 3) the prior distributions have the stick-breaking construction of a Dirichlet process. When $\mathbb{X}$ is countable or the reinforcement is dominated by $\nu$, then every exchangeable MVPS is a Dirichlet process mixture model over a family of probability distributions with disjoint supports derived from $\nu$. Relaxing parts of this structure while retaining exchangeability would lead to a very different sampling scheme from the one that (1.2) entails. On the other hand, the fact that, for fixed $x$, $R_x$ is either absolutely continuous or mutually singular with respect to $\nu$, though surprising at first, seems very natural under exchangeability. Therefore, we expect that $R$ can be further decomposed along other lines, e.g., $R_x^{\perp}$ might be discrete only for some $x$ and diffuse for the rest.
Note that the first matrix corresponds to the case of an i.i.d. sequence of random variables with marginal distribution $\nu$, and the last one to the three-color Pólya urn model with a Dirichlet prior distribution with parameters $(\frac{w_1}{m}, \frac{w_2}{m}, \frac{w_3}{m})$.
follows from (4.31), and for (b) we have used again that $\nu(S_x) = 0$ and $R_y^{a}(S_x) = 0$ for all $y \in \mathbb{X}$. Plugging this into (4.33), we get $\int_{\mathbb{X}} R_y^{\perp}(S_x)\,R_x^{\perp}(dy) = R_x^{\perp}(\mathbb{X})$ for $\nu$-a.e. $x$, and thus
$$R_y^{\perp}(S_x) = 1 \quad \text{for } R_x^{\perp}\text{-a.e. } y. \tag{4.34}$$

$D_x := \{y \in \mathbb{X} : R_x(\{y\}) > 0\}$ and $D := \{x \in \mathbb{X} : x \in D_x\}$. Then $R_x(\cdot) = \sum_{y \in D_x} p_x(y)\,\delta_y(\cdot)$ for some $p_x(y) > 0$ such that $\sum_{y \in D_x} p_x(y) = 1$. It follows from (4.18) and the form of $R_x$ that there exists $C \in \mathcal{X}$ such that $\nu(C) = 1$ and, for all $x \in C$, $R_x(dz) = R_y(dz)$ for $y \in D_x$; thus, $D_x = D_y$ for all $y \in D_x$, which implies that $y \in D_y$, i.e., $y \in D$. Therefore, by (4.15),
$$\nu(D) = \int_{\mathbb{X}} R_x(D)\,\nu(dx) = \int_C \sum_{y \in D_x} p_x(y)\,\delta_y(D)\,\nu(dx) = \int_C \sum_{y \in D_x} p_x(y)\,\nu(dx) = 1.$$
For the second part, let $x, y \in C \cap D$ be such that $y \notin D_x$. Assume $D_x \cap D_y \neq \emptyset$. Take $z \in D_x \cap D_y$. It follows from above that $D_x = D_z = D_y$. But $y \in D$, so $y \in D_y = D_x$, absurd. As a result, for $\nu$-a.e. $x$ and $y$, either $D_x = D_y$ or $D_x \cap D_y = \emptyset$.
Proof of Theorem 3.7. If $(X_n)_{n\geq 1}$ is i.i.d., then Proposition 3.1 implies (3.2) with $\mathcal{G} = \{\emptyset, \mathbb{X}\}$. Suppose that $(X_n)_{n\geq 1}$ is exchangeable, but not i.i.d. It follows from Theorem 3.2 that the model is balanced, so we may assume, without loss of generality, that $R_x(\mathbb{X}) = 1$, for $x \in \mathbb{X}$; otherwise, we can normalize $\theta$ and $R$ (see Remark 3.3) and proceed from there. In this case, (4.5) reduces to
(4.29) Arguing as in (4.19), there exists some measurable function $r : \mathbb{X}^2 \to \mathbb{R}_+$ such that