COUNTABLE REPRESENTATION FOR INFINITE DIMENSIONAL DIFFUSIONS DERIVED FROM THE TWO-PARAMETER POISSON-DIRICHLET PROCESS

This paper provides a countable representation for a class of infinite-dimensional diffusions which extends the infinitely-many-neutral-alleles model and is related to the two-parameter Poisson-Dirichlet process. By means of Gibbs sampling procedures, we define a reversible Moran-type population process. The associated process of ranked relative frequencies of types is shown to converge in distribution to the two-parameter family of diffusions, which is stationary and ergodic with respect to the two-parameter Poisson-Dirichlet distribution. The construction provides an interpretation of the limiting process in terms of individual dynamics.

In this case, the number of distinct values or species in the n-sized vector is bounded above by m, and $\Pi_{m\kappa,-\kappa}$ is the m-dimensional symmetric Dirichlet distribution. If $n_0 = \min\{n \in \mathbb{N} : K_n = m\}$, then for all $n > n_0$ the new samples are just copies of past observations. When σ = 0, (3) reduces to the Blackwell-MacQueen Pólya-urn scheme (see [4]; see also [15]), which generates a sequence of random variables from $\Pi_{\theta,0}$. The Blackwell-MacQueen case is also obtained when (4) holds by letting m go to infinity for fixed θ = mκ.
The two-parameter Poisson-Dirichlet distribution has recently been shown to be the stationary measure of a certain class of diffusion processes taking values in the closure $\overline{\nabla}_\infty$ of the infinite-dimensional ordered simplex

$$\nabla_\infty = \Big\{ z = (z_1, z_2, \ldots) : z_1 \ge z_2 \ge \cdots \ge 0, \ \sum_{i=1}^{\infty} z_i = 1 \Big\}.$$

See [18]. More specifically, a class of infinite-dimensional diffusions with infinitesimal operator $L_{\theta,\sigma}$ given in (6), on an appropriately defined domain, is obtained as the limit of certain Markov chains, defined on the space of partitions of the natural numbers, based on the two-parameter generalisation of the Ewens sampling formula due to [19]. When σ = 0, (6) is the infinitesimal operator of the infinitely-many-neutral-alleles model studied by [9], which is an unlabeled version of the Fleming-Viot measure-valued diffusion without selection or recombination; the diffusion with operator $L_{\theta,\sigma}$, however, seems to fall outside the class of Fleming-Viot processes. See [12] for a review.
Fleming-Viot processes also arise naturally as limits in distribution of certain Markov processes, often referred to as countable constructions or particle processes, which retain local information, i.e. information relative to single individuals, rather than pooling it into a probability measure. Examples are [6], [7], [8] and [23]. The aim of this paper is to provide an interpretation of (6) in terms of a countable construction of particles, which specifies individual dynamics. By means of simple ideas related to the Gibbs sampler (see, e.g., [14]), we construct a fixed-size right-continuous population process, driven by Pitman's prediction scheme (3), which is reversible with respect to the joint law of a sequence sampled from (3). The associated process of ranked relative frequencies of types is shown to converge in distribution, under suitable conditions, to the diffusion with operator (6).
The paper is organised as follows. In Section 2 the Gibbs sampler is briefly introduced. Section 3 defines the particle process and the associated process of relative frequencies of types, and proves weak convergence. Section 4 deals with the stationarity properties of both the particle process and the simplex-valued diffusion.
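For concreteness, sequential sampling from a two-parameter urn scheme of the kind referred to as (3) can be sketched as follows. The sketch assumes the usual form of the predictive weights: after i draws showing k distinct values with multiplicities n_1, ..., n_k, a new value is drawn with probability (θ + σk)/(θ + i) and the j-th existing value is copied with probability (n_j − σ)/(θ + i). It is restricted to 0 ≤ σ < 1, and the base measure ν_0 is taken to be Uniform(0, 1). The function name and the choice of ν_0 are illustrative assumptions, not taken from the paper.

```python
import random


def pitman_urn_sample(n, theta, sigma, seed=0):
    """Draw x_1, ..., x_n sequentially from a two-parameter urn scheme.

    ASSUMPTIONS (illustrative only): 0 <= sigma < 1, theta > -sigma,
    and nu_0 = Uniform(0, 1) as a non-atomic base measure.
    """
    if not (0.0 <= sigma < 1.0 and theta > -sigma):
        raise ValueError("need 0 <= sigma < 1 and theta > -sigma")
    rng = random.Random(seed)
    values = []  # distinct values observed so far
    counts = []  # multiplicities n_j of the distinct values
    sample = []
    for i in range(n):  # i observations have already been drawn
        k = len(values)
        # The first draw is always a fresh value from nu_0.
        p_new = 1.0 if i == 0 else (theta + sigma * k) / (theta + i)
        if rng.random() < p_new:
            x = rng.random()  # new species drawn from nu_0
            values.append(x)
            counts.append(1)
        else:
            # Copy the j-th existing value with weight n_j - sigma.
            weights = [c - sigma for c in counts]
            j = rng.choices(range(k), weights=weights)[0]
            x = values[j]
            counts[j] += 1
        sample.append(x)
    return sample
```

With σ = 0 the weights reduce to those of the Blackwell-MacQueen urn, so the same function can be used to compare the one- and two-parameter cases, for instance by counting `len(set(sample))` for increasing n.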

The Gibbs sampler
The Gibbs sampler (see, e.g., [14]), also known as "heat bath" or "Glauber dynamics", is a special case of the Metropolis-Hastings algorithm, which in turn belongs to the class of Markov chain Monte Carlo (MCMC) procedures. These are often applied to solve integration and optimisation problems in high-dimensional spaces. Suppose for example that an integral of some function $f : \mathbb{X} \to \mathbb{R}^d$ with respect to some distribution $\pi \in \mathcal{P}(\mathbb{X})$ is to be evaluated, and Monte Carlo integration turns out to be infeasible. Then MCMC methods provide a way of constructing a stationary Markov chain with π as the invariant measure. One can then run the chain, discard the first, say, N iterations, and regard the subsequent output of the chain as approximate correlated samples from π. The size of N is determined according to the convergence properties of the chain. The Gibbs sampler is one of the most widely used MCMC schemes, and has found a wide range of applications in Bayesian computation.
The construction of a Gibbs sampler is as follows. Consider a law $\pi = \pi(dx_1, \ldots, dx_n)$ defined on $(\mathbb{X}^n, \mathcal{B}(\mathbb{X}^n))$, and assume that the conditional distributions $\pi(dx_i \mid x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)$ are available for every 1 ≤ i ≤ n. Then, given an initial set of values $(x_1^0, \ldots, x_n^0)$, the vector is iteratively updated as follows:

$$
\begin{aligned}
x_1^1 &\sim \pi(dx_1 \mid x_2^0, \ldots, x_n^0),\\
x_2^1 &\sim \pi(dx_2 \mid x_1^1, x_3^0, \ldots, x_n^0),\\
&\;\;\vdots\\
x_n^1 &\sim \pi(dx_n \mid x_1^1, \ldots, x_{n-1}^1),\\
x_1^2 &\sim \pi(dx_1 \mid x_2^1, \ldots, x_n^1),
\end{aligned}
$$

and so on. Under some regularity conditions, this algorithm produces a Markov chain with equilibrium law $\pi(dx_1, \ldots, dx_n)$. The above updating rule is known as a deterministic scan. If instead the components are updated in a random order, called random scan, one also gets reversibility with respect to π.
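As a minimal illustration of the random scan just described, the following sketch runs a Gibbs sampler on a toy target that is not used in the paper: a bivariate normal law with zero means, unit variances and correlation ρ, for which the full conditionals are $x_i \mid x_j \sim N(\rho x_j, 1 - \rho^2)$. The function name, the target and the burn-in handling are assumptions made for the example only.

```python
import random


def random_scan_gibbs(rho, n_iter, burn_in, seed=0):
    """Random-scan Gibbs sampler for a standard bivariate normal.

    ASSUMPTION: toy target with correlation rho (|rho| < 1), chosen only
    because its full conditionals are explicit.
    """
    rng = random.Random(seed)
    x = [0.0, 0.0]                      # arbitrary initial state
    cond_sd = (1.0 - rho * rho) ** 0.5  # conditional standard deviation
    out = []
    for t in range(burn_in + n_iter):
        i = rng.randrange(2)            # coordinate chosen uniformly
        x[i] = rng.gauss(rho * x[1 - i], cond_sd)
        if t >= burn_in:                # discard the first burn_in states
            out.append((x[0], x[1]))
    return out
```

The retained states are correlated draws whose empirical moments approximate those of the target; the particle process of Section 3 uses the same uniform random scan, with the Pitman predictive in place of the Gaussian conditional.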

Countable representation
For n ≥ 2, define a Markov chain on $\mathbb{X}^n$ as follows. Given any initial state of the chain, at each transition an index 1 ≤ i ≤ n is chosen uniformly and the component $x_i$ is updated with a sample of size one from the predictive distribution for $x_i$ derived from the Pitman urn scheme, leaving all other components unchanged. From (3), by the exchangeability of the sequence, this predictive is

$$\mathbb{P}\{x_i \in \cdot \mid x_{(-i)}\} = \frac{\theta + \sigma K_{n-1,i}}{\theta + n - 1}\,\nu_0(\cdot) + \frac{1}{\theta + n - 1}\sum_{j=1}^{K_{n-1,i}} (n_j - \sigma)\,\delta_{x_j^*}(\cdot), \qquad (7)$$

where θ and σ are as above, $x_{(-i)} = (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n)$, $K_{n-1,i}$ denotes the number of distinct values in the subvector $x_{(-i)}$, and $n_j$ is the number of components of $x_{(-i)}$ equal to its j-th distinct value $x_j^*$. We are thus constructing a stationary chain on $\mathbb{X}^n$ via a Gibbs sampler performed on $x = (x_1, \ldots, x_n)$ by means of a uniform random scan. Embed now the chain in continuous time by superimposing it on a Poisson process of intensity $\lambda_n > 0$, dependent on the vector size, which governs the holding times between successive updates. This simple construction yields a continuous-time pure-jump Markov process corresponding to a contraction semigroup $\{T_n^{\theta,\sigma}(t)\}$ on the set $\hat{C}(\mathbb{X}^n)$ of continuous functions on $\mathbb{X}^n$ which vanish at infinity, given by (8), where $\tilde{T} : [0, \infty) \times \mathbb{X}^n \times \mathcal{B}(\mathbb{X})^n \to [0, 1]$ is a transition function defined in terms of (7). The infinitesimal generator of the process is given in (9). It can be easily checked that $\{T_n^{\theta,\sigma}(t)\}$ is also positive, conservative, and strongly continuous in the supremum norm, hence (9) is the generator of a Feller process. Let $\mu_n : \mathbb{X}^n \to \mathcal{P}(\mathbb{X})$, given by

$$\mu_n(x) = \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i},$$

be the empirical measure associated to the vector $(x_1, \ldots, x_n)$. Also, let the intensity rate of the Poisson process underlying the holding times be

$$\lambda_n = n(\theta + n - 1), \qquad (11)$$

which is positive for θ > −σ and n ≥ 2. This provides the correct rescaling in (9). Alternatively we could take any $\lambda_n = O(n^2)$ and get the same result in the limit (see also the discussion after equation (14) for the rescaling choice). Then, taking where $P g(x) = \int g(y)\,\nu_0(dy)$, for $g \in \hat{C}(\mathbb{X})$, and $P_i f$ denotes P applied to the i-th coordinate of f.
This can be written as the sum Here, the operator $Q_{n,i}$ is defined, for $g \in \hat{C}(\mathbb{X})$, as which, as n tends to infinity, converges to This is twice the generator of the Fleming-Viot process without selection or recombination and with parent-independent mutation with rate θ/2. Note that taking $\lambda_n/2$ instead of (11) yields half of this limit, that is, the Fleming-Viot generator itself. Of course, the same limit is also obtained as the infinite-population limit of (12) when σ = 0 (and $F(\mu) = \langle f, \mu^{(m)} \rangle$, m ≤ n). Thus the special case of the Pitman urn scheme with σ = 0, i.e. the Blackwell-MacQueen urn, provides, via a Gibbs sampler construction, the neutral diffusion model.
which, again, converges when σ = 0 to the familiar generator of the neutral diffusion model (cf., e.g., [11]). Now, define the first and second derivatives of F(µ) as Then (12) can also be written is the unit rate mutation operator, $C^{(n)}$ is and $R_n(F)$ is a bounded remainder. The operator (12) does not seem to be well-behaved in the limit, due to the multiplicative term in its σ-dependent part. An inspection of (7), which generates the particles, reveals the heuristics underlying this phenomenon. The probability of sampling a new species can be split into two terms, $\theta/(\theta + n - 1)$ and $\sigma K_{n-1,i}/(\theta + n - 1)$. For large n, the two terms are of order $n^{-1}$ and $n^{-1+\sigma}$ respectively, since $K_n$ is of order $n^{\sigma}$ (see [20]). With appropriate changes, similar considerations can be made for the empirical part of (7). The point here is that it is seemingly infeasible to rescale the process with a rate able to retain, in the limit, all terms as well-defined infinite-dimensional genetic mechanisms. For instance, choosing $\lambda_n = O(n^{2-\sigma})$ yields in the limit a degenerate measure-valued process with constant mutation rate and no resampling. Conversely, letting $\lambda_n = O(n^{\gamma})$ with γ > 2 − σ, in the attempt to preserve the resampling, leads to a process with infinite mutation rate. Note that we have no degrees of freedom on σ, which cannot depend on n by definition of the two-parameter Poisson-Dirichlet process. This makes the characterization of the limit of (10) a difficult task.
A way of overcoming this problem is to restrict the framework. When we have a vector of size n ≥ 1, let F(µ) in (12) be given by $F(\mu) = g(\langle \varphi_1, \mu \rangle, \ldots, \langle \varphi_n, \mu \rangle)$, where $g \in C^2(\mathbb{R}^n)$ and $\varphi_j(\cdot) = \mathbf{1}_{x_j^*}(\cdot)$ is the indicator function of $x_j^*$, 1 ≤ j ≤ n. That is, $\langle \varphi_j, \mu \rangle = \mu(\{x_j^*\}) = z_j$ is the relative frequency of the j-th observed type. Hence we can identify $\mathcal{P}(\mathbb{X})$ with the simplex

$$\Delta_n = \Big\{ z = (z_1, \ldots, z_n) : z_i \ge 0, \ \sum_{i=1}^{n} z_i = 1 \Big\}.$$

Note that g has $n - K_n$ null arguments when there are $K_n$ different types in the vector.
In this case we regard $\Delta_{K_n}$ as a subspace of $\Delta_n$ and $g(z_1, \ldots, z_{K_n}, 0, \ldots, 0)$ as $C(\Delta_n)$-valued rather than $C(\Delta_{K_n})$-valued, since $K_n$ is a function of $(x_1, \ldots, x_n)$. Within this more restricted framework, (12) reduces to the operator Here $\tilde{K}_{n-1,i}$ denotes the number of non-null components in the vector $(z_1, \ldots, z_{K_n})$ after $z_i$ is updated, is the intensity of a mutation from type j to type i when there are $K_n$ different types, with b is the analog for the operator (14).
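The particle dynamics of this section (a uniformly chosen coordinate resampled from the Pitman predictive given the other n − 1 coordinates, with exponential holding times) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the predictive weights are taken to be (θ + σK)/(θ + n − 1) for a new value and (n_j − σ)/(θ + n − 1) for the j-th existing value, 0 ≤ σ < 1, ν_0 is Uniform(0, 1), and the jump rate is passed in by the caller. All function names are hypothetical.

```python
import random
from collections import Counter


def gibbs_update(x, i, theta, sigma, rng):
    """Resample x[i] from the predictive given the other n - 1 components.

    ASSUMPTION: nu_0 = Uniform(0, 1); weights as stated in the lead-in.
    """
    n = len(x)
    others = Counter(x[:i] + x[i + 1:])  # multiplicities n_j in x_(-i)
    k = len(others)                      # number of distinct values in x_(-i)
    p_new = (theta + sigma * k) / (theta + n - 1)
    if rng.random() < p_new:
        x[i] = rng.random()              # mutation: fresh type from nu_0
    else:
        types = list(others)
        weights = [others[t] - sigma for t in types]
        x[i] = rng.choices(types, weights=weights)[0]  # copy an existing type


def simulate_particles(x0, theta, sigma, t_max, rate, seed=0):
    """Run the pure-jump particle process up to time t_max.

    `rate` is the total intensity of the Poisson clock (an input here).
    """
    rng = random.Random(seed)
    x = list(x0)                         # work on a copy of the initial state
    t = rng.expovariate(rate)            # time of the first update
    while t <= t_max:
        gibbs_update(x, rng.randrange(len(x)), theta, sigma, rng)
        t += rng.expovariate(rate)       # exponential holding time
    return x


def ranked_frequencies(x):
    """Relative frequencies of the distinct types, in decreasing order."""
    n = len(x)
    return sorted((c / n for c in Counter(x).values()), reverse=True)
```

For instance, `ranked_frequencies(simulate_particles([0.5] * 20, 1.0, 0.3, 0.5, rate=20 * (1.0 + 19)))` returns a point of the ordered simplex; the rate used in this call is only one possible choice of order n².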

Remark 3.1.
Recall from the introduction that the prediction scheme (3) is non-degenerate also when σ = −κ < 0 and θ = mκ for some κ > 0 and m ≥ 2. It can be easily seen that in this case (16) becomes which, for n tending to infinity, since $\tilde{K}_{n-1,i}$ eventually reaches m with probability one, converges to This is the neutral-alleles model with m types, which can be dealt with as in [9]. In particular, for m going to infinity and θ = mκ kept fixed, one obtains, under appropriate conditions, the infinitely-many-neutral-alleles model, whose stationary distribution is the one-parameter Poisson-Dirichlet distribution. This is consistent with the fact that the same limit applied to (3) yields the Blackwell-MacQueen urn scheme.
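As a quick check of why the number of types cannot exceed m in this case, one can substitute σ = −κ and θ = mκ into the predictive weights. The computation below assumes the usual form of (3), with weight (θ + σK_n)/(θ + n) for a new value and (n_j − σ)/(θ + n) for the j-th observed value; these expressions are stated here as an assumption, since (3) is not reproduced in this section.

```latex
\frac{\theta + \sigma K_n}{\theta + n}
  = \frac{m\kappa - \kappa K_n}{m\kappa + n}
  = \frac{\kappa\,(m - K_n)}{m\kappa + n},
\qquad
\frac{n_j - \sigma}{\theta + n}
  = \frac{n_j + \kappa}{m\kappa + n},
\quad j = 1, \dots, K_n .
```

The first weight is strictly positive while $K_n < m$ and vanishes as soon as $K_n = m$, after which only copies of existing values are drawn.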
When the mutation is governed by (13) we have $b_{ij} = \nu_0(\{j\}) - \delta_{ij}$ (cf., e.g., [11]). Also, from (14) we have When the distribution $\nu_0$ of the allelic type of a mutant is diffuse, these parameters yield Alternatively we could take the mutation to be symmetric, that is b This choice yields a different operator for finite n, but is equivalent in the limit for n → ∞.
The remainder of the section is dedicated to proving the existence of a suitably defined limiting process, which will coincide with that in [18], and the weak convergence of the process of ranked frequencies. In the following section we will then show that the limiting process is stationary and ergodic with respect to the two-parameter Poisson-Dirichlet distribution.
Proof. Let $\{\mathcal{T}_n(t)\}$ be the Feller semigroup corresponding to $Z^{(n)}(\cdot)$. Then the proof is the same as that of Proposition 2.4 of [9]. In particular, it can be shown that $\{\mathcal{T}_n(t)\}$ maps the set of permutation-invariant continuous functions on $\Delta_n$ into itself. This, together with the observation that for every such f there is a unique $g \in C(\nabla_n)$ such that $g = f \circ \rho_n^{-1}$ and $g \circ \rho_n = f$, allows us to define a strongly continuous, positive, conservative, contraction semigroup $\{\hat{\mathcal{T}}_n(t)\}$ on $C(\nabla_n)$ by $\hat{\mathcal{T}}_n(t)\, g = [\mathcal{T}_n(t)(g \circ \rho_n)] \circ \rho_n^{-1}$. Then $\rho_n(Z^{(n)}(\cdot))$ inherits the strong Markov property from $Z^{(n)}(\cdot)$, and is such that

$$\mathbb{E}\big[ f(\rho_n(Z^{(n)}(t+s))) \mid \rho_n(Z^{(n)}(u)),\ u \le s \big] = \hat{\mathcal{T}}_n(t) f(\rho_n(Z^{(n)}(s))).$$

Define now the operator
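with domain defined as Here $\overline{\nabla}_\infty$ is the closure of $\nabla_\infty$, namely

$$\overline{\nabla}_\infty = \Big\{ z = (z_1, z_2, \ldots) : z_1 \ge z_2 \ge \cdots \ge 0, \ \sum_{i=1}^{\infty} z_i \le 1 \Big\},$$

which is compact, so that the set $C(\overline{\nabla}_\infty)$ of real-valued continuous functions on $\overline{\nabla}_\infty$ with the supremum norm $\|f\| = \sup_{x \in \overline{\nabla}_\infty} |f(x)|$ is a Banach space. Functions $\phi_m$ are assumed to be evaluated in $\nabla_\infty$ and extended to $\overline{\nabla}_\infty$ by continuity. We will need the following result, whose proof can be found in the Appendix. Then we have the following.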
We are now ready to prove the convergence in distribution of the process of ranked relative frequencies of types.

Stationarity
Denote with the joint law of an n-sized sequential sample from the Pitman urn scheme (3), and with $p_n(dx_i \mid x_{(-i)})$ the conditional distribution in (7).
Integrating both sides of (25) with respect to x immediately yields the following.
The following proposition shows that the limiting diffusion is ergodic.
Proposition 4.6. Let Y (·) be as in Theorem 3.6. Then Y (·) is ergodic in the sense that as t → ∞, where µ is the unique stationary distribution.
Proof. See Appendix.
Since the two-parameter Poisson-Dirichlet distribution is concentrated on $\nabla_\infty$ (cf., e.g., [22]), it follows that (26) can be modified to Hence eventually the process ends up in $\nabla_\infty$ with probability one for any initial state belonging to $\overline{\nabla}_\infty$.
For $\phi_{m_1} \cdots \phi_{m_k}$ we have The first term on the right-hand side equals As for the second term, we have
The following two proofs are modifications of proofs in [9], adapted to the two-parameter case.

Proof of Proposition 4.3
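If µ is a stationary distribution for the Markov process with the (θ, σ) generator defined above, then the generator applied to $\phi_m$ integrates to zero against µ, and assuming $\int_{\overline{\nabla}_\infty} \phi_{m-1}\, d\mu$ is determined, we have

$$\int_{\overline{\nabla}_\infty} \phi_m \, d\mu = \frac{m - 1 - \sigma}{m - 1 + \theta} \int_{\overline{\nabla}_\infty} \phi_{m-1} \, d\mu = \frac{(1 - \sigma)_{(m-1)}}{(1 + \theta)_{(m-1)}},$$

where $a_{(k)} = a(a+1)\cdots(a+k-1)$.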
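The moments determined in this proof can be evaluated numerically with a short sketch. It assumes the one-step recursion $\int \phi_m\, d\mu = \frac{m-1-\sigma}{m-1+\theta} \int \phi_{m-1}\, d\mu$ started from $\phi_1 \equiv 1$, which agrees with the known formula $(1-\sigma)_{(m-1)}/(1+\theta)_{(m-1)}$ for the two-parameter Poisson-Dirichlet distribution; the function name is hypothetical.

```python
def pd_power_sum_moments(theta, sigma, m_max):
    """Return {m: E[phi_m]} for m = 1, ..., m_max, with phi_m(z) = sum_i z_i^m.

    ASSUMPTION: the recursion
        E[phi_m] = (m - 1 - sigma) / (m - 1 + theta) * E[phi_{m-1}],
    started from E[phi_1] = 1, with 0 <= sigma < 1 and theta > -sigma.
    """
    if not (0.0 <= sigma < 1.0 and theta > -sigma):
        raise ValueError("need 0 <= sigma < 1 and theta > -sigma")
    moments = {1: 1.0}
    for m in range(2, m_max + 1):
        moments[m] = moments[m - 1] * (m - 1 - sigma) / (m - 1 + theta)
    return moments
```

For σ = 0 and θ = 1 this gives E[φ_2] = 1/2 and E[φ_3] = 1/3, the familiar one-parameter values.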

Proof of Proposition 4.6
It suffices to show the result for all $\phi_{m_1} \cdots \phi_{m_k}$, for k ≥ 1, $\phi_m(z) = \sum_{i \ge 1} z_i^m$ and $z \in \overline{\nabla}_\infty$, and use the fact that the algebra generated by $(1, \phi_2, \phi_3, \ldots)$ is dense in $C(\overline{\nabla}_\infty)$, as shown in the proof of