Kingman's coalescent with erosion

Consider the Markov process taking values in the partitions of N such that each pair of blocks merges at rate one, and each integer is eroded, i.e., becomes a singleton block, at rate d. This is a special case of exchangeable fragmentation-coalescence process called Kingman's coalescent with erosion. We provide a new construction of the stationary distribution of this process as a sample from a standard flow of bridges. This allows us to give a representation of the asymptotic frequencies of this stationary distribution in terms of a sequence of hierarchically independent diffusions. Moreover, we introduce a new process called Kingman's coalescent with immigration, where pairs of blocks coalesce at rate one, and new blocks of size one immigrate at rate d. By coupling Kingman's coalescents with erosion and with immigration, we are able to show that the size of a block chosen uniformly at random from the stationary distribution of the restriction of Kingman's coalescent with erosion to {1,...,n} converges to the total progeny of a critical binary branching process.


Motivation
In evolutionary biology, speciation refers to the event when two populations from the same species lose the ability to exchange genetic material, e.g. due to the formation of a new geographic barrier or the accumulation of genetic incompatibilities. Even if speciation is usually thought of as irreversible, related species can often still exchange genetic material through exceptional hybridization, migration events or the sudden collapse of a geographic barrier (Roux et al., 2016). This can lead to the transmission of chunks of DNA between different species, a phenomenon known as introgression, which is currently considered a major evolutionary force shaping the genomes of groups of related species (Mallet et al., 2016).
Our study of Kingman's coalescent with erosion was first motivated by the following simple model of speciation incorporating rare migration events, depicted in Figure 1. Consider a set of $N$ monomorphic species, each harboring a genome of $n$ genes indexed by $\{1, \dots, n\}$. We model speciation by assuming that the dynamics of the species is described by a Moran model: at rate one for each pair of species $(s_1, s_2)$, species $s_2$ dies, $s_1$ gives birth to a new species, replicates its genome and sends it into the daughter species. We also model introgression by assuming that at rate $d$ for each gene $g \in \{1, \dots, n\}$ and each pair of species $(s_1, s_2)$, $g$ is replicated, the new copy of $g$ is sent from $s_1$ to $s_2$ and replaces its homolog in $s_2$. This assumption is justified by the following view in terms of individual migrants. Each time a migrant goes from $s_1$ to $s_2$, if recombination is sufficiently strong, its genome rapidly gets washed out by that of the resident species due to the frequent backcrosses (crosses between descendants of the migrant and local residents), so that at most one gene among $n$ reaches fixation. Now consider a fixed large time $T$, and sample uniformly one species at that time. We follow backwards in time the ancestral lineages of its genes and the ancestral species to which those genes belong. This induces a process valued in the partitions of $\{1, \dots, n\}$ by declaring that $i$ and $j$ are in the same block at time $t$ if the ancestral lineages of genes $i$ and $j$ sampled at $T$ lay in the same ancestral species at time $T - t$.

Figure 1: Illustration of the model with N = 5 species, represented by grey tubes, and n = 3 genes, represented by the colored lines inside the tubes. A species can split into two, simultaneously replicating its genome (speciation). A gene can replicate and move from one species to another and then replace its homologous copy in the recipient species (introgression). At present time a randomly chosen species is sampled: the ancestral lineages of its genes are represented with bolder colors. The green lineage is first subject to an introgression event and jumps to a new species. It is then brought back to the same species as the other genes by a coalescence event. The corresponding partition-valued process obtained by assigning the labels 1, 2 and 3 to the red, blue and green gene respectively is given.
At first ($t = 0$), all genes belong to the same ancestral species. Eventually this species receives a successful migrant from another species. Backwards in time, the gene that has been transmitted during this event is removed from its original species and placed in the migrant's original species. Such events occur at rate $(N - 1)d$ for each gene, and the migrant species is then chosen uniformly in the population. Once genes belong to separate species, they can be brought back to the same species by coalescence events. Any two species find their common ancestor at rate one, and at such an event the genes from the two merging species are placed back into the same species.
This informal description shows that the partition-valued process has two kinds of transitions: each pair of blocks merges at rate one; each gene is placed in a new uniformly chosen species at rate $(N - 1)d$. Setting the introgression rate to $d_N = d/N$ and letting $N \to \infty$, introgression events occur at rate $d$ for each gene. At each such event the gene is sent to a new species that does not contain any of the other $n - 1$ ancestral gene lineages, i.e., it is placed in a singleton block. This is the description of Kingman's coalescent with erosion, which we now introduce more formally.
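The two transitions described above can be sketched as a small event-driven simulation on $\{1, \dots, n\}$. This is only an illustrative sketch: the function name `simulate_coalescent_with_erosion`, the choice of starting from a single block, and the parameters `t_max` and `seed` are assumptions made for the example, not part of the model description.

```python
import random


def simulate_coalescent_with_erosion(n, d, t_max, seed=0):
    """Run the coalescent with erosion on {1, ..., n} up to time t_max.

    Hypothetical helper: each pair of blocks merges at rate 1 and each
    integer is eroded (made a singleton) at rate d.
    """
    if d <= 0:
        raise ValueError("erosion rate d must be positive")
    rng = random.Random(seed)
    blocks = [set(range(1, n + 1))]  # assumption: start from a single block
    t = 0.0
    while True:
        m = len(blocks)
        merge_rate = m * (m - 1) / 2.0  # one unit of rate per pair of blocks
        erosion_rate = n * d            # rate d per integer
        total = merge_rate + erosion_rate
        t += rng.expovariate(total)
        if t > t_max:
            return blocks
        if rng.random() * total < merge_rate:
            # Coalescence: merge two distinct blocks chosen uniformly.
            i, j = rng.sample(range(m), 2)
            blocks[i] |= blocks[j]
            del blocks[j]
        else:
            # Erosion: a uniformly chosen integer becomes a singleton.
            g = rng.randint(1, n)
            for b in blocks:
                if g in b:
                    if len(b) > 1:  # eroding a singleton changes nothing
                        b.remove(g)
                        blocks.append({g})
                    break
```

For instance, `sorted(len(b) for b in simulate_coalescent_with_erosion(100, 1.0, 20.0))` lists the block sizes after a time that should be long enough for the restricted process to be close to stationarity.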

Kingman's coalescent with erosion
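Kingman's representation theorem (see e.g. Bertoin, 2006) shows that for any $i$, the following limit exists a.s.
$$f_i = \lim_{n \to \infty} \frac{1}{n}\,\mathrm{Card}(C_i \cap [n]),$$
where $C_i$ denotes the $i$-th block of the exchangeable partition $\Pi$.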
Let $(\beta_i)_{i \ge 1}$ be the non-increasing reordering of the sequence $(f_i)_{i \ge 1}$. We call $(\beta_i)_{i \ge 1}$ the asymptotic frequencies of $\Pi$. The sequence $(\beta_i)_{i \ge 1}$ is such that
$$\beta_1 \ge \beta_2 \ge \dots \ge 0, \qquad \sum_{i \ge 1} \beta_i \le 1.$$
Such sequences are called mass-partitions. Mass-partitions are studied because exchangeable partitions are entirely characterized by their asymptotic frequencies. The partition $\Pi$ can be recovered from its asymptotic frequencies $(\beta_i)_{i \ge 1}$ through what is known as a paintbox procedure. Conditionally on $(\beta_i)_{i \ge 1}$, let $(X_i)_{i \ge 1}$ be a sequence of independent variables such that for $k \ge 1$, $P(X_i = k) = \beta_k$, and $P(X_i = -i) = 1 - \sum_{k \ge 1} \beta_k$. Then the partition $\Pi'$ of $\mathbb{N}$ defined as $i \sim_{\Pi'} j \iff X_i = X_j$ is distributed as $\Pi$ (see e.g. Bertoin, 2006). We see that $i$ is in a singleton block iff $X_i = -i$.
The set of all singleton blocks is referred to as the dust of $\Pi$, and the partition $\Pi$ has dust iff $\sum_{i \ge 1} \beta_i < 1$.
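The paintbox procedure can be illustrated by a short sampler for the restriction of the partition to $[n]$. This is a minimal sketch for a finite mass-partition passed as a list; the name `paintbox` and its arguments are hypothetical and chosen for the example only.

```python
import random


def paintbox(beta, n, seed=0):
    """Sample the restriction to {1, ..., n} of a paintbox partition.

    beta: finite list of block frequencies with sum <= 1 (assumption:
    only finitely many non-zero frequencies are passed in).
    """
    rng = random.Random(seed)
    labels = []
    for i in range(1, n + 1):
        u = rng.random()
        acc = 0.0
        x = -i  # default: X_i = -i, i.e. i falls in the dust
        for k, b in enumerate(beta, start=1):
            acc += b
            if u < acc:
                x = k  # X_i = k with probability beta_k
                break
        labels.append(x)
    # Integers sharing the same label X_i form a block.
    blocks = {}
    for i, x in enumerate(labels, start=1):
        blocks.setdefault(x, []).append(i)
    return list(blocks.values())
```

With `beta = [0.5, 0.3]` the remaining mass 0.2 is dust, so roughly a fifth of the integers should end up in singleton blocks.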
The main characteristics of the asymptotic frequencies of fragmentation-coalescence processes have already been derived in Berestycki (2004), see Theorem 8.In the case of Kingman's coalescent with erosion, these results specialize to the following theorem.
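Theorem 1.2 (Berestycki 2004). Let $(\beta_i)_{i \ge 1}$ be the asymptotic frequencies of $\Pi$, the stationary distribution of Kingman's coalescent with erosion. Then
$$\beta_i > 0 \ \text{for all } i \ge 1, \qquad \sum_{i \ge 1} \beta_i = 1 \quad \text{a.s.}$$
In other words, the partition $\Pi$ has infinitely many blocks, and no dust.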
Before stating our main two results, let us motivate them.Consider a partition Π obtained from a paintbox procedure on a random mass-partition ( βi ) i≥1 , and denote Πn its restriction to [n].There are two sources of randomness in Πn .One originates from the fact that ( βi ) i≥1 is random.Moreover, conditionally on ( βi ) i≥1 , Πn is obtained by sampling a finite number of variables with distribution ( βi ) i≥1 .Thus, in addition to the randomness of ( βi ) i≥1 , Πn is subject to a finite sampling randomness.
Suppose that Π has finitely many blocks, say N , with asymptotic frequencies ( β1 , . . ., βN ).When n gets large, the finite sampling effects vanish and the sizes of the blocks of Πn resemble (n β1 , . . ., n βN ).However, when Π has infinitely many non-singleton blocks, there always exists a large enough i such that the size of the block with frequency βi remains subject to finite sampling effects in Πn .In this case it is not entirely straightforward to go from the asymptotic frequencies ( βi ) i≥1 to the size of the blocks of Πn , as this involves a non-trivial sampling procedure.
In this work our task will be twofold.First, we will investigate the size of the "large blocks" of Π n by describing the distribution of the asymptotic frequencies (β i ) i≥1 .In order to get an insight into the distribution of the "small blocks" of Π n , we will rather study the empirical distribution of the size of the blocks of Π n , for large n.Let us now state the corresponding results.

Main results
We show two main results in this work.One is concerned with the size of the large blocks of Kingman's coalescent with erosion, and gives a representation of its asymptotic frequencies in terms of an infinite sequence of hierarchically independent diffusions.The other is concerned with the size of the small blocks and provides the limit of the distribution of the size of a block chosen uniformly from the stationary partition when n is large.Let us start with the former result.
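Size of the large blocks. Let $(Y_i)_{i \ge 1}$ be an i.i.d. sequence of diffusions verifying
$$dY_i(t) = (1 - Y_i(t))\,dt + \sqrt{Y_i(t)(1 - Y_i(t))}\,dW_i(t),$$
started from 0, and where $(W_i)_{i \ge 1}$ are independent Brownian motions. It is known, see e.g. Lambert (2008), Proposition 2.3.4, that each $Y_i$ is distributed as a Wright-Fisher diffusion conditioned on hitting 1, and thus we have
$$\lim_{t \to \infty} Y_i(t) = 1 \quad \text{a.s.}$$
Accordingly, we set $Y_i(\infty) = 1$. We build inductively a sequence of processes $(Z_i)_{i \ge 1}$ and time-changes $(\tau_i)_{i \ge 1}$ as follows. Set
$$Z_1(t) = Y_1(t), \qquad \tau_1(t) = t.$$
Then, suppose that $(Z_1, \dots, Z_i)$ and $(\tau_1, \dots, \tau_i)$ have been defined, and set
$$Z_{i+1}(t) = \Big(1 - \sum_{j \le i} Z_j(t)\Big)\, Y_{i+1}(\tau_{i+1}(t)), \qquad \tau_{i+1}(t) = \int_0^t \frac{1}{1 - \sum_{j \le i} Z_j(s)}\,ds.$$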
Then we have the following representation of the asymptotic frequencies of the stationary distribution of Kingman's coalescent with erosion.
Theorem 1.3. Let $(Z_i)_{i \ge 1}$ be the sequence of diffusions defined previously. Then the non-increasing reordering of the sequence $(z_i)_{i \ge 1}$ defined as
$$z_i = \int_0^\infty d\,e^{-dt}\, Z_i(t)\,dt$$
is distributed as the frequencies of the blocks of the stationary distribution of Kingman's coalescent with erosion rate $d$.
Let us explain the intuition behind Theorem 1.3. Kingman's coalescent is dual to a measure-valued process called the Fleming-Viot process (Etheridge, 2000). The Fleming-Viot process describes the offspring distribution of a population with constant size, while Kingman's coalescent gives the genealogy of that population. By a classical duality argument, Kingman's coalescent at time t can be obtained by sampling individuals at time t from a Fleming-Viot process and placing in the same block those that have the same ancestor (Bertoin and Le Gall, 2003). The link with Theorem 1.3 is that the diffusions (Z i ) i≥1 correspond to the sizes of the offspring of the individuals of a Fleming-Viot process, ordered by extinction time of their progeny, see Section 5. The integral transformation is roughly due to the fact that in Kingman's coalescent with erosion, one needs to place in the same block the individuals that have the same ancestor at their last erosion event, whose age is an exponential variable with parameter d. This heuristic argument is made rigorous in Section 5, where Theorem 1.3 is proved.
Size of the small blocks. In order to capture the characteristics of the small blocks of $\Pi^n$, we study the empirical measure of the size of the blocks of $\Pi^n$. Let $M_n$ be the total number of blocks of $\Pi^n$, and let $(|C^n_1|, \dots, |C^n_{M_n}|)$ be their sizes. For each $k \ge 1$, we denote by
$$\mu^n_k = \frac{1}{M_n}\,\mathrm{Card}\{i \le M_n : |C^n_i| = k\}$$
the frequency of blocks of size $k$. The probability vector $(\mu^n_k)_{k \ge 1}$ is the empirical measure of the size of the blocks of $\Pi^n$. We give the following characterization of the asymptotic law of $(\mu^n_k)_{k \ge 1}$ and $M_n$.
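Theorem 1.4. (i) The following convergence holds in probability
$$\frac{M_n}{\sqrt{n}} \xrightarrow[n \to \infty]{} \sqrt{2d}.$$
(ii) Moreover, for each $k \ge 1$, the following convergence holds in probability
$$\mu^n_k \xrightarrow[n \to \infty]{} P(J = k),$$
where $J$ is the total progeny of a critical binary branching process.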
In the previous theorem and hereafter we call critical binary branching process the Markov process on N starting from 1 that jumps from k to k + 1 and from k to k − 1 at rate k. Its progeny is the sum of the initial number of particles and of the total number of birth events, i.e., of jumps from k to k + 1, before the process is absorbed at 0.
Remark 1.5. It is interesting to notice that the limiting distribution of the vector (µ n k ) k≥1 is deterministic and does not depend on the erosion coefficient d.
Remark 1.6.The convergence of the vector (µ n k ) k≥1 is equivalent to the convergence in probability of the empirical measure of the size of the blocks of Π n to the distribution of J in the weak topology.
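To make the limiting law concrete, the following sketch computes the distribution of the total progeny $J$ and samples it. It relies on an observation that is not stated in the text: since births and deaths occur at the same rate $k$, the jump chain is a symmetric simple random walk started at 1 and absorbed at 0, which gives $P(J = k) = C_{k-1} / 2^{2k-1}$ with $C_m$ the $m$-th Catalan number. The function names and the truncation parameter `cap` are hypothetical.

```python
import random
from math import comb


def progeny_pmf(k):
    """P(J = k) = C_{k-1} / 2^(2k-1), C_m being the m-th Catalan number."""
    catalan = comb(2 * (k - 1), k - 1) // k
    return catalan / 2 ** (2 * k - 1)


def sample_progeny(rng, cap=1000):
    """Sample min(J, cap) by following the jump chain of the process.

    The cap is an assumption of this sketch: J has infinite mean, so
    untruncated runs can be arbitrarily long.
    """
    particles, progeny = 1, 1
    while particles > 0 and progeny < cap:
        if rng.random() < 0.5:
            particles += 1  # birth: one more individual in the progeny
            progeny += 1
        else:
            particles -= 1  # death
    return progeny
```

The first values are `progeny_pmf(1) = 1/2`, `progeny_pmf(2) = 1/8` and `progeny_pmf(3) = 1/16`; the tail decays like $k^{-3/2}$, so many blocks are singletons while block sizes remain heavy-tailed.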
Let us again briefly discuss the heuristic behind our proof of this result. Erosion occurs at a rate proportional to the size of the blocks, i.e., a block of size k is eroded at rate kd, while coalescence events do not take the sizes of the blocks into account. As there are only a few blocks with large size in Π n , and many small blocks, most coalescence events occur between small blocks, while most erosion events occur within these few large blocks. When restricting our attention to small blocks, we can neglect erosion, and consider that pairs of blocks coalesce at rate 1, and that new blocks of size 1 appear at constant rate due to the erosion of the large blocks.
This heuristic led us to consider a process analogous to Kingman's coalescent with erosion, where pairs of blocks coalesce at rate 1, but new singleton blocks immigrate at constant rate d.We call this process Kingman's coalescent with immigration.We consider the genealogy of a block sampled uniformly from Kingman's coalescent with immigration.We prove that this genealogy converges, as the immigration rate goes to infinity, to a critical binary birth-death process.See the forthcoming Proposition 3.6.
Outline.The remainder of the paper is organized as follows.In Section 2 we provide two constructions of Kingman's coalescent with immigration, as well as a coupling between Kingman's coalescents with erosion and immigration.Section 3 is then devoted to giving the genealogy of the blocks of Kingman's coalescent with immigration.We there prove a result analogous to Theorem 1.4, see Proposition 3.1.Theorem 1.4 is proved in Section 4, where we carry out the coupling between Kingman's coalescents with erosion and immigration.Finally, we prove Theorem 1.3 in Section 5.
Possible extensions.As we have mentioned, Kingman's coalescent is part of the more general class of fragmentation-coalescence processes.We now briefly discuss potential extensions of our results to such processes.
The main ingredient of our study of the size of small blocks is that fragmentation is faster for larger blocks, while coalescence occurs at the same speed regardless of the size of the blocks. This allows us to neglect fragmentation and consider a purely coalescing system where new blocks immigrate due to the fragmentation of the large blocks. First, this picture remains valid for Λ-coalescents with erosion, but the proofs would be more involved because computations could no longer be made explicitly. Moreover, we believe that this picture also remains valid for a broad class of binary fragmentation measures. The particles that are removed from the large block would no longer be of size one, but should not have time to split on the time-scale when small blocks are formed, yielding a situation similar to the erosion case.
Theorem 1.3 relies on a construction of the stationary distribution of Kingman's coalescent with erosion from a Fleming-Viot process that can be directly generalized to Λ-coalescents with erosion (and even to Ξ-coalescents with erosion) by using the corresponding Λ-Fleming-Viot process.However, the explicit expression of the size of the blocks in terms of hierarchically independent diffusions cannot be achieved in general.Nevertheless see the end of Section 5 for a discussion of a possible extension of Theorem 1.3 to Beta-coalescents with erosion.
Overall, the techniques and ideas we use in this work are not entirely specific to Kingman's coalescent with erosion.Nevertheless, in this case, the proofs are greatly simplified because all calculations can be made explicitly.This reason led us to restrict our attention to Kingman's coalescent with erosion in this work, and to leave possible extensions for future work.

Kingman's coalescent with immigration
In this section we construct Kingman's coalescent with immigration as a partition-valued process such that pairs of blocks coalesce at rate 1 and new blocks immigrate at rate d. Then, we give an alternative construction of Kingman's coalescent with immigration from the flow of bridges of Bertoin and Le Gall (2003). Finally, the coupling between Kingman's coalescents with erosion and with immigration is carried out in Section 2.4.

Definition
Consider a Poisson point process on R with intensity d dt, and let (T i ) i∈Z be its atoms labeled in increasing order such that T 0 < 0 < T 1 .The sequence (T i ) i∈Z corresponds to the immigration times of new particles in the system.
Fix $N \in \mathbb{Z}$. We first define Kingman's coalescent with immigration for the particles that have a label at least $N$, and then extend it to all particles by consistency. We do that in the following way. Initially, set $\bar\Pi^N_t = \emptyset$ for all $t < T_N$. We now extend $\bar\Pi^N_t$ to all real times by induction. Suppose that $\bar\Pi^N_t$ has been defined on $(-\infty, T_k)$, for $k \ge N$. We first set
$$\bar\Pi^N_{T_k} = \bar\Pi^N_{T_k-} \cup \{\{k\}\}$$
to represent the immigration of the new particle with label $k$. We now let each pair of blocks of $\bar\Pi^N_t$ coalesce at rate one for $T_k \le t < T_{k+1}$. One can achieve this by considering, conditional on $\bar\Pi^N_{T_k} = \pi_k$, an independent version $(\Pi^k_t)_{t \ge 0}$ of Kingman's coalescent started from $\pi_k$, and setting
$$\bar\Pi^N_t = \Pi^k_{t - T_k}, \qquad T_k \le t < T_{k+1}.$$
We say that the process $(\bar\Pi^N_t)_{t \in \mathbb{R}}$ is the $N$-Kingman coalescent with immigration rate $d$. The following proposition shows that we can consistently extend the $N$-Kingman coalescent with immigration to a process taking its values in the partitions of $\mathbb{Z}$, and that it is a Markov process whose transitions coincide with our intuitive description of Kingman's coalescent with immigration.
Proposition 2.1.(i) There exists a unique process ( Πt ) t∈R , called Kingman's coalescent with immigration rate d, such that for all N ∈ Z, its restriction to {i ∈ Z : i ≥ N } is distributed as the N -Kingman coalescent with immigration.
(ii) With probability one, Πt has finitely many blocks for all t ∈ R.
(iii) The process ( Πt ) t∈R is Markovian.Conditional on Πt = π , where π is a partition of {i ∈ Z : i ≤ n}, each pair of blocks coalesce at rate 1, and at rate d the process goes to the partition π ∪ {n + 1}, i.e., a new particle immigrates.
Proof. (i) Let $(\bar\Pi^N_t)_{t \in \mathbb{R}}$ be an $N$-Kingman coalescent with immigration. It is sufficient to show that the restriction $(\bar\Pi^{N+1}_t)_{t \in \mathbb{R}}$ of $(\bar\Pi^N_t)_{t \in \mathbb{R}}$ to $\{i \in \mathbb{Z} : i \ge N + 1\}$ is distributed as an $(N+1)$-Kingman coalescent with immigration, and the result will follow from Kolmogorov's extension theorem. The immigration times of $(\bar\Pi^{N+1}_t)_{t \in \mathbb{R}}$ have the desired distribution by construction. The result is now a simple consequence of the sampling consistency of Kingman's coalescent.
(ii) Let us now prove the second point.Kingman's coalescent has the property of coming down from infinity (Kingman, 1982).This means that even if Kingman's coalescent is started from a partition with an infinite number of blocks, then for all positive times it will have only finitely many blocks.Thus, as the number of immigrated particles is locally finite, Kingman's coalescent with immigration only has a finite number of blocks for all times a.s.
(iii) That each ( ΠN t ) t∈R is a Markov process is a direct consequence of the Markov property of Kingman's coalescent, and of the fact that the immigration times are distributed according to an independent Poisson point process with intensity d.This readily implies the Markov property of ( Πt ) t∈R .
An interesting consequence of the last result is that the process counting the number of blocks of Kingman's coalescent with immigration is a Markov birth-death process.More precisely, for t ∈ R, let M t be the number of blocks of the partition Πt .Then (M t ) t∈R is a stationary birth-death process.
Corollary 2.2. The process $(M_t)_{t \in \mathbb{R}}$ counting the number of blocks of Kingman's coalescent with immigration rate $d$ is a stationary Markov process. Conditional on $\{M_t = k\}$, it jumps to $k + 1$ at rate $d$, and to $k - 1$ at rate $k(k-1)/2$.
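By Proposition 2.1 (iii), the number of blocks increases by one at rate $d$ and decreases by one at rate $k(k-1)/2$ when there are $k$ blocks. The sketch below computes the stationary law of such a birth-death chain from detailed balance, $\pi_{k+1}\,k(k+1)/2 = \pi_k\,d$, i.e. $\pi_k \propto (2d)^{k-1} / ((k-1)!\,k!)$ on $k \ge 1$. This closed form is derived here for illustration and is not quoted from the text; the function name and the truncation level `k_max` are assumptions.

```python
from math import exp, lgamma, log, sqrt


def stationary_block_count(d, k_max=None):
    """Stationary law (pi_1, ..., pi_kmax) of the number of blocks.

    Uses pi_k proportional to (2d)^(k-1) / ((k-1)! k!), truncated at
    k_max (a hypothetical cut-off chosen well beyond sqrt(2d)).
    """
    if k_max is None:
        k_max = int(10 * sqrt(2 * d)) + 50
    # Work in log-space so that large values of d do not overflow.
    log_w = [(k - 1) * log(2 * d) - lgamma(k) - lgamma(k + 1)
             for k in range(1, k_max + 1)]
    top = max(log_w)
    w = [exp(x - top) for x in log_w]
    z = sum(w)
    return [x / z for x in w]  # entry at index k - 1 is pi_k
```

For large $d$ the mean of this law is close to $\sqrt{2d}$, which is the balance point where the immigration rate $d$ equals the coalescence rate $k(k-1)/2$.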

Preliminaries on flows of bridges
The previous construction of Kingman's coalescent with immigration is based on Kolmogorov's extension theorem. The aim of the next two sections is to give an alternative construction of Kingman's coalescent with immigration based on the flow of bridges of Bertoin and Le Gall (2003). This construction will only be needed in Section 5 for the proof of Theorem 1.3. In this section we recall the material on flows of bridges that will be needed.
Bridges. We call bridge (Bertoin and Le Gall, 2003) any random function of the form
$$B(x) = \Big(1 - \sum_{i \ge 1} \beta_i\Big)\,x + \sum_{i \ge 1} \beta_i\, 1_{\{V_i \le x\}}, \qquad x \in [0, 1],$$
for some random mass-partition $(\beta_i)_{i \ge 1}$ and an independent i.i.d. sequence of uniform $[0, 1]$ variables $(V_i)_{i \ge 1}$. For a bridge $B$, we define its inverse
$$B^{-1}(y) = \inf\{x \in [0, 1] : B(x) > y\}, \quad y \in [0, 1), \qquad B^{-1}(1) = 1.$$
Let $(U_i)_{i \ge 1}$ be a sequence of i.i.d. uniform variables. An exchangeable partition $\Pi$ of $\mathbb{N}$ can be obtained from $B$ and $(U_i)_{i \ge 1}$ by setting
$$i \sim_\Pi j \iff B^{-1}(U_i) = B^{-1}(U_j).$$
Let $(C_1, C_2, \dots)$ be the blocks of $\Pi$ labeled in increasing order of their least elements, i.e., such that $i \le j \iff \min(C_i) \le \min(C_j)$.
To each block $C_i$ is associated a unique random variable $V_i$ defined as
$$V_i = B^{-1}(U_j) \quad \text{for any } j \in C_i.$$
If $\Pi$ has finitely many blocks, say $M$, for $i > M$ we set $V_i = \tilde V_i$ where $(\tilde V_i)_{i \ge 1}$ is an independent sequence of i.i.d. uniform random variables. The sequence $(V_i)_{i \ge 1}$ will be referred to as the sequence of ancestors of the blocks of $\Pi$. The key result on bridges from Bertoin and Le Gall (2003) is their Lemma 2, which we state here for later use.
Lemma 2.3 (Bertoin and Le Gall 2003).Consider a bridge B, and let Π and (V i ) be respectively the partition and sequence of ancestors obtained from B as above.Then (V i ) i≥1 is independent of Π, and (V i ) i≥1 is a sequence of i.i.d.uniform variables.
The standard flow of bridges.A flow of bridges is defined as follows.
Definition 2.4. A flow of bridges is a family of bridges $(B_{s,t})_{s \le t}$ such that:
(i) For every $s \le t \le u$, $B_{s,u} = B_{s,t} \circ B_{t,u}$ a.s.
(ii) For $p \ge 1$ and $t_1 \le \dots \le t_p$, the bridges $B_{t_1,t_2}, \dots, B_{t_{p-1},t_p}$ are independent, and $B_{t_1,t_2}$ is distributed as $B_{0,t_2 - t_1}$.
(iii) The limit B 0,t → Id as t ↓ 0 holds in probability in the Skorohod space.
A flow of bridges encodes the dynamics of a population represented by the interval [0, 1].Let t ∈ R and x < y.If the interval [x, y] is interpreted as a subfamily of the population at time t, then its progeny at time s ≤ t is represented by the interval [B s,t (x−), B s,t (y)].
(Notice that time is going backward: if $t$ is the present, then $s \le t$ represents the future of the population.) By the independence and stationarity of the increments of the flow, the distribution of a flow of bridges is entirely characterized by the distribution of $B_{0,t}$, for $t \ge 0$. We will be particularly interested in the so-called standard flow of bridges, which can be described as follows. Let $t \ge 0$ and consider the bridge
$$B_{0,t}(x) = \sum_{i=1}^{N_t} \beta_i(t)\, 1_{\{V_i \le x\}}, \qquad x \in [0, 1],$$
where
(i) The process $(N_t)_{t \ge 0}$ is distributed as a pure-death process started at $\infty$, and going from $k$ to $k - 1$ at rate $k(k-1)/2$.
(ii) Conditional on $N_t = k$, the vector $(\beta_1(t), \dots, \beta_k(t))$ is uniformly distributed on the simplex $\{x \in [0, 1]^k : x_1 + \dots + x_k = 1\}$.
(iii) The variables $(V_i)_{i \ge 1}$ form an independent i.i.d. sequence of uniform variables.
Then we know (Bertoin and Le Gall, 2003) that there exists a flow of bridges (B s,t ) s≤t such that B 0,t is distributed as above.It is called the standard flow of bridges.
Our interest in the standard flow of bridges is that it represents the dynamics of a population whose genealogy is given by Kingman's coalescent. Let (U i ) i≥1 be a sequence of i.i.d. uniform variables, and let Πt be the partition obtained from the bridge B 0,t and the sequence (U i ) i≥1 . We stress that the same sequence is used for all t. Then the process ( Πt ) t≥0 is distributed as Kingman's coalescent started from the partition of N into singletons (Bertoin and Le Gall, 2003).
The Fleming-Viot process.One of the main advantages of flows of bridges is that they couple a backward process, giving the genealogy of the population, and a forward process, giving the size of the progeny of the individuals in the population.This forward process is often encoded as a measure-valued process known as a Fleming-Viot process.
Let $(B_{s,t})_{s \le t}$ be a standard flow of bridges. For each $t \ge 0$, $B_{-t,0}$ is the distribution function of some random measure $\rho_t$ on $[0, 1]$. The measure-valued process $(\rho_t)_{t \ge 0}$ is called a Fleming-Viot process (Etheridge, 2000). A well-known fact that we will use is that the dynamics of the mass of a fixed interval is a Wright-Fisher diffusion. More precisely, let $x \in [0, 1]$ and $X_t = \rho_t([0, x])$. Then the process $(X_t)_{t \ge 0}$ is a Wright-Fisher diffusion started from $x$, i.e., it is distributed as the unique solution to
$$dX_t = \sqrt{X_t(1 - X_t)}\,dW_t, \qquad X_0 = x,$$
where $W$ is a standard Brownian motion.
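As a rough numerical illustration of the Wright-Fisher diffusion, the sketch below applies an Euler-Maruyama scheme to the driftless equation $dX_t = \sqrt{X_t(1 - X_t)}\,dW_t$. This normalization is an assumption of the sketch, chosen because it matches a pair coalescence rate of one; the function name, the step size `dt` and the clipping to $[0, 1]$ are likewise illustrative choices and introduce a small discretization bias.

```python
import random
from math import sqrt


def wright_fisher_endpoint(x0, t_max, dt=0.01, rng=None):
    """Approximate X_{t_max} for dX = sqrt(X (1 - X)) dW, X_0 = x0."""
    if rng is None:
        rng = random.Random(0)
    x = x0
    for _ in range(int(round(t_max / dt))):
        if x <= 0.0 or x >= 1.0:
            break  # 0 and 1 are absorbing (loss or fixation)
        x += sqrt(x * (1.0 - x) * dt) * rng.gauss(0.0, 1.0)
        x = min(1.0, max(0.0, x))  # keep the Euler step inside [0, 1]
    return x
```

Averaging `wright_fisher_endpoint(0.3, 1.0, rng=rng)` over many runs with a shared `rng` should give a value close to 0.3, reflecting the martingale property of the mass of a fixed interval.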

A flow of bridges construction of Kingman's coalescent with immigration
Let (B s,t ) s≤t be a standard flow of bridges.We now construct a version of Kingman's coalescent with immigration from (B s,t ) s≤t .Consider a Poisson point process on R × [0, 1] with intensity d dt ⊗ dx, and let (T i , U i ) i∈Z be its atoms, labeled in increasing order of their first coordinate such that T 0 < 0 < T 1 .Similarly to Section 2.1, the times (T i ) i∈Z correspond to immigration times of new particles.Here the sequence (U i ) i∈Z represents the location in the population of these immigrated particles.
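For each $t \in \mathbb{R}$, we define a partition $\bar\Pi_t$ of $\{i \in \mathbb{Z} : T_i \le t\}$ by setting
$$i \sim_{\bar\Pi_t} j \iff B^{-1}_{T_i,t}(U_i) = B^{-1}_{T_j,t}(U_j).$$
The following proposition shows that $(\bar\Pi_t)_{t \in \mathbb{R}}$ is distributed as Kingman's coalescent with immigration.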
Proposition 2.5.The process ( Πt ) t∈R defined from the flow of bridges is a version of Kingman's coalescent with immigration rate d.
Proof. The proof is almost identical to the proof of Corollary 1 of Bertoin and Le Gall (2003). The main difference is that here the flow of bridges is sampled at various times (T i ) i∈Z , while for the classical Kingman coalescent the flow of bridges is only sampled at an initial time.
We work conditionally on (T i ) i∈Z and consider these times as fixed.It is sufficient to show that for all N ∈ Z, between immigration times the blocks of ( ΠN t ) t∈R coalesce according to independent versions of Kingman's coalescent.
Let t ∈ R, and let (C 1 , . . ., C Mt ) be the blocks of ΠN t , where M t is the number of blocks, and where the blocks are labeled such that Similarly to Section 2.2, we can define the sequence of ancestors of ΠN t by setting and supplementing it with an independent sequence of i.i.d.uniform variables and let (V * i ) i≥1 be the sequence of ancestors of Π * , i.e., ), where (C * 1 , C * 2 , . . . ) denote the blocks of Π * labeled in increasing order of their minimal elements as above.Using the fact that for u where b(i) denotes the label of the block of ΠN tp to which i belongs.By independence of the increments of the flow of bridges, the bridge B tp,t p+1 is independent of the collection of variables ( ΠN t ) t≤T k , ΠN .In order to end the proof of the claim we need to distinguish two cases.First, suppose that t p+1 < T k+1 .Then, due to our labeling convention, we have that ) i≥1 (up to the auxiliary variables that play no role).Conversely, if t p+1 = T k+1 , then one of the variables (V * i ) i≥1 has to be replaced by the ancestor U k+1 of the block {k + 1}.More precisely, if ΠN T k+1 has M k+1 blocks, again by labeling convention, the block {k + 1} has label M k+1 .Thus, (V It is straightforward to see that as U k+1 is independent of all other variables, the sequence (V . ., ΠN t p+1 and thus that points (i) and (ii) of the claim hold.
where b(i) is the label of the block of ΠN T k to which i belongs. As the sequence (V The fact that these coalescents are independent is a consequence of the previous induction. This proves (iii), and ends the proof of the result.

Coupling erosion and immigration
We now explain the coupling between Kingman's coalescents with erosion and with immigration. Let $n \ge 1$, consider a Poisson point process $P^n$ on $\mathbb{R}$ with intensity $nd\,dt$ and let $(T_i)_{i \in \mathbb{Z}}$ be its atoms ordered increasingly such that $T_0 < 0 < T_1$. To each atom of the process we attach a uniform mark in $[n]$. We denote by $\ell_i$ the mark attached to $T_i$, so that $(\ell_i)_{i \in \mathbb{Z}}$ is a sequence of i.i.d. uniform variables on $[n]$.
Consider $t \in \mathbb{R}$. For each $k \in [n]$, let $\phi_t(k)$ be the label of the last atom of $P^n$ with mark $k$ before time $t$, i.e., $\phi_t(k) \in \mathbb{Z}$ is the unique $i$ such that $\ell_i = k$, $T_i \le t$, and there is no atom $T$ of $P^n$ with $T_i < T \le t$ carrying mark $k$. Let $(\bar\Pi_t)_{t \in \mathbb{R}}$ be Kingman's coalescent with immigration rate $nd$ built from the Poisson process $(T_i)_{i \in \mathbb{Z}}$ as in Section 2.1. We define a partition $\Pi^n_t$ of $[n]$ by setting $i \sim_{\Pi^n_t} j \iff \phi_t(i) \sim_{\bar\Pi_t} \phi_t(j)$. In words, $i$ and $j$ belong to the same block of $\Pi^n_t$ iff the last particles of $(\bar\Pi_t)_{t \in \mathbb{R}}$ with marks $i$ and $j$ have coalesced before time $t$. The key point of this construction is that $(\Pi^n_t)_{t \in \mathbb{R}}$ is distributed as Kingman's coalescent with erosion.
Proposition 2.6.The process (Π n t ) t∈R is a stationary version of the n-Kingman coalescent with erosion rate d.
Proof. By thinning, the set of atoms of $P^n$ with mark $k$ is a Poisson process on $\mathbb{R}$ with intensity $d\,dt$, and these processes are independent. Thus new atoms of $P^n$ with mark $k$ arrive at rate $d$. Let us consider what happens at such an arrival time. Suppose that $\ell_i = k$. Then, by definition, we have $\phi_{T_i}(k) = i$, as the atom $T_i$ has mark $k$. Moreover, the particle $i$ is a singleton of the partition $\bar\Pi_{T_i}$ (it is the particle that has newly immigrated). Thus at time $T_i$, the integer $k$ is removed from its block and placed in a singleton block. This is the description of an erosion event, which occurs at rate $d$.
Let us now describe the dynamics between immigration times. The atoms of $P^n$ that are the last atoms with their marks form a subset of the atoms of $P^n$. By sampling consistency of Kingman's coalescent, the restriction of the process $(\bar\Pi_t)_{t \in \mathbb{R}}$ to these atoms is also distributed as Kingman's coalescent. Thus any pair of blocks of such atoms coalesces at rate one, and so do the blocks of $(\Pi^n_t)_{t \in \mathbb{R}}$.
The fact that $(\Pi^n_t)_{t \in \mathbb{R}}$ is stationary follows from the stationarity of the Poisson point process.
Combined with the construction of Kingman's coalescent with immigration from the standard flow of bridges, this coupling gives an interesting construction of the stationary distribution of Kingman's coalescent with erosion.
Corollary 2.7. Let $(B_{s,t})_{s \le t}$ be a standard flow of bridges, and let $(T_i)_{i \ge 1}$ and $(U_i)_{i \ge 1}$ be independent sequences of i.i.d. exponential variables with parameter $d$, and of uniform variables respectively. Then the partition $\Pi$ defined by
$$i \sim_\Pi j \iff B^{-1}_{-T_i,0}(U_i) = B^{-1}_{-T_j,0}(U_j)$$
has the stationary distribution of Kingman's coalescent with erosion rate $d$.
Proof.Consider a Poisson process P n on R × [0, 1] with intensity nd dt ⊗ dx, and attach to each atom of P n a uniform mark on [n].If (T i , U i ) denotes the last atom of P n with mark i before t = 0, then T i is exponentially distributed with parameter d, U i is uniform on [0, 1], and all these variables are independent.A combination of Proposition 2.6 and Proposition 2.5 now proves the result.
Remark 2.8.The construction of Kingman's coalescent with immigration from Section 2.1 and the construction with the flow of bridges of Section 2.3 only rely on the sampling consistency of Kingman's coalescent.These constructions could be extended directly to a case where the coalescence events occur according to a Λ-coalescent (Pitman, 1999;Sagitov, 1999).In particular, the construction of the stationary distribution of Kingman's coalescent with erosion of Corollary 2.7 extends directly to Λ-coalescents with erosion if one replaces the standard flow of bridges by the corresponding Λ-flow of bridges.

Size of the blocks of Kingman's coalescent with immigration
In this section we study Kingman's coalescent with immigration.The main result we will show is the following.We prove this result by choosing k blocks uniformly from Πn 0 , and counting backwards in time the number of blocks that are ancestors of these blocks, i.e., that will further coalesce to form these blocks.We show that this process converges, under appropriate scaling, to k independent critical binary branching processes, yielding the result.
We first give a precise definition of the ancestral process counting the number of blocks in Section 3.1, along with its basic properties.The convergence is then carried out in Section 3.2.

The ancestral process
Let $(\tilde\Pi_t)_{t \in \mathbb{R}}$ be a version of Kingman's coalescent with immigration rate $d$. The process $(\tilde\Pi_t)_{t \in \mathbb{R}}$ is naturally endowed with a notion of ancestry between its blocks. For $t \in \mathbb{R}$, let $M_t$ be the number of blocks of $\tilde\Pi_t$, and let $(\tilde C_1, \dots, \tilde C_{M_t})$ be an enumeration of the blocks of $\tilde\Pi_t$. We say that this enumeration is exchangeable if, conditional on $\{M_t = k\}$, for any permutation $\sigma$ of $[k]$,
$$ (\tilde C_1, \dots, \tilde C_k) \overset{(d)}{=} (\tilde C_{\sigma(1)}, \dots, \tilde C_{\sigma(k)}). $$
We can always obtain an exchangeable enumeration of the blocks of $\tilde\Pi_t$ by relabeling any enumeration according to an independent uniform permutation.
For s ≤ t, consider Πt = ( C1 , . . ., CMt ) and Πs = ( C 1 , . . ., C Ms ) an enumeration of the blocks of Πt and Πs respectively.In Kingman's coalescent with immigration, a block present at time s can only coalesce with other blocks.Thus, for any block C i , there is a unique block Cj of Πt such that C i ⊆ Cj .We say that C i is an ancestor of Cj .
Definition 3.2. Let $(\tilde\Pi_t)_{t \in \mathbb{R}}$ be Kingman's coalescent with immigration, and let $(\tilde C_1, \dots, \tilde C_{M_0})$ be the blocks of $\tilde\Pi_0$ enumerated in an exchangeable order. For each $t \ge 0$ and $i \le M_0$, we define $A_t(i)$ to be the number of blocks of $\tilde\Pi_{-t}$ that are ancestors of $\tilde C_i$. We set $A_t(i) = 0$ for $i > M_0$. Then, defining $A_t := (A_t(1), A_t(2), \dots)$, the process $(A_t)_{t \ge 0}$ is called the ancestral process associated to $(\tilde\Pi_t)_{t \in \mathbb{R}}$.
The definition of the ancestral process is illustrated in Figure 2. The process $(A_t)_{t \ge 0}$ can be seen as a particle system where at time 0 there are $M_0$ particles with distinct types, and $(A_t(i))_{t \ge 0}$ records the number of particles with type $i$. As we have reversed time, each coalescence event now corresponds to the birth of a new particle, and each immigration event to the death of a particle.

Figure 2: Each black circle represents an immigration event, and the lines merge at the coalescence time of the blocks to which they correspond. At $t = 0$ the blocks of $\tilde\Pi_0$ are labeled according to the permutation $\sigma$, and the value of $(A_t)_{t \ge 0}$ is given below for some times.
Recall that $(M_t)_{t \in \mathbb{R}}$ stands for the number of blocks of $(\tilde\Pi_t)_{t \in \mathbb{R}}$ forward in time. For each $t \in \mathbb{R}$, we define $N_t := M_{-t}$, the number of blocks of $(\tilde\Pi_t)_{t \in \mathbb{R}}$ backwards in time. The process $(N_t)_{t \ge 0}$ also gives the number of particles of the ancestral process $(A_t)_{t \ge 0}$, that is, we have
$$ N_t = \sum_{i \ge 1} A_t(i). $$
The following proposition shows that the ancestral process is Markovian. This is a key feature that makes Kingman's coalescent with immigration easier to study than Kingman's coalescent with erosion.
Proposition 3.3. Let $(A_t)_{t \ge 0}$ be the ancestral process associated to Kingman's coalescent with immigration rate $d$, and let $(N_t)_{t \ge 0}$ be the number of particles of $(A_t)_{t \ge 0}$. Then $(A_t)_{t \ge 0}$ is a Markov process with initial condition $A_0(i) = 1$ for $i \le M_0$ and $A_0(i) = 0$ for $i > M_0$. Moreover, conditionally on $A_t$:
• each particle gives birth to a new particle of its type at rate $d/N_t$;
• each particle dies at rate $(N_t - 1)/2$.
The proof of Proposition 3.3 can be found in Appendix A; we only sketch it here. The process $(M_t)_{t \in \mathbb{R}}$ is a stationary birth-death process, with rates given in Corollary 2.2. A simple calculation shows that it is actually a reversible process, i.e., with our notation, that $(N_t)_{t \ge 0}$ is distributed as $(M_t)_{t \ge 0}$. When $(N_t)_{t \ge 0}$ jumps from $k$ to $k + 1$, a particle has given birth to a new particle. By exchangeability of our system, the particle that gives birth is chosen uniformly, i.e., each particle gives birth at the same rate $d/k$. Similarly, when $(N_t)_{t \ge 0}$ jumps from $k$ to $k - 1$, which happens at rate $k(k-1)/2$, a particle chosen uniformly from the population dies. Thus each particle dies at rate $k(k - 1)/(2k) = (k - 1)/2$.
Making the above argument rigorous involves counting the number of trajectories of ( Πt ) t∈R yielding a given trajectory of (A t ) t≥0 .We postpone it until Appendix A.
In order to prove Proposition 3.1, we need to keep track of the number of ancestors of k blocks chosen uniformly from Π0 .As we have chosen a uniform labeling of the blocks of Π0 , this amounts to considering the process (A t (1), . . ., A t (k); t ≥ 0).Proposition 3.3 directly gives us the distribution of this process.
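Corollary 3.4. The process $(A_t(1), \dots, A_t(p), N_t;\ t \ge 0)$ is a Markov process such that, conditional on $\{A_t(1) = a_1, \dots, A_t(p) = a_p, N_t = k\}$, the process jumps to:
• $(a_1, \dots, a_i + 1, \dots, a_p, k + 1)$ at rate $a_i d/k$, for each $i \le p$;
• $(a_1, \dots, a_i - 1, \dots, a_p, k - 1)$ at rate $a_i (k - 1)/2$, for each $i \le p$;
• $(a_1, \dots, a_p, k + 1)$ at rate $(k - a_1 - \dots - a_p)\, d/k$;
• $(a_1, \dots, a_p, k - 1)$ at rate $(k - a_1 - \dots - a_p)(k - 1)/2$.
Proof. We see from the expression of the transition rates of $(A_t)_{t \ge 0}$ that the rate at which each particle splits or dies only depends on the rest of the population through the total population size $N_t$. This is enough to prove the result.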
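As a small illustration of how these rates combine, the following Python sketch tabulates the jump rates of the tagged families together with the total population size. It assumes the per-particle rates discussed above (birth at rate $d/N_t$, death at rate $(N_t - 1)/2$); the function name `tagged_family_rates` and the dictionary output format are hypothetical choices made for this example, not notation from the text.

```python
def tagged_family_rates(a, k, d):
    """Jump rates of (A_t(1), ..., A_t(p), N_t) from the state (a, k).

    Assumption (taken from the per-particle rates in the text): each particle
    gives birth at rate d / k and dies at rate (k - 1) / 2, where k is the
    total number of particles.
    `a` is a tuple with the sizes of the p tagged families.
    Returns a dict mapping (new_a, new_k) to the corresponding rate.
    """
    a = tuple(a)
    if k < 1 or any(x < 0 for x in a) or sum(a) > k:
        raise ValueError("need k >= 1 and 0 <= sum(a) <= k")
    birth = d / k            # per-particle birth rate
    death = (k - 1) / 2.0    # per-particle death rate
    rates = {}
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        up = a[:i] + (ai + 1,) + a[i + 1:]
        down = a[:i] + (ai - 1,) + a[i + 1:]
        rates[(up, k + 1)] = ai * birth
        if death > 0:
            rates[(down, k - 1)] = ai * death
    rest = k - sum(a)        # particles outside the tagged families
    if rest > 0:
        rates[(a, k + 1)] = rest * birth
        if death > 0:
            rates[(a, k - 1)] = rest * death
    return rates
```

Summing the rates of all upward jumps should give back $d$, and summing the rates of all downward jumps should give $k(k-1)/2$, which are the jump rates of the total number of particles.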

Convergence
We now prove that the process $(A_t(1), \dots, A_t(p);\ t \ge 0)$ converges to independent critical binary birth-death processes when time is rescaled by a factor $1/\sqrt{n}$. We start with the following lemma.
Lemma 3.5. Let $M^n$ have the stationary distribution of $(M^n_t)_{t \ge 0}$, the number of blocks of Kingman's coalescent with immigration rate $dn$. The sequence $(M^n/\sqrt{n};\ n \ge 1)$ is tight.
Proof.Let n ≥ 1 and consider a birth-death process (X n t ) t≥0 such that conditional on {X n t = k}, the process jumps to • k + 1 at rate dn; where the death rate µ k is defined as The process (X n t − √ 2dn + 1 ; t ≥ 0) is distributed as a simple random walk, reflected at 0. Thus it admits a geometric stationary distribution with parameter γ n given by . This shows that the process (X n t ) t≥0 also admits a stationary distribution.If X n has the stationary distribution of (X n t ) t≥0 , then X n is distributed as , where Y n has a geometric distribution with parameter γ n .
Hence, for K and n large enough, we have Thus the sequence (X n / √ n; n ≥ 1) is tight.Recall that (M n t ) t≥0 is a birth-death process jumping from k to k + 1 at rate dn, and from k to k − 1 at rate k(k − 1)/2 ≥ µ k .Its stationary distribution is thus dominated by that of X n , and this proves the result.
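The tightness statement can be explored numerically. The sketch below assumes that the stationary distribution of the number of blocks is the one given by detailed balance for a birth-death chain on $k \ge 1$ with birth rate $dn$ and death rate $k(k-1)/2$, i.e., with weights proportional to $(2dn)^{k-1}/((k-1)!\,k!)$. The function names and the truncation level `kmax` are hypothetical choices for this example.

```python
from math import exp, log, sqrt


def stationary_mean_blocks(rate, kmax):
    """Mean of the stationary law of a birth-death chain on {1, 2, ...}
    with constant birth rate `rate` and death rate k(k-1)/2 from state k.

    The weights nu_k are built from detailed balance,
    nu_{k+1} / nu_k = 2 * rate / (k * (k + 1)), in log scale to avoid
    overflow, and the chain is truncated at `kmax`.
    """
    logw = [0.0]  # log-weight of state k = 1
    for k in range(1, kmax):
        logw.append(logw[-1] + log(2.0 * rate) - log(k) - log(k + 1))
    m = max(logw)
    w = [exp(x - m) for x in logw]
    z = sum(w)
    return sum((k + 1) * x for k, x in enumerate(w)) / z


def scaled_mean(n, d):
    """Stationary mean of M^n / sqrt(n) for immigration rate d * n."""
    kmax = int(10 * sqrt(2 * d * n)) + 50  # far beyond the bulk of the law
    return stationary_mean_blocks(d * n, kmax) / sqrt(n)
```

Under these assumptions, `scaled_mean(n, d)` stays bounded as `n` grows and appears to approach $\sqrt{2d}$, in line with the limit identified later in this section.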
We now prove our main convergence result. The proof will use a result from Chapter 11 of Ethier and Kurtz (1986) on the a.s. convergence of rescaled Markov processes. In order to stick to their notation, we introduce
$$ \hat A^n_t := A^n_{t/\sqrt{n}} \quad \text{and} \quad \hat N^n_t := N^n_{t/\sqrt{n}}. $$
Proposition 3.6. Let $(A^n_t)_{t \ge 0}$ be the ancestral process of Kingman's coalescent with immigration rate $dn$. Then
$$ (\hat A^n_t(1), \dots, \hat A^n_t(p);\ t \ge 0) \longrightarrow (X_1(t), \dots, X_p(t);\ t \ge 0) \quad \text{as } n \to \infty, $$
in the sense of convergence in distribution in the Skorohod space, where the processes $(X_1, \dots, X_p)$ are i.i.d. critical binary birth-death processes, with per-capita birth and death rate d/2.
Proof. We start by showing that the process $(\hat N^n_t/\sqrt{n};\ t \ge 0)$ converges to the constant process with value $\sqrt{2d}$. The process $(\hat N^n_t)_{t \ge 0}$ is a Markov process jumping from
• $k$ to $k + 1$ at rate $d\sqrt{n}$;
• $k$ to $k - 1$ at rate $k(k-1)/(2\sqrt{n})$.
Thus, the process $(\hat N^n_t)_{t \ge 0}$ is of the same form as the processes considered in Theorem 2.1 of Chapter 11 of Ethier and Kurtz (1986), except that the scaling is $\sqrt{n}$ and not $n$. Let us consider a stationary version of the process $(\hat N^n_t)_{t \ge 0}$. Lemma 3.5 shows that the sequence $(\hat N^n_0/\sqrt{n};\ n \ge 1)$ is tight. We can thus find an increasing sequence of indices $(n_k)_{k \ge 1}$ such that the subsequence $(\hat N^{n_k}_0/\sqrt{n_k};\ k \ge 1)$ converges in distribution to a limiting variable $N$. Using Skorohod's representation theorem (see e.g. Billingsley, 1999), we can assume that the convergence holds a.s. Applying Theorem 2.1 of Chapter 11 of Ethier and Kurtz (1986) shows that the sequence of processes $(\hat N^{n_k}_t/\sqrt{n_k};\ t \ge 0,\ k \ge 1)$ converges a.s. uniformly on compact sets to the solution of
$$ \dot x_t = d - \frac{x_t^2}{2} \qquad (2) $$
started from the random variable $N$. (The original theorem is given for a different scaling, but the proof is easily adapted to ours.) As each process $(\hat N^{n_k}_t)_{t \ge 0}$ is stationary, the limiting process is a stationary solution to (2), i.e., is the constant process with value $\sqrt{2d}$. This shows that each converging subsequence of $(\hat N^n_t/\sqrt{n};\ t \ge 0,\ n \ge 1)$ converges to the same constant process, and thus that the entire sequence converges.
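To make the role of the limiting equation concrete, here is a minimal numerical sketch. It assumes that the limit of the rescaled number of particles solves $\dot x = d - x^2/2$, which is what one obtains formally from an upward jump rate $d\sqrt{n}$ and a downward jump rate $k(k-1)/(2\sqrt{n})$ with $x = k/\sqrt{n}$; this form is our reading of the rates and should be treated as an assumption. The explicit Euler scheme, the function name and the step size are illustrative choices.

```python
from math import sqrt


def solve_limit_ode(x0, d, t_end=20.0, dt=1e-3):
    """Explicit Euler integration of dx/dt = d - x**2 / 2 on [0, t_end].

    Assumption: this is the deterministic limit of the rescaled number of
    particles. The initial value x0 should be nonnegative, since negative
    starting points blow up in finite time for this equation.
    """
    x = float(x0)
    for _ in range(int(round(t_end / dt))):
        x += dt * (d - 0.5 * x * x)
    return x


def stationary_point(d):
    """Positive zero of x -> d - x**2 / 2, i.e. sqrt(2 d)."""
    return sqrt(2.0 * d)
```

For any nonnegative starting point the numerical solution settles at `stationary_point(d)`, which illustrates why a stationary solution has to be the constant $\sqrt{2d}$.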
Let us now prove the convergence of the ancestral processes.Consider independent Poisson processes (P − i (t)) t≥0 , (P + i (t)) t≥0 for i ≤ p, and (P − N (t)) t≥0 , (P + N (t)) t≥0 .Using e.g.Theorem 4.1 from Chapter 6 of Ethier and Kurtz (1986), there exists a unique strong solution to the following equation Moreover, this solution (X n 1 , . . ., X n p , Y n ) is distributed as ( Ân t (1), . . ., Ân t (p), N n t ; t ≥ 0).As Y n / √ n converges in probability to the constant process with value √ 2d, we can find a subsequence such that holds uniformly in t on compact sets.This is sufficient to show that for each i ≤ p, the subsequence of processes (X n i (t)) t≥0 converges a.s. in the Skorohod space to the solution This proves that the entire sequence (X n 1 , . . ., X n p ) converges in probability in the Skorohod topology to the solution of the previous equation.Finally, noting that the solutions of these equations are independent and distributed as critical binary branching processes with branching rate d/2 ends the proof.
Proof of Theorem 1.4.(i) We start by proving that M n / √ n converges to √ 2d in probability.Let us consider a version Πn of the stationary distribution of Kingman's coalescent with immigration rate nd, coupled with a version Π n of the stationary distribution of Kingman's coalescent with erosion rate d on [n].Let M n , resp.M n , denote the number of blocks of Πn , resp.Π n .Recall that the blocks of Π n are subsets of the blocks of Πn , where a particle is retained if there are no other particles with the same label that have immigrated after it.Let | Cn | be the size of a block of Πn chosen uniformly, and let |C n | be the size of the corresponding block of Π n .Some blocks of Πn are only composed of particles that are not retained to form Π n .Such blocks have no corresponding blocks in Π n , and M n − M n is exactly the number of such blocks.Thus This shows that M n / M n goes to 1 in probability.Lemma 3.5 further shows that M n / √ n goes to √ 2d in probability, and thus that M n / √ n also goes to √ 2d in probability.
(ii) We prove the second point using the method of moments.Let (|C n 1 |, . . ., |C n p |) be the sizes of k uniformly sampled blocks of Π n .Then, as the number of blocks M n goes to infinity, we have that where J is the total progeny of a binary critical branching process.The convergence of the moments readily implies convergence in distribution as the limit is a Dirac mass.

Asymptotic frequencies of Kingman's coalescent with erosion
In this section we prove Theorem 1.3, which gives a representation of the asymptotic frequencies in terms of hierarchically independent diffusions.First, we use the flow of bridges construction of Kingman's coalescent with erosion from Corollary 2.7 to give a correspondence between the frequencies of the blocks and the size of the families of a Fleming-Viot process.

Eves of a Fleming-Viot process
Let $(\rho_t)_{t \ge 0}$ be a Fleming-Viot process. For each individual $x \in [0, 1]$, denote by
$$ \zeta(x) = \inf\{t \ge 0 : \rho_t(\{x\}) = 0\} $$
the extinction time of the offspring of $x$. It is clear that the set
$$ \{x \in [0, 1] : \zeta(x) > 0\} $$
is countable. The elements of this set can actually be enumerated in decreasing order of their extinction time, that is, they can be written $(e_i)_{i \ge 1}$ with $\zeta(e_1) > \zeta(e_2) > \dots$ This fact can be found e.g. in Labbé (2014), Theorem 1.6. The sequence $(e_i)_{i \ge 1}$ is called the sequence of Eves of $(\rho_t)_{t \ge 0}$, and was introduced in Bertoin and Le Gall (2003) and Labbé (2014); see also Duquesne and Labbé (2014) for a similar notion for Continuous-State Branching Processes. The following result shows that the frequencies of the blocks of the stationary distribution of Kingman's coalescent with erosion can be recovered from the size of the offspring of the Eves.
Lemma 5.1.Let (e i ) i≥1 be the Eves of a Fleming-Viot process (ρ t ) t≥0 .Then the nonincreasing reordering of the sequence (z i ) i≥1 defined as is distributed as the frequencies of the blocks of the stationary distribution of Kingman's coalescent with erosion rate d.
Proof. Consider a flow of bridges $(B_{s,t})_{s \le t}$, and let $(T_i)_{i \ge 1}$, $(U_i)_{i \ge 1}$ be two independent i.i.d. sequences of exponential variables with parameter $d$ and of uniform variables, respectively. Again, let $\Pi$ be the partition of $\mathbb{N}$ defined as in Corollary 2.7, which has the stationary distribution of Kingman's coalescent with erosion. We denote by $\Pi = (C_1, C_2, \dots)$ the blocks of $\Pi$, ordered in increasing order of their least elements, i.e., such that $i \le j \iff \min(C_i) \le \min(C_j)$.

Then let us call
$$ A_i = B^{-1}_{-T_j,0}(U_j), \quad \forall j \in C_i, $$
the ancestor of the block $C_i$.
As the flow of bridges (B s,t ) s≤t is independent of the sequences (U i ) i≥1 and (T i ) i≥1 , the sequence (B −1 −T i ,0 (U i )) i≥1 is exchangeable.Thus, the law of large numbers shows that for any i ≥ 1, Thus the result is proved if we can show that a.s.
Clearly we have $\zeta(A_i) > 0$, as otherwise the frequency of the block $C_i$ would be zero. Moreover, conditionally on the flow of bridges, there exists a.s. some $j \ge 1$ such that
$$ (U_j, T_j) \in \{(x, t) : B^{-1}_{-t,0}(x) = e_i\}, $$
as by definition of $e_i$ this set has positive Lebesgue measure. Thus, a.s. $e_i$ is the ancestor of some block of $\Pi$, and the result is proved.
In order to prove Theorem 1.3, it remains to show that the sequence of processes ρ t ({e 1 }), ρ t ({e 2 }), . . .; t ≥ 0 has the same distribution as the sequence of hierarchically independent diffusions introduced in Section 1.3.In the following section we characterize this distribution, and complete the proof in the last section.
Remark 5.3.The time τ 1 (t) is infinite with positive probability.However, each of the processes (X 2 , . . ., X n ) has an a.s.limit as t goes to infinity.On the event {τ 1 (t) = ∞}, we take X i (τ 1 (t)) to be this limit, so that the process (Z 1 , . . ., Z n ) is now well-defined.
Before proving Lemma 5.2, we need the following fact that we prove for the sake of completeness.
Lemma 5.4.Let (W t ) t≥0 be a Brownian motion on R started at 1, and let T 0 be the first time it hits 0. Then for α ∈ R, a.s.
for a Brownian motion ( Wt ) t≥0 with the convention that inf O = ∞ and ξ ∞ = −∞.The Lamperti representation of positive self-similar processes (Lamperti, 1972) shows that W t stopped at T 0 satisfies the equality in distribution Thus and which yields the result.
Proof of Lemma 5.2.Consider a n-dimensional Wright-Fisher diffusion (X 1 , . . ., X n ).A calculation of Doob's h-transform using the harmonic function shows that the process (X 1 , . . ., X n ) conditioned on {lim t→∞ X 1 (t) = 1} = {ζ 1 = ∞} is distributed as the unique solution to the equation where (W i,j ) i<j are independent Brownian motions, and W i,j = −W j,i .We will prove that the process (Z 1 , . . ., Z n ) solves this equation.

Now consider a
We start by giving the equation solved by the process (Y 1 , X 2 • τ 1 , . . ., X n • τ 1 ).Notice that here, only a subset of the processes are time-changed, and that τ 1 explodes in finite time.For these two reasons, let us realize the time-change carefully.
We transform τ 1 into a family of finite stopping times.Our first task is to prove that τ 1 goes continuously to infinity, we do this using the speed and scale measures of the diffusion Y 1 , see e.g.Etheridge (2011) Thus we can write that where W 1 is a Brownian motion started at 1/Y 1 (0), and T 1 is the first time when W 1 hits 1.We now know from Lemma 5.4 that this integral is a.s.infinite, and thus that τ 1 goes continuously to infinity, and does not "jump to infinity".Further consider the times ∀i ≥ 2, S i = inf{t ≥ 0 : X i (t) = 1}, S = min(S 2 , . . ., S n ).
At time S, one of the families has reached fixation, and thus for t ≥ S we have X i (t) = X i (S).
Therefore, for all t ≥ 0, we have X i (τ 1 (t)) = X i (τ 1 (t)∧S), where the stopping time τ 1 (t)∧S is now a.s.finite, and t → τ 1 (t)∧S is continuous.(The continuity requires that τ 1 does not jump to infinity.)Thus, by making a time-change in the following integrals, see e.g.Kallenberg (2002), Theorem 17.24, we obtain A direct computation of the quadratic variations gives ∀i, j, t ≥ 0, [ Wi,j , Wi,j ] t = t ∧ S, and the crossed variations are null.Thus a multidimensional version of Dubins-Schwarz theorem, see e.g.Theorem 18.4 in Kallenberg (2002), shows that we can find independent Brownian motions ( Ŵi,j ) i<j such that Wi,j (t) = Ŵi,j (t∧S).This proves that the time-changed processes solve A final application of Itô's formula shows that the process (Z 1 , . . ., Z n ) as defined above solves the same equation as (X 1 , . . ., X n ) conditioned on {ζ 1 = ∞}.This proves the result.
We can now proceed inductively.Let us set up the notation for the proof.Consider i.
We then define recursively, for i ds.

We finally set
We end this section by pointing out the following fact that will be required in the next section.We have only defined the Wright-Fisher diffusion conditioned on its extinction order for an initial condition (x 1 , . . ., x n ) such that for all 1 ≤ i ≤ n, x i > 0. Nevertheless, the processes Y i have an entrance boundary at 0. Thus there exists a unique extension of the process (Y 1 , . . ., Y n−1 ) started from (0, . . ., 0) that remains Feller, see e.g.Kallenberg (2002), Chapter 23.This shows that a Wright-Fisher diffusion conditioned on its fixation order ( Z1 , . . ., Zn ) admits a Feller extension for the initial condition (0, . . ., 0, 1).

Proof of Theorem 1.3
Let (ρ t ) t≥0 be a Fleming-Viot process, and let (e i ) i≥1 be its Eves.In this section we end the proof of Theorem 1.3 by showing that the distribution of the sequence of processes (ρ t ({e 1 }), ρ t ({e 2 }), . . .; t ≥ 0) is that of a Wright-Fisher diffusion conditioned on its fixation order.
The result we want to prove is the direct extension of Theorem 4 of Bertoin and Le Gall (2003).Reformulated in our setting, this theorem proves that (ρ t ({e 1 }); t ≥ 0) is distributed as the solution to eq. ( 3) started from 0. We now give a similar representation for the process (ρ t ({e 1 }), . . ., ρ t ({e n }); t ≥ 0) giving the size of the progeny of the first n Eves.
Proof.We realize a similar computation as in the proof of Theorem 4 of Bertoin and Le Gall (2003).The proof requires three facts.First notice that Then, if I 1 , . . ., I n are n disjoint intervals of length 1/m, due to exchangeability of the increments of bridges, the process (ρ t (I 1 ), . . ., ρ t (I n ); t ≥ 0) is distributed as the process . Finally, notice that on the event {∀i = j ∈ {1, . . ., n}, m e i = m e j }, conditioning the process ρ t 0, 1 m , . . ., ρ t n−1 m , n m ; t ≥ 0 on its extinction order as in Section 5.2 is equivalent to conditioning it on the location of the Eves, i.e., on the event ∀k ∈ {1, . . ., n}, e k ∈ k−1 m , k m .We can now proceed to the calculation.Let 0 ≤ t 1 < • • • < t p and let ϕ 1 , . . ., ϕ p be continuous bounded functions.Consider ( Z1 , . . ., Zn+1 ) a (n + 1)-dimensional Wright-Fisher diffusion conditioned on its extinction order.Then where, the last line comes from the Feller property of the process ( Z1 , . . ., Zn+1 ).
Our current proof of Theorem 1.3 relies on calculations specific to the Wright-Fisher diffusion.We end this section by discussing a potential alternative proof of this result that would more easily generalize to Beta-coalescents.
The Feller branching diffusion describes the size of a population where different individuals die and reproduce independently. Similarly to the Fleming-Viot process, it is possible to define a measure-valued process, called the Dawson-Watanabe process, that encodes the size of the offspring of each individual in the initial population, see e.g. Etheridge (2000). (Note that there are no mutations here, i.e., no spatial motion of the particles.) Its total mass is then distributed as a Feller diffusion. Starting from a Dawson-Watanabe process, one can renormalize it by its total mass to obtain a process valued in the space of probability measures. The resulting renormalized process is then distributed as a time-changed Fleming-Viot process, see Birkner et al. (2005).
Let us now discuss the results of Section 5.2 in the light of this new construction.The key point of Section 5.2 is that after removing one family from a Fleming-Viot process and renormalizing the remainder of the population to have mass one, the resulting process remains distributed as an independent time-changed Fleming-Viot process.Suppose that the Fleming-Viot process has been obtained by renormalizing a Dawson-Watanabe process.Then removing a family from the Fleming-Viot process amounts to removing a family from the original Dawson-Watanabe process.By the branching property, removing this family does not change the distribution of the remainder of the population, which remains distributed as an independent Dawson-Watanabe process.Thus when renormalizing the remainder of the population to have size one, we obtain a new time-changed Fleming-Viot process, independent of the removed family.In other words, the results of Section 5.2 essentially originate from the fact that the Fleming-Viot process can be seen as a renormalized branching measure-valued process.
A similar link has been obtained in Birkner et al. (2005) between the Λ-Fleming-Viot processes associated to Beta-coalescents and a family of α-stable measure-valued branching processes.Thus we believe that one could derive a similar, but less explicit, representation of the asymptotic frequencies of the stationary distribution of the Beta-coalescents with erosion than the one obtained in Theorem 1.3.
Once the forest representing $(\tilde\Pi_{-n}, \dots, \tilde\Pi_0)$ is built, by construction the nodes corresponding to $\tilde\Pi_0$ all belong to different trees. We set them to be the roots of their respective trees, and label them according to the permutation $\sigma$. (Notice that the resulting forest is endowed with some additional structure: the nodes added along the procedure are totally ordered by the induction step at which they have been added.)

Counting trajectories of $(\tilde\Pi_{-n}, \dots, \tilde\Pi_0)$ now amounts to counting forests. Instead of building the forests by starting from the leaves as above, we build a forest with ancestral sequence $(a_0, \dots, a_n)$ by starting from the roots. Initially, consider a set of $|a_0|$ roots, labeled by $\{1, \dots, |a_0|\}$, that represent the particles of $a_0$. Nodes can be in two states: active or inactive. Active nodes represent the particles that are still alive in the population, while inactive nodes represent the dead particles. Initially all roots are active. We build the forest recursively. Suppose that at step $k$ we have built a forest such that for all $i$ there are $a_k(i)$ nodes that are active in the tree with root $i$. If a particle with label $\ell_k$ has died from $a_k$ to $a_{k+1}$, we inactivate one of the active nodes belonging to the tree with root $\ell_k$. There are $a_k(\ell_k)$ such nodes. Similarly, if a particle has split from $a_k$ to $a_{k+1}$, we inactivate one node in the tree $\ell_k$, and connect it to two new active nodes. There are again $a_k(\ell_k)$ active nodes to choose from in the tree $\ell_k$. After step $n$, we have built a forest with ancestral sequence $(a_0, \dots, a_n)$. We assign the blocks of $\tilde\Pi_{-n}$ to the remaining active nodes of the forest by choosing one of the $|a_n|!$ permutations of the blocks.
There are $|a_n|!\, a_0(\ell_0) \cdots a_{n-1}(\ell_{n-1})$ outputs of the previous construction, and all forests with ancestral sequence $(a_0, \dots, a_n)$ can be obtained that way. However, due to symmetries, some forests can be obtained multiple times through this construction. More precisely, at each birth event, the two daughter nodes are indistinguishable. Interchanging the trees corresponding to the offspring of these two nodes yields the same forest. Thus, the actual number of forests with ancestral sequence $(a_0, \dots, a_n)$ is
$$ \frac{|a_n|!}{2^b}\, a_0(\ell_0) \cdots a_{n-1}(\ell_{n-1}), $$
where $b$ is the number of birth events, and the result is proved.
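The counting argument can be turned into a short computation. The sketch below evaluates the count $|a_n|!\, a_0(\ell_0) \cdots a_{n-1}(\ell_{n-1}) / 2^b$ from an initial configuration and a list of events. The encoding of events as `("birth", root)` or `("death", root)` pairs with zero-based roots, and the function name `count_forests`, are hypothetical conventions introduced only for this example.

```python
from fractions import Fraction
from math import factorial


def count_forests(a0, events):
    """Number of forests with a given ancestral sequence.

    a0     : list with the initial number of active nodes per root
             (typically all ones).
    events : list of (kind, root) pairs, with kind in {"birth", "death"}
             and root a zero-based index into a0; event number k acts on
             the tree with root l_k.
    Returns |a_n|! * prod_k a_k(l_k) / 2**b as a Fraction, where b is the
    number of birth events and |a_n| the final number of active nodes.
    """
    a = list(a0)
    choices = 1   # product of a_k(l_k): ways to pick the affected node
    births = 0
    for kind, root in events:
        if a[root] <= 0:
            raise ValueError("no active node left in this tree")
        choices *= a[root]
        if kind == "birth":
            a[root] += 1      # one node replaced by two active daughters
            births += 1
        elif kind == "death":
            a[root] -= 1      # one node becomes inactive
        else:
            raise ValueError("unknown event kind: %r" % (kind,))
    return Fraction(factorial(sum(a)) * choices, 2 ** births)
```

For a single root and only birth events, the count reduces to the number of binary trees with labeled leaves and ordered internal nodes, which gives a quick sanity check of the formula on small cases.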
Lemma A.2. Let $(M_t)_{t \in \mathbb{R}}$ be the process counting the number of blocks of Kingman's coalescent with immigration. Then $(M_t)_{t \in \mathbb{R}}$ is a reversible process.
Proof. Let us compute the stationary distribution of $(M_t)_{t \in \mathbb{R}}$. As $(M_t)_{t \in \mathbb{R}}$ jumps from $k$ to $k + 1$ at rate $d$ and from $k$ to $k - 1$ at rate $k(k - 1)/2$, a usual calculation shows that its stationary distribution $(\nu_k)_{k \ge 1}$ is
$$ \nu_k = \frac{1}{Z}\, \frac{(2d)^k}{k!\,(k-1)!}, $$
where the renormalization constant $Z$ is obtained by summing over all the terms. A direct calculation now proves that $(\nu_k)_{k \ge 1}$ fulfills the detailed balance equation
$$ \nu_k\, d = \nu_{k+1}\, \frac{k(k+1)}{2}, \quad k \ge 1, $$
and thus that $(M_t)_{t \in \mathbb{R}}$ is reversible.
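The detailed balance computation can be checked numerically. The sketch below builds the weights from the recursion $\nu_{k+1}/\nu_k = 2d/(k(k+1))$, which is forced by the jump rates $d$ and $k(k-1)/2$ stated in the proof, and then compares the upward and downward probability fluxes. The truncation level `kmax` and the function names are hypothetical choices for this example.

```python
from math import exp, log


def stationary_distribution(d, kmax=60):
    """Truncated stationary law (nu_1, ..., nu_kmax) of a birth-death chain
    jumping from k to k + 1 at rate d and from k to k - 1 at rate k(k-1)/2.

    Weights are accumulated in log scale from the detailed balance recursion
    nu_{k+1} / nu_k = 2 d / (k (k + 1)); index 0 of the output is k = 1.
    """
    logw = [0.0]
    for k in range(1, kmax):
        logw.append(logw[-1] + log(2.0 * d) - log(k) - log(k + 1))
    m = max(logw)
    w = [exp(x - m) for x in logw]
    z = sum(w)
    return [x / z for x in w]


def detailed_balance_gap(nu, d, kcheck=30):
    """Largest relative gap between the flux nu_k * d (up from k) and the
    flux nu_{k+1} * k (k + 1) / 2 (down from k + 1), for k = 1..kcheck."""
    gap = 0.0
    for k in range(1, kcheck + 1):
        up = nu[k - 1] * d
        down = nu[k] * k * (k + 1) / 2.0
        gap = max(gap, abs(up - down) / max(up, down))
    return gap
```

A gap at the level of floating-point rounding indicates that the two fluxes agree, which is the reversibility property used for the backward process $(N_t)_{t \ge 0}$.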