Asymptotics of the frequency spectrum for general Dirichlet Ξ-coalescents

In this work, we study general Dirichlet coalescents, a family of Ξ-coalescents constructed from i.i.d. mass partitions that extends the symmetric coalescent. This class of models is motivated by population models with recurrent demographic bottlenecks. We study the short-time behavior of the multidimensional block counting process, whose ith component counts the number of blocks of size i. In contrast with standard coalescent models (such as the class of Λ-coalescents coming down from infinity), our process has no deterministic speed of coming down from infinity. In particular, we prove that, under an appropriate rescaling, it converges to a stochastic process which is the unique solution of a martingale problem. We also characterize its multivariate Lamperti-Kiu transform.

In this work, we consider a particular subclass of Ξ-coalescents where the interval partition of the paintbox has a generalized version of a Dirichlet distribution with a random number of components. More precisely, consider a sequence of non-negative numbers (R(k); k ∈ N) and a probability measure m on (0, ∞). Then, at rate R(k), generate a partition (p_1^{(k)}, …, p_k^{(k)}) := (w_1/s_k, …, w_k/s_k), where (w_1, …, w_k) are i.i.d. random variables with law m on (0, ∞), s_k := w_1 + ··· + w_k and [k] := {1, …, k}. As in the previous paintbox construction, blocks are assigned a uniform random variable, and we merge all the blocks falling in the same interval. This corresponds to a Ξ-coalescent whose characteristic finite measure Ξ on the infinite simplex ∆ is described as follows. For every k ∈ N, define ν_k, a probability measure on the infinite simplex ∆, by ν_k = L((w_1/s_k, w_2/s_k, …, w_k/s_k, 0, 0, …)), where L(X) denotes the law of the random variable X. Then, for every measurable B ⊂ ∆, the measure Ξ(B) is obtained by mixing the measures ν_k according to the rates R(k). The case where the w_i's are Gamma distributed corresponds to the standard Dirichlet mass partition. In particular, if the w_i's are exponentially distributed, it corresponds to a symmetric Dirichlet distribution. We refer to this model as the general Dirichlet coalescent. The name Dirichlet coalescent was coined in [23], where the authors consider a paintbox construction according to a Dirichlet distribution with a fixed number of components.
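A single merging event of this paintbox construction can be sketched in code. The following is a minimal simulation, assuming exponential weights as an illustrative choice of m (the symmetric Dirichlet case); the function name and interface are ours, not the paper's.

```python
import random
from collections import defaultdict

def dirichlet_merge_event(blocks, k, weight_sampler):
    """One k-event of the paintbox construction: draw i.i.d. weights
    w_1,...,w_k, normalise them into a mass partition
    p = (w_1/s_k, ..., w_k/s_k), drop each block into the interval
    (box) containing an independent uniform variable, and merge the
    blocks sharing a box."""
    w = [weight_sampler() for _ in range(k)]
    s_k = sum(w)
    p = [wi / s_k for wi in w]
    boxes = defaultdict(list)
    for block in blocks:
        u = random.random()
        acc, j = 0.0, 0
        for j, pj in enumerate(p):  # find the interval u falls into
            acc += pj
            if u < acc:
                break
        boxes[j].append(block)
    # blocks in the same box coalesce into a single block
    return [sorted(b for blk in grp for b in blk) for grp in boxes.values()]

random.seed(1)
blocks = [[i] for i in range(10)]  # start from 10 singletons
merged = dirichlet_merge_event(blocks, k=3,
                               weight_sampler=lambda: random.expovariate(1.0))
assert sum(len(b) for b in merged) == 10  # no lineage is lost
assert len(merged) <= 3                   # at most k blocks remain
```

A k-event can therefore only reduce the number of blocks to at most k, which is the mechanism behind the discussion of coming down from infinity below.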
Another example of such a process is the symmetric coalescent defined in [24], which corresponds to the case where w_i = 1 a.s. In that case, in order to correspond to the paintbox construction described above, the sequence R must satisfy Σ_k R(k)/k < ∞, see [24]. Here, we assume that m has a finite second moment and that the rate of k-events has a heavy tail, in the following sense.
In fact, a Ξ-coalescent is well defined if the rate at which two given blocks merge into one is finite [41]. It is easy to see from the paintbox construction that, since the vector (p_1^{(k)}, …, p_k^{(k)}) is exchangeable, this rate equals Σ_k R(k) k E((p_1^{(k)})^2), which is finite under our assumptions. Suppose that the general Dirichlet coalescent starts with n singletons. Denote by μ^n_t = (μ^n_t(1), …, μ^n_t(n)) the vector such that μ^n_t(i) is the number of blocks containing i elements at time t, and denote by |μ^n_t| the total number of blocks. Define the rescaled vector (µ^n_t; t ≥ 0) by rescaling time by n^{α−1} and the number of blocks by n (see (2)). In this paper, we aim at studying the limiting behavior of the Markov process (µ^n_t; t ≥ 0) as n → ∞. We prove (in Theorem 2) that it converges towards a stochastic process (µ_t; t ≥ 0), defined as the unique solution to a martingale problem associated with a continuous coagulation operator (see Theorem 1).
Intuitively, the result can be understood as follows. In the paintbox construction, when there are n lineages, a k-merging event corresponds to throwing n balls into k boxes (with probabilities (p_1^{(k)}, …, p_k^{(k)})) and merging the balls that land in the same box. For k ≫ n, the chance that a non-trivial merging occurs is negligible, whereas for k ≪ n, all lineages are merged into a few lineages (which disappear when rescaling the number of blocks by n). The total rate of k-events with εn ≤ k ≤ Mn, for some small ε > 0 and large M < ∞, can be approximated by ∫_{εn}^{Mn} ρ y^{−α} dy = C n^{1−α} for some constant C, which explains why time is slowed down by this factor. The heuristics behind the form of the coagulation operator, which is the central part of the generator of the limit process, are explained in Section 2.2.
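The dichotomy between k ≫ n and k ≪ n can be checked with a quick simulation of the symmetric case p_j = 1/k; the function name and parameter values below are purely illustrative.

```python
import random

def blocks_after_event(n, k, seed=0):
    """Throw n balls uniformly into k boxes (the symmetric case
    p_j = 1/k) and return the number of occupied boxes, i.e. the
    number of blocks left after one k-merging event."""
    rng = random.Random(seed)
    occupied = set(rng.randrange(k) for _ in range(n))
    return len(occupied)

n = 1000
few = blocks_after_event(n, k=10)       # k << n: collapses to ~k blocks
many = blocks_after_event(n, k=100000)  # k >> n: almost no merging
assert few <= 10
assert many > 0.95 * n
```

In the k ≫ n regime the expected number of collisions is of order n²/(2k), which is why almost all lineages survive a single event there.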
We also show that this limit process is self-similar with negative index β := α − 1 (see Theorem 3). In particular, the limit of the rescaled block counting process (|µ_t|; t ≥ 0) is the exponential of a time-changed subordinator. As a direct corollary, if we define A_t := inf{s > 0 : ∫_0^s |µ_u|^{−β} du > t} and set ξ_t := −log |µ_{A_t}|, then (ξ_t; t ≥ 0) is a subordinator. The law of the subordinator can be identified as a direct consequence of Theorem 5 (see Section 7). This shows that the short-time behavior of the block counting process remains stochastic. This is in sharp contrast with previous studies, where it is shown that classical models (such as Λ-coalescents) exhibit a deterministic speed of coming down from infinity; see Section 1.2 for a more detailed discussion. Our result can be interpreted as a stochastic speed of coming down from infinity.

Speed of coming down from infinity
We say that a Ξ-coalescent comes down from infinity if there are finitely many blocks at any time t > 0 almost surely, even if the coalescent is started with infinitely many blocks. In his original work, Schweinsberg already established a criterion for coming down from infinity [41].
In the case where the characteristic measure of the coalescent is supported on the set of finite mass partitions ∆* = {p = (p_1, …, p_k) : Σ_{j=1}^k p_j = 1, for some k} (which is our case of interest), the process comes down from infinity if and only if the total rate of coagulation events is infinite. Otherwise, the number of blocks stays infinite for a finite amount of time, namely until the first coagulation event. As another example, coalescents whose characteristic measures are supported only on infinite mass partitions, i.e. for which Ξ(∆*) = 0, either come down from infinity or always stay infinite. We consider coalescents supported on ∆* that come down from infinity. Limic [34] studied the small-time behavior of Ξ-coalescents under what she called a regularity assumption. In this setting, and starting with infinitely many lineages, there exists a speed of coming down from infinity, i.e., a deterministic function ν_Ξ(t), finite for all t > 0, such that, if |μ_t| is the number of blocks at time t in a coalescent starting with infinitely many lineages, then |μ_t|/ν_Ξ(t) → 1 almost surely as t → 0. This mirrors the behavior of the class of Λ-coalescents coming down from infinity [5, 4]. To summarize, most previous studies have shown that the block counting process of a large class of exchangeable coalescents exhibits a deterministic behavior at small time scales.
In the present work, we consider Ξ-coalescents belonging to the first family (for which Ξ is supported on ∆*), and which come down from infinity (see [41], Section 5.5). We take a different approach, since we study the rescaled number of blocks, starting from n lineages, as n → ∞. In our case, when time is rescaled by n^{α−1}, the block counting process converges to a stochastic self-similar process, so there is no deterministic speed of coming down from infinity.
Our results have a similar flavor to those of Haas and Miermont [25] for Λ-coalescents with dust, and of Möhle and co-authors [23, 37] for a class of Ξ-coalescents with dust. In the first work, a self-similar behavior of the rescaled number of blocks is obtained in the limit. In the second, the authors prove that the frequency of singletons, as well as the number of blocks rescaled by n, converges to the exponential of a subordinator (without any time rescaling). A natural prospect of research would be to identify conditions that would partition Ξ-coalescents (coming down from infinity) into two main classes: a first class with a deterministic limiting behavior, and a second one with a stochastic descent from infinity.

Perspectives on coming down from infinity
Our results deal with processes valued in the partitions of n as n goes to infinity. Although this is heuristically related to the case n = ∞, which corresponds to working with partitions of N, we expect that there are important technical challenges when studying the process starting with infinitely many blocks. To be precise, the latter would require an entrance law at infinity for the limit of the multidimensional block counting process. In our approach, we avoid this problem by rescaling the block counting process by n, so that |µ^n_0| = 1 and there is no need for entrance laws for the limit process (µ_t; t ≥ 0).
The study of entrance laws of self-similar Markov processes has recently been an active area of research. In the one-dimensional case there is an extensive literature (see for example [7, 10, 19] and the references therein). Recent results in the finite-dimensional case can be found in [31]. This is also a classic problem for Markov additive processes [16]. We believe that our results can motivate the study of entrance laws for infinite dimensional self-similar processes.

Biological motivation
The symmetric coalescent [24] can be obtained as the limiting genealogy of a Wright-Fisher population that undergoes rare recurrent bottlenecks, reducing the population size to a random number k of individuals for only one generation. In this case, the second point of Assumptions 1 always holds, and the first point is fulfilled if the measure characterizing the size of the bottlenecks has power tails of order α. General Dirichlet coalescents naturally arise in an extension of this model, which can be seen as allowing multinomial non-exchangeable reproductive events [44].
The analysis of the asymptotics of the multidimensional block counting process allows us to characterize the limiting behavior of the Site Frequency Spectrum (SFS) of our family of Ξ-coalescents as a functional of the limit process (µ_t; t ≥ 0). The SFS is one of the most widespread statistics in population genetics. It consists of a vector of size n − 1 whose ith component counts the number of mutations that are shared by exactly i individuals in a sample of size n. We suppose that mutations occur at a constant rate over the coalescent tree started with n individuals, so that the SFS is closely related to its branch lengths. In general, this is a complex combinatorial problem, and most of the previous works have relied on approximations of the short-time behavior of the block counting process to derive asymptotics for the lower part of the SFS (i.e., the number of singletons, pairs, etc.). Some examples are [18] for the case of the Kingman coalescent, [5, 13] for coalescents with multiple collisions, such as Beta-coalescents, and [2, 28, 20, 29] for the special case of the Bolthausen-Sznitman coalescent. For fixed n, some studies on the law of the SFS can be found in [22, 26, 29].
There are few results available regarding the SFS of Ξ-coalescents. Works such as [14, 42] present computational algorithms based on recursions to derive the expected SFS for finite n. Asymptotic properties of Ξ-coalescents started with n lineages have been studied previously, in particular regarding the number of blocks [35]. Theorem 4 describes the asymptotics of the SFS for general Dirichlet coalescents.

Notation
Let us start this section with some notation. We denote by N the positive integers and by N_0 the non-negative integers. Let ℓ1(R+) be the set of all sequences with positive coordinates and finite sum. For every z = (z(1), z(2), …) ∈ ℓ1(R+), we denote the sum of all its elements by |z| = Σ_{i≥1} z(i). We also denote by ℓ1(N_0) the set of sequences with coefficients valued in N_0 and finite sum. Define Z := {z ∈ ℓ1(R+) : Σ_{i≥1} i z(i) ≤ 1} and, for n ∈ N, Z_n := {z ∈ Z : nz(i) ∈ N_0 for every i, Σ_{i≥1} i z(i) = 1}. The space Z will be equipped with the ℓ1(R+) norm. The latter definitions are motivated by partitions of n ∈ N. Recall that a partition of n ∈ N denotes an unordered sequence of integers {m(1), …, m(k)} such that Σ_{i=1}^k m(i) = n. For every i ∈ N, define z(i) = (1/n) #{k : m(k) = i}, the "frequency" of i in the partition of n. Then z = (z(1), z(2), …) is an element of Z_n. Starting with n singletons, we study the multidimensional block counting process of the general Dirichlet coalescent as a Markov process valued in Z_n (as already outlined in the introduction). Its first component denotes the frequency of singletons, its second component is the frequency of pairs, etc.
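The correspondence between a partition of n and its frequency vector z can be made concrete with a few lines of code; the helper name is ours.

```python
def frequency_vector(partition_of_n):
    """Map an (unordered) partition {m(1),...,m(k)} of n to the vector
    z with z(i) = #{k : m(k) = i} / n, the 'frequency' of blocks of
    size i, returned as a dict {i: z(i)} over the nonzero entries."""
    n = sum(partition_of_n)
    z = {}
    for m in partition_of_n:
        z[m] = z.get(m, 0) + 1.0 / n
    return z

z = frequency_vector([1, 1, 2, 3, 3])  # a partition of n = 10
# the sizes weighted by their frequencies always sum to 1,
# and |z| is the number of blocks divided by n
assert abs(sum(i * zi for i, zi in z.items()) - 1.0) < 1e-12
assert abs(sum(z.values()) - 0.5) < 1e-12  # 5 blocks out of n = 10
```

This is exactly the normalization under which the block counting process lives in Z_n.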
For any λ ∈ [0, 1], we define ψ_λ(z) := Σ_{i≥1} λ^i z(i), and for every λ = (λ_1, …, λ_K) ∈ [0, 1]^K we set ψ_λ(z) := (ψ_{λ_1}(z), …, ψ_{λ_K}(z)). We define the following set of test functions: T := {F ∘ ψ_λ : F Lipschitz on [0, 1]^K, λ ∈ [0, 1]^K, K ∈ N}.

Convergence of the rescaled partition process
We are now ready to state and comment on our main results. We start by describing the random coagulation corresponding to the jump events of the general Dirichlet Ξ-coalescent. Fix z_n ∈ Z_n, where nz_n(i) is the number of balls of size i (the size of a ball refers to the number of samples/lineages it represents). Then throw the nz_n(i) balls of size i, i ≥ 1, at random into k boxes, in such a way that the probability of falling into box j is p_j^{(k)}, and define

Λ_{k,n}(z_n)(ℓ) := (1/n) #{j ≤ k : sum of the sizes of the balls falling in box j is ℓ}. (4)

Note that, for ℓ > n, Λ_{k,n}(z_n)(ℓ) = 0. By a slight abuse of notation, we define the (random) operator Λ_{k,n} acting on Z_n such that, for every function g defined on Z_n, Λ_{k,n}(g)(z_n) := g(Λ_{k,n}(z_n)). Thanks to this notation, we can define the infinitesimal generator of the Z_n-valued process (µ^n_t; t ≥ 0) defined in (2), acting on measurable and bounded functions f : Z_n → R; see (5). Before diving into technicalities, let us first motivate the coming results. Assume that k, n → ∞ with k/n → x ∈ (0, ∞), i.e., a large number of balls (n) and boxes (k) of the same order. Under this restriction, if an event involving k boxes occurs, the number of balls of size i falling in box 1 is well approximated by a Poisson random variable with parameter Γz_n(i)/x, where Γ := w_1/E(w_1).
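The random coagulation behind Λ_{k,n} can be sketched as follows. The sampler below uses exponential weights as an illustrative choice of m, and the 1/n rescaling of the box counts is omitted so that integer counts can be checked directly; names are ours.

```python
import random
from collections import Counter

def lambda_event(z_n, n, k, rng):
    """One realisation of the coagulation behind Lambda_{k,n}(z_n):
    throw n*z_n(i) balls of size i into k boxes whose probabilities
    p_j = w_j / s_k come from i.i.d. weights (exponential here), and
    count, for each ell, the boxes whose total ball size is ell."""
    w = [rng.expovariate(1.0) for _ in range(k)]
    s_k = sum(w)
    cum, acc = [], 0.0
    for wi in w:
        acc += wi / s_k
        cum.append(acc)
    cum[-1] = 1.0  # guard against floating point round-off
    box_total = [0] * k
    for i, zi in z_n.items():
        for _ in range(round(n * zi)):
            u = rng.random()
            j = next(j for j, c in enumerate(cum) if u <= c)
            box_total[j] += i
    return Counter(t for t in box_total if t > 0)

rng = random.Random(7)
out = lambda_event({1: 1.0}, n=50, k=20, rng=rng)  # start from 50 singletons
assert sum(ell * cnt for ell, cnt in out.items()) == 50  # mass conservation
```

Dividing the counts by n would recover an element of Z_n, matching the normalization of (4).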
Further, since the number of balls/boxes is large, the total number of boxes containing exactly r balls of size i should be well approximated by k times the probability that a Poisson random variable with parameter Γz_n(i)/x equals r. By a similar heuristic, if k, n → ∞ with k/n → x ∈ (0, ∞), we expect Λ_{k,n}(z_n)(ℓ) to be close to C_x(z_n)(ℓ), where the expectation defining C_x is taken with respect to the random variable Γ. This justifies the limit operator introduced later on in (6).

A martingale problem
We now define the martingale problem, associated with a continuous coagulation operator, described as follows. Let x > 0 and define C_x : Z → Z such that its ℓth coordinate is given by

C_x(z)(ℓ) := x E[P(Σ_{i≥1} i N(i) = ℓ | Γ)], ℓ ∈ N, (6)

where, conditionally on Γ, the N(i)'s are independent Poisson random variables with respective parameters Γz(i)/x. From the heuristics of the previous section, C_x(z_n)(ℓ) is a natural candidate to approximate Λ_{k,n}(z_n)(ℓ). As for Λ_{k,n}, we define the operator C_x on functions on Z such that, for every bounded and measurable function g on Z, C_x(g)(z) := g(C_x(z)). We will show in due time that C_x(z) ∈ Z; see Proposition 2.
Theorem 1 (Uniqueness of the martingale problem). For every z ∈ Z, there exists a unique càdlàg process (µ_t; t ≥ 0) valued in Z with µ_0 = z such that, for every f ∈ T, the function t ↦ f(µ_t) − ∫_0^t Af(µ_s) ds is a martingale. (8) Theorem 1 characterizes the limiting process appearing in the following result.
Theorem 2. Suppose that Assumptions 1 hold. If µ^n_0 = z_n → z ∈ Z, then for every T > 0, (µ^n_t; t ∈ [0, T]) converges in distribution to (µ_t; t ∈ [0, T]), where the process (µ_t; t ≥ 0) is the unique solution to the martingale problem (8) with initial condition z.

Self-similarity
The second part of this paper is devoted to the study of the limiting process (µ_t; t ≥ 0) characterized in Theorem 1. We prove that it is an infinite dimensional self-similar process with negative index β := α − 1 ∈ (−1, 0) (Proposition 4). We also characterize its infinite dimensional Lamperti-Kiu transform.
The fact that (µ_t; t ≥ 0) is self-similar is inherited from the regular tail behavior of R(k) (which is reflected by the factor x^{−α} in the generator, see (7)), together with the fact that for every positive constant γ and every ℓ ∈ N, γC_x(z)(ℓ) = C_{γx}(γz)(ℓ) (see (6)).
To characterize this transformation, first consider the limiting block counting process (|µ_t|; t ≥ 0). According to Proposition 4, it is a non-increasing, positive self-similar Markov process of index β. The standard Lamperti transform tells us that such a process is identical in law to the exponential of a time-changed subordinator. Recall (A_t; t > 0) and (ξ_t; t > 0) defined in (3). Then (ξ_t; t ≥ 0) is a subordinator and (|µ_t|; t ≥ 0) can be recovered from it by inverting the time change. Let us now turn to the infinite dimensional self-similar process (µ_t; t ≥ 0). Let S := {z ∈ ℓ1(R+) : |z| = 1} be the unit sphere in ℓ1(R+). The idea of the infinite dimensional Lamperti-Kiu transform is to decompose the process into its "radial part" (|µ_t|; t ≥ 0) (the block counting process) and its "spherical part", which encodes the evolution of the asymptotic frequencies of singletons, pairs, etc. In the spirit of the one-dimensional case, the process can be related to a time-changed Markov additive process (MAP, see [15]). Theorem 3 is a natural extension of Theorem 2.3 in [1], which was established in finite dimension.

Site Frequency Spectrum
The third part of this work is devoted to the asymptotics of the SFS of the family of general Dirichlet Ξ-coalescents, in the limit of large n. Consider the infinite sites model, where it is assumed that mutations occur according to a Poisson point process of intensity r > 0 on the coalescent tree and that each new mutation falls at a new site, so that all the mutations can be observed in the genetic data. Define the rescaled SFS F^n accordingly. Under the infinite sites model, there is a very close relation between the SFS and the block counting process of the coalescent tree. More precisely, conditional on the coalescent, the number of segregating mutations affecting i individuals of the sample is a Poisson random variable with parameter r ∫_0^{T_n} μ^n_s(i) ds, where T_n denotes the time to the most recent common ancestor (the height) of the coalescent tree. Theorem 4. Let (ξ_t, θ_t) be defined as in Theorem 3. Then the rescaled SFS F^n converges to a limit expressed as a functional of (ξ_t, θ_t), where the convergence is meant in the weak sense with respect to the ℓ1(R+) topology. In particular, the (rescaled) total number of mutations |F^n| is asymptotically described by the exponential functional of a subordinator [11]. Observe that a similar rescaling, in n^α with α < 1, appears for the lower part of the spectrum (small values of i) of Beta-coalescents coming down from infinity [5], although α was used in a different parametrization there. Also note that, in most coalescent models, the rescaling order is not the same along the whole vector. As an example, four different renormalizations are listed in the study of the SFS of the Bolthausen-Sznitman coalescent [29].
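Under the infinite sites model described above, the SFS can be sampled from a trajectory of the block counts. The sketch below uses a piecewise-constant toy trajectory; the encoding and the function names are our own illustrative choices.

```python
import math
import random

def sample_sfs(trajectory, rate, rng):
    """Sample the site frequency spectrum under the infinite sites
    model: conditional on the coalescent, the number of mutations
    carried by i individuals is Poisson with mean
    rate * integral_0^{T_n} mu_s(i) ds.
    `trajectory` is a list of (duration, {i: number of blocks of
    size i}) pieces, a piecewise-constant encoding of the block
    counting process up to the common ancestor."""
    def poisson(mean):
        # simple Knuth sampler, adequate for moderate means
        L, p, x = math.exp(-mean), 1.0, 0
        while True:
            p *= rng.random()
            if p <= L:
                return x
            x += 1
    area = {}
    for dt, mu in trajectory:
        for i, cnt in mu.items():
            area[i] = area.get(i, 0.0) + dt * cnt
    return {i: poisson(rate * a) for i, a in area.items()}

rng = random.Random(3)
# toy tree: 3 singletons during time 1.0, then one pair and one
# singleton during time 0.5, after which the common ancestor is reached
traj = [(1.0, {1: 3}), (0.5, {1: 1, 2: 1})]
sfs = sample_sfs(traj, rate=2.0, rng=rng)
assert set(sfs) <= {1, 2}
assert all(v >= 0 for v in sfs.values())
```

The dictionary `area` plays the role of the branch-length functional ∫_0^{T_n} μ^n_s(i) ds appearing in the text.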

Outline of the paper
The rest of the paper is organized as follows. In Section 3 we use Stein's method to derive bounds on the total variation distance between vectors obtained by throwing balls into urns and their Poisson approximations. These results are used in Section 4 to prove the convergence of the generator of the multidimensional block counting process (µ^n_t; t ≥ 0) (defined in (5)) to the generator of the limiting process (defined in (7)). Section 5 is devoted to the study of the martingale problem (8). Before proving the uniqueness of its solution (Theorem 1), we analyze the coagulation operator C_x (some additional technical results can be found in Appendix B). In Section 6, we prove the convergence of (µ^n_t; t ≥ 0) to the unique solution of the martingale problem (Theorem 2). In Section 7, we prove that the limiting process is self-similar and we characterize its Lamperti-Kiu transform (Theorem 3). We also provide an additional representation of the process using stochastic flows. Finally, in Section 8, we study some asymptotics of the branch lengths which allow us to prove Theorem 4. Appendix A contains some moment estimates on the mass partition components p_j^{(k)} (defined in (1)) that are used in several proofs.

Urn estimates
Let E be a discrete space equipped with the usual σ-field F generated by the singletons. Recall that the total variation distance between two measures ν_1, ν_2 on E is given by d_TV(ν_1, ν_2) := sup_{A ∈ F} |ν_1(A) − ν_2(A)|. For a random variable X, we denote by L(X) its law.
In this section we recall and establish some bounds on the total variation distance between binomial random variables (and vectors obtained by throwing balls into urns) and their Poisson approximations. These results are mainly obtained using Stein's method [40].

Indistinguishable balls
We start by considering n indistinguishable balls that are allocated at random to k urns. For i ∈ [k], let p_i be the probability of being allocated to the ith urn, and let X_i be the number of balls allocated to urn i, so that X_i has a binomial distribution with parameters n and p_i. Lemma 1. Let y > 0 and let Y be a Poisson distributed random variable with parameter yn/k. Then the total variation distance between L(X_1) and L(Y) can be bounded explicitly. Proof. Let W be a Poisson random variable with parameter p_1 n. By the triangle inequality, d_TV(L(X_1), L(Y)) ≤ d_TV(L(X_1), L(W)) + d_TV(L(W), L(Y)). For the first term on the RHS we use the celebrated Chen-Stein inequality for the total variation distance between binomial and Poisson random variables (see for example Theorem 4.6 in [40]). For the second term we use an inequality for the total variation distance between Poisson random variables with different means, which can be found in equation (5) of [39]. Now we consider balls that are allocated to two different urns.
Lemma 2. Let (W_1, W_2) be a pair of independent Poisson distributed random variables with respective parameters p_1 n and p_2 n. Then the total variation distance between L(X_1, X_2) and L(W_1, W_2) can be bounded explicitly. Proof. Our argument relies on the following observations: 1. Z := X_1 + X_2 follows a binomial distribution with parameters (n, p_1 + p_2). 2. Conditionally on Z = z, the pair (X_1, X_2) follows a multinomial law with parameters (z, p_1/(p_1 + p_2), p_2/(p_1 + p_2)).
Analogously, we can consider a pair of random variables constructed as follows. Let W be a Poisson random variable with parameter (p_1 + p_2)n. Conditionally on W = z, (W_1, W_2) has the same multinomial law with parameters (z, p_1/(p_1 + p_2), p_2/(p_1 + p_2)). For z ∈ N_0, denote by B_z a binomial random variable with parameters (z, p_1/(p_1 + p_2)). Summing over the possible values i, j ≥ 0 and conditioning on the values of Z and W, the comparison reduces to the total variation distance between L(Z) and L(W). So, using again the Chen-Stein inequality (Theorem 4.6 in [40]), we conclude.

Balls with distinct sizes
In this section we consider an urn problem where balls are distinguishable by their sizes. We start with a general result.

Coagulation operators defined from urn problems
Again, we fix ℓ ∈ N and consider balls with ℓ + 1 distinct sizes. Let N = (N_1, …, N_{ℓ+1}), where N_i denotes the number of balls of size i. We allocate these balls at random into k urns, such that the probabilities of falling in the different urns are given by p = (p_1, …, p_k). We define a random coagulation of N by considering that balls assigned to the same urn are merged into one ball whose size is the sum of their sizes, i.e., such that for all m ∈ N, C_p(N)(m) = #{j ≤ k : sum of the sizes of the balls falling in box j is m}.
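A direct simulation makes the definition of C_p(N) concrete; the sizes, counts, and probability vector below are arbitrary illustrative values.

```python
import random
from collections import Counter

def C_p(N, p, rng):
    """Random coagulation C_p(N): allocate N[i-1] balls of size i to
    len(p) urns with probabilities p; balls sharing an urn merge into
    one ball whose size is the sum, and C_p(N)(m) counts the urns
    whose merged ball has size m."""
    k = len(p)
    cum, acc = [], 0.0
    for pj in p:
        acc += pj
        cum.append(acc)
    cum[-1] = 1.0  # guard against floating point round-off
    urn_mass = [0] * k
    for i, count in enumerate(N, start=1):
        for _ in range(count):
            u = rng.random()
            j = next(j for j, c in enumerate(cum) if u <= c)
            urn_mass[j] += i
    return Counter(m for m in urn_mass if m > 0)

rng = random.Random(11)
N = [4, 2, 1]  # four size-1 balls, two size-2 balls, one size-3 ball
out = C_p(N, p=[0.25, 0.25, 0.25, 0.25], rng=rng)
total = sum(m * c for m, c in out.items())
assert total == 4 * 1 + 2 * 2 + 1 * 3  # total mass is conserved
```

Mass conservation holds pathwise, whereas the number of balls can only decrease, which is the quantity controlled in Lemma 4.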
Lemma 4. Fix a probability vector p = (p_1, …, p_k) ∈ [0, 1]^k and consider the random coagulation associated with p. Then the bounds (11) and (12) hold almost surely. Proof. Let us first prove (11). The difference considered there is bounded by the number of urns containing at least two balls, which is in turn at most the total number of balls that are lost in the coagulation. This completes the proof of (11). Now, let us turn to the proof of (12). For x ∈ (0, 1) we have log(1 − x) ≤ −x, hence 1 − x ≤ e^{−x}. Recall that |C_p(N)| is the number of non-empty boxes when assigning |N| balls to k urns with probabilities (p_1, …, p_k). Using the previous inequality and observing that Σ_{j=1}^k p_j = 1 yields the desired result.

Sequence of urns. Convergence of the generator
Recall from Section 2 the continuous generator A defined in (7) and the discrete generator A_n defined in (5). We use the urn estimates obtained in Section 3 to prove the following result.

First moment estimate
Recall the notation C_x from (6) and Λ_{k,n} from (4). The main objective of this section is a careful justification of the approximation of Λ_{k,n}(z_n)(ℓ) by C_{k/n}(z_n)(ℓ), as suggested by the heuristics of Section 2.2, together with a "rate of convergence" that will be needed to prove Proposition 1.
Lemma 5. For every ℓ ∈ N and z_n ∈ Z_n, the two inequalities below hold, where we recall that p_1^{(k)} = w_1/s_k and Γ = w_1/E(w_1), so that the random variables are coupled through the same w_1.
Proof. We start by proving the first inequality. Fix ℓ ∈ N. For i ∈ [ℓ], define N_i = nz_n(i), and set N_{ℓ+1} := Σ_{i>ℓ} nz_n(i). In terms of the urn problem of Section 3, N_i is the number of balls of size i, for i ∈ [ℓ], and N_{ℓ+1} is the number of balls of size strictly larger than ℓ. Let us now consider a partition c ∈ φ^{−1}(ℓ), i.e., a vector c such that Σ_i i c(i) = ℓ. An urn containing c(i) balls of size i for each i ∈ [ℓ], and no larger ball, corresponds to the formation of a new block of size ℓ. Let B(c) denote the number of urns containing balls given by the partition c (to ease the notation we do not indicate the dependence on z_n and k). We have Λ_{k,n}(z_n)(ℓ) = (1/n) Σ_{c ∈ φ^{−1}(ℓ)} B(c). Mirroring the notation of Section 3.2, consider a vector of r.v.'s (X^{(1)}, …, X^{(ℓ+1)}) such that, conditional on p_1^{(k)}, the entries are independent and X^{(i)} is distributed as a binomial r.v. with parameters (N_i, p_1^{(k)}). By exchangeability of the boxes, we have E(B(c)) = k P(b_1(c) = 1), where b_1(c) is the indicator that X^{(i)} = c(i) for every i ∈ [ℓ] and that X^{(ℓ+1)} = 0.
The result follows by a direct application of Corollary 1 after conditioning on p_1^{(k)} and Γ. We now prove the second inequality. Mirroring the notation of Section 3.1, we consider the random variable X_1 which, conditional on p_1^{(k)}, is binomial with parameters (n|z_n|, p_1^{(k)}); the result then follows from Lemma 1.

Second moment estimate
The aim of this section is to bound the variance of the operator Λ_{k,n}(z_n)(ℓ).
Lemma 6. For every ℓ ∈ N, there exists a constant C > 0 such that, for every z_n ∈ Z_n, Var(Λ_{k,n}(z_n)(ℓ)) ≤ C(1/n + h(k)), where h is a function of k alone (with no dependence on n or z_n) that goes to 0 as k → ∞. Further, the same property holds for Var(|Λ_{k,n}(z_n)|).
Proof. We start by proving the inequality for Var(Λ_{k,n}(z_n)(ℓ)). Fix ℓ ∈ N. In the following, we write N_i = nz_n(i). We consider the vector (X̄^{(1)}, …, X̄^{(ℓ+1)}) whose entries X̄^{(i)} = (X^{(i)}_1, X^{(i)}_2) are, conditional on (p_1^{(k)}, p_2^{(k)}), independent random vectors such that X^{(i)}_j is binomial with parameters (N_i, p_j^{(k)}). Analogously to Lemma 5, we define b_j(c) (with j = 1, 2) as the indicator that X^{(i)}_j = c(i) for i ∈ [ℓ] and that X^{(ℓ+1)}_j = 0. Adapting the notation of the proof of Lemma 5 and using again the exchangeability between urns, we obtain a decomposition of the variance into two terms; see (13). Step 1. We start by considering the first term in (13). Define (W̄^{(1)}, …, W̄^{(ℓ+1)}) such that the entries W̄^{(i)} = (W^{(i)}_1, W^{(i)}_2) are, conditional on (p_1^{(k)}, p_2^{(k)}), independent random vectors of Poisson random variables, and decompose the resulting difference into three terms as in (14); for the last term of the second inequality, we used the fact that the product function is Lipschitz on bounded sets. We now consider each of the terms on the RHS of (14) separately.
We start with the third term. The result follows from the second item of Corollary 2 and point (ii) of Proposition 6 (in Appendix A). More precisely, as in the proof of Lemma 5, we bound the total variation distance between L(X^{(1)}_1, …, X^{(ℓ+1)}_1) and L(W^{(1)}_1, …, W^{(ℓ+1)}_1), where C is a positive constant.
We can then apply the first item of Corollary 2 to obtain the corresponding bound, where C is a positive constant and we used the fact that E((kp_1^{(k)})^2) is finite by point (ii) of Proposition 6 (in Appendix A).
Finally, we deal with the second term of (14), which involves the covariance of the conditional expectations E(d_1(c_1) | p_1^{(k)}, p_2^{(k)}) and E(d_2(c_2) | p_1^{(k)}, p_2^{(k)}), where g_{v,c} and h are defined as in Lemma 12; the inequality follows from that lemma. Combining the three inequalities (15), (16) and (17), there exists a constant C such that the first term on the RHS of (13) can be bounded by C(1/n + h(k)), where h(k) → 0 as k → ∞.
Step 2. Now we consider the second term in (13). We use the fact that P(b_1(c_1) = 1) is bounded from above by the probability of having at least one ball in the first box. So the second term in (13) can be bounded by a term of order 1/n. This completes the proof of the first inequality of Lemma 6.
We now prove the inequality for Var(|Λ_{k,n}(z_n)|). We write |Λ_{k,n}(z_n)| in terms of B_0, the number of empty boxes. The proof follows the same steps as the proof of the first inequality. Let b_{i,0} be the indicator that box i is empty. In the first step, analogously to (14), we can write a decomposition where d_{i,0}, i ∈ {1, 2}, is the indicator associated with a Poisson r.v. with parameter n|z_n|p_i^{(k)} and, conditional on (p_1^{(k)}, p_2^{(k)}), d_{1,0} and d_{2,0} are independent. For the first term we use Lemma 2. For the second term, we use a bound similar to equation (17), where g_{v,c_i}, i = 1, 2, is replaced by g_{v,c_0} defined by g_{v,c_0}(x) := exp(−v(ℓ + 1)x), which has Lipschitz constant v(ℓ + 1) = n/k; the inequality then follows from Lemma 11 instead of Lemma 12. For the third term we use the Chen-Stein inequality. The second step is analogous to the second step in the proof of the first inequality.

Convergence of the generators
The previous sections provide the main ingredients to prove Proposition 1. Before writing this proof, we still need one preliminary result.
Proof. Since f ∈ T, there exist a Lipschitz function F and λ ∈ [0, 1]^K such that f = F ∘ ψ_λ. As a consequence, there exists C > 0 (depending only on the choice of F) such that the quantity of interest is controlled, up to the constant C, by the corresponding increments of ψ_λ. In addition, if λ < 1, we conclude using the first part of Lemma 4. If λ = 1, the quantity ψ_λ(z_n) − ψ_λ(Λ_{k,n}(z_n)) is the (rescaled) number of blocks that are lost in the coalescence event; thus, we can apply (12) (with N = nz_n) and choose the constant accordingly. Notice that in the last step we used the exchangeability of the vector (p_1^{(k)}, …, p_k^{(k)}) and that the bound does not depend on z_n. By Assumptions 1 and by point (ii) of Proposition 6 (in Appendix A), E((kp_1^{(k)})^2) converges to E(w_1^2)/E(w_1)^2, which yields the desired result.
As a consequence, we get a result that will be useful later on to obtain dominated convergence.
The second term on the RHS is bounded using Lemma 7. We proceed by taking successive limits, first as n → ∞ and then as A → ∞. By Lemma 7, the second term on the RHS vanishes, and it remains to show that the limit in (18) holds, where the expectation is taken coordinatewise. The rest of the proof is decomposed into three steps. In the first one, we show that the first term converges to Af(z). In the second and third ones, we show that the second and third terms on the RHS vanish. To do so, we use the first and second moment estimates derived in this section.
Step 1. We decompose the first term in (18) into two parts. For the first part, we note that C_x(f)(z) is continuous in x (this can be shown by a standard domination argument), which implies the convergence of the corresponding Riemann sums. We now prove that the second part converges to 0. Since f ∈ T, there exist a Lipschitz function F and λ ∈ [0, 1]^K such that f = F ∘ ψ_λ, so there exists C > 0 controlling the corresponding increments. It will be shown in Proposition 2 that (21) holds for λ ∈ [0, 1/4), where |z| − ψ_λ(z) ≥ 0. Since the exponential function is Lipschitz on (−∞, 0), there exists a constant B bounding the resulting expression. This bound is independent of k and goes to 0 as n → ∞, which completes the proof of this step.
Step 2. We prove that the absolute value of the second term in (18) converges to 0. Since f ∈ T, the problem boils down to proving the corresponding limit for every λ ∈ [0, 1]. By Assumptions 1, and since E(kp_1^{(k)}) is bounded, the first contribution vanishes. By point (ii) of Proposition 6 in the Appendix, E(|kp_1^{(k)} − Γ|) → 0 and, by a similar integral-sum comparison, J_{n,A} → 0 as n → ∞.
Step 3. Finally, we prove that the term on the third line of (18) converges to 0. As in the previous step, it is enough to prove the claim for every λ < 1. We start with the first limit. By applying successively the Cauchy-Schwarz and Jensen inequalities, it is enough to prove that (19) holds for every ℓ_0. Using the first item of Lemma 6, for every ℓ_0 there exist a constant C and a function h(k) → 0 such that the variance bound holds for all k ≤ An; it is easy to deduce (19) from there. The second limit can be shown along the same lines.
5 Martingale problem. Proof of Theorem 1

The coagulation operator
In this section, we study some properties of the coagulation operator C x defined in (6).
Our results will both be based on the following interpretation of the operator C_x. Condition on a realization of the random variable Γ = w_1/E(w_1) and consider the sequence of random variables N_{x,z} := (N_{x,z}(i); i ∈ N) such that, conditional on Γ, the N_{x,z}(i)'s are independent and Poisson distributed with respective parameters Γz(i)/x. Define C_1(N_{x,z}) as the random vector such that C_1(N_{x,z})(ℓ) = 1_{{Σ_i i N_{x,z}(i) = ℓ}}. Using the notation of Lemma 4, C_1(N_{x,z}) can be seen as the trivial coagulation operator associated with a single urn, applied to N_{x,z}. The relation (20) below will be useful for the next results. Proposition 2. Let x > 0 and z ∈ Z. The vector C_x(z) is in Z and (21) holds for every λ ∈ [0, 1]. Proof. We first prove that C_x(z) is in Z. From (20) and Lemma 13 (in Appendix B), for ℓ ∈ N, we obtain an expression that coincides with the ℓth coordinate of C_x(z) in (6). In order to prove (21), it remains to show that the Maclaurin expansion of ρ converges to ρ pointwise in a neighborhood of 0; the result for λ ∈ [0, 1] is then obtained by standard analytic continuation. To do so, we use Taylor's theorem and prove that the remainder R_ℓ(λ) converges to 0. Let δ > 0. Using Lemma 14, for every λ < δ the remainder can be bounded by a quantity that converges to 0 as ℓ → ∞, for δ small enough.
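This Poisson interpretation lends itself to Monte Carlo: conditionally on Γ, draw independent Poisson counts N_{x,z}(i) with parameters Γz(i)/x and record the total size Σ_i i N_{x,z}(i). The normalization C_x(z)(ℓ) ≈ x P(Σ_i i N_{x,z}(i) = ℓ) used below is our reading of the heuristics of Section 2.2 and should be checked against (6); Γ ≡ 1 (deterministic) is taken for the test case, and all names are illustrative.

```python
import math
import random

def C_x_mc(z, x, samples, rng, gamma_sampler=lambda rng: 1.0):
    """Monte Carlo sketch of C_x(z): conditionally on Gamma, draw
    independent Poisson counts N(i) with mean Gamma*z[i]/x, record
    ell = sum_i i*N(i), and return x times the empirical frequency
    of each positive ell."""
    def poisson(mean):
        # simple Knuth sampler, adequate for small means
        L, p, count = math.exp(-mean), 1.0, 0
        while True:
            p *= rng.random()
            if p <= L:
                return count
            count += 1
    hist = {}
    for _ in range(samples):
        g = gamma_sampler(rng)
        ell = sum(i * poisson(g * zi / x) for i, zi in z.items())
        if ell > 0:
            hist[ell] = hist.get(ell, 0) + 1
    return {ell: x * cnt / samples for ell, cnt in hist.items()}

rng = random.Random(5)
approx = C_x_mc({1: 1.0}, x=1.0, samples=20000, rng=rng)
# with z = delta_1 and Gamma = 1, the total size is Poisson(1), so the
# ell = 1 coordinate should be close to e^{-1}
assert abs(approx[1] - math.exp(-1)) < 0.02
```

Replacing `gamma_sampler` by a draw of w_1/E(w_1) recovers the expectation with respect to Γ appearing in the definition of C_x.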
The next result is useful to study the integrability of the generator A.
Lemma 8. For every z ∈ Z we have

Proof. Let x > 0 and condition on a realization of Γ. From the definition of our trivial coagulation operator, we obtain the desired result for λ = 1. Combined with (22), this yields the result for λ < 1 by summing over ℓ.

Martingale problem
Proof of Theorem 1. Lemma 8 ensures that the integrand in the operator A is integrable at ∞ with respect to x. This, together with the fact that x → x −α is integrable at 0, shows that the operator A is well defined. We now proceed in three steps.
Step 1. Let (µ t ; t ≥ 0) be a solution to the martingale problem. Fix K ∈ N and λ ∈ {1} × [0, 1) K−1 . Define the projected process (y λ t ; t ≥ 0) := (ψ λ (µ t ); t ≥ 0). In Step 1, we prove the uniqueness in law of the projected process. Notice that, since λ 1 = 1, the first coordinate of y λ t corresponds to |µ t |. Let B be the operator acting on C 2 ([0, 1] K ) such that for every y = (y 1 , . . ., y K ), where exp( u) is the vector with coordinates (exp(u i )) K i=1 , exp( u) − 1 is the vector with coordinates (exp(u i ) − 1) K i=1 , and the expected value is taken w.r.t. Γ. It is straightforward to see that Proposition 2 implies that (y λ t ; t ≥ 0) satisfies this martingale problem. To conclude, we now show that the solution to this problem is unique.
According to Theorem 5.1 in [3], we need to check that for every λ ∈ {1} × [0, 1] K−1 and every measurable set B ⊂ R \ {0}, the function g is continuous and bounded. By a standard continuity under the integral theorem, this boils down to proving that for every x ∈ R \ {0}, z → g(z, x) is continuous, and that there exists a function h satisfying ∫ B h(x)dx < ∞ such that for every z ∈ Z, |g(z, x)| ≤ h(x). First, observe that z → C x ψ λ (z) is continuous. Since C x (z) is defined as an expectation with respect to Γ (see (6)), we use again a standard continuity under the integral theorem, noticing that the quantity inside the expectation is bounded uniformly by x. This implies the continuity of z → ψ λ (z) − C x ψ λ (z) 2 on (0, ∞). The continuity of z → g(z, x) follows from there. The existence of the upper bound h follows from two observations which, combined with Lemma 8, imply the existence of a constant C such that the required bound holds for every z ∈ Z.

Step 2. Let us study the uniqueness of the solution to our martingale problem. Fix t 1 < • • • < t n and consider the multidimensional process Z : λ → (ψ λ (µ t 1 ), . . ., ψ λ (µ t n )) on [0, 1] (the "time" parameter is now λ). The previous step shows that the finite dimensional distributions of Z are uniquely determined. Since |µ t | < 1, the power series λ → ψ λ (µ t ) has radius of convergence at least 1. Finally, we can differentiate ψ λ (µ t ) under the sum at 0 infinitely many times to recover µ t from its moment generating function. This shows that the finite dimensional distributions of µ t are uniquely determined.
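The inversion step in Step 2 — recovering µ t from the generating function by repeated differentiation at 0 — can be checked on a toy finitely supported measure. This is a minimal sketch with exact rational coefficients; the helper names are ours, not from the paper.

```python
from fractions import Fraction
from math import factorial

def poly_deriv(coeffs):
    """Formal derivative of the polynomial sum_i coeffs[i] * lam**i."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def recover_measure(coeffs, K):
    """Recover z(i), i < K, from psi_lambda(z) = sum_i z(i) * lam**i via
    z(i) = psi^{(i)}(0) / i!  (differentiation under the sum at 0)."""
    out, p = [], list(coeffs)
    for i in range(K):
        out.append((p[0] if p else Fraction(0)) / factorial(i))
        p = poly_deriv(p)
    return out

# a toy sub-probability measure: z(1) = 3/10, z(2) = 1/5, z(3) = 1/10, |z| < 1
z = [Fraction(0), Fraction(3, 10), Fraction(1, 5), Fraction(1, 10)]
assert recover_measure(z, 4) == z  # the measure is recovered exactly
```

For a finitely supported z the generating function is a polynomial, so the derivatives at 0 determine all the coefficients; in the proof the same identity is applied term by term to the series.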
Step 3. The existence of a solution follows from our convergence result (Theorem 2).
6 Convergence to the limiting process. Proof of Theorem 2

In this section, we prove convergence in D([0, T ], Z) equipped with the Skorokhod M 1 topology. The proof is based on a useful characterization of tightness in M 1 (see Theorem 12.12.3 and Remark 12.3.2 in [45]). We work with M 1 instead of the more commonly used J 1 because, as far as we know, it is cumbersome to apply similar arguments for the J 1 topology.
Proof. Let us define the function s, which we know to be continuous. We consider the process μ n := s(µ n ). It is sufficient to prove tightness of μ n . Observe that every entry of μ n is decreasing. We apply Theorem 12.12.3 in [45]. We need to check that: (i) for each ε > 0, there exists c such that the first condition holds; (ii) for each ε > 0 and η > 0, there exists δ such that the second condition holds. Using Corollary 3 and the fact that t 3 − t 1 ≤ δ, this quantity tends to 0 as δ → 0, which completes the proof.
In the following, we denote by µ any subsequential limit of (µ n ; n ∈ N) in D([0, T ], Z). It remains to prove that µ is the (unique) solution to the martingale problem. We start by showing that the limiting process µ has no fixed point of discontinuity. We first let n → ∞ (at fixed p). As a consequence, ν t − ν t p → 0 as p → ∞.
Let us now consider the functions (ν K t := E( Σ K i=1 µ t (i)); t ≥ 0), which are also non-increasing and valued in [0, 1]. By (11) in Lemma 4, we can apply the same reasoning as above. This implies that µ t p converges to µ t in distribution coordinatewise. By Scheffé's lemma, µ t p converges to µ t in distribution in ℓ 1 (R + ).
The proof for t p ↑ t follows along the same lines.
Proof of Theorem 2. We need to show that the process (µ t ; t ≥ 0) is the (unique) solution to the martingale problem. Let f ∈ T . Let p ∈ N. Let h 1 , . . ., h p be continuous and bounded functions from ℓ 1 (R + ) to R + . Let t 1 < • • • < t p ≤ t and s ≥ 0. Recall that A n refers to the generator of the rescaled process µ n , so that it remains to prove that (23) holds for such a choice of times and test functions. Let us now consider a coupling such that µ n converges to µ a.s. in D([0, T ], Z). By virtue of Lemma 9, the times t, t + s and the t i 's are a.s. continuity points for the limiting process, so that µ n u → µ u for u ∈ {s, s + t, t 1 , . . ., t p } a.s. Further, by monotonicity of each coordinate, the set of discontinuities of the functions (|µ t |; t ≥ 0) and ( Σ ℓ i=1 µ t (i); t ≥ 0) is a.s. a (random) countable set. This implies that the set of discontinuity points of the limiting process (µ t ; t ≥ 0) has a.s. null Lebesgue measure, i.e., for every fixed t, P(µ is continuous at t) = 1. Now, by virtue of Corollary 3, we can use the bounded convergence theorem (to pass the limit inside E and the time integral). (23) then follows from Proposition 1 and the fact that a.s. the set of discontinuities of the limiting process has null Lebesgue measure.

Self-similarity
In this section, we show that the limiting process (µ t ; t ≥ 0) is a self-similar Markov process. The self-similarity property provides a natural Lamperti representation of the process, given in (9). This representation allows us to construct the process µ as the flow induced by an SDE driven by a Lévy noise, see Theorem 5.
Proof. Fix γ > 0 and consider the rescaled process µ (γ) := (γµ tγ −β ; t ≥ 0). By uniqueness of the solution of the martingale problem introduced in Theorem 1, it is sufficient to check that µ (γ) is also a solution. Fix an integer K, a function F ∈ C 2 ([0, 1] K ) and a vector λ ∈ [0, 1] K . Let F (γ) (x) := F (γx). Since (µ s ; s ≥ 0) is a solution of the martingale problem, the associated compensated process is a martingale. Next, we observe the corresponding identity for every z ∈ ℓ 1 (R + ). Changing the variables s ′ = sγ −β and x ′ = γx in the latter integral shows that the compensated process associated to µ (γ) defines a martingale for every F ∈ C 2 ([0, 1] K ), so that µ (γ) is also a solution of the martingale problem introduced in Theorem 1.
This result allows us to identify the jump measure of the subordinator ξ as follows. Consider the measure on R + defined by ρx −α dx. The jump measure of the subordinator is the pushforward of this measure by the function g (as defined above). This is a direct consequence of Theorem 5.
Proof. Let λ = (1, λ ′ ) ∈ {1} × [0, 1) K−1 . Let (µ t ; t ≥ 0) be a solution of the martingale problem defined in (8). Recall the definition of the operator B (as in the proof of Theorem 1): for every y ∈ [0, 1] K , the expected value in its definition is taken w.r.t. Γ. As in the proof of Theorem 1, y λ is the unique solution of the associated martingale problem. Define the process ( w λ t ; t ≥ 0) := (exp(− ξ t ), exp(− ξ t )x λ ′ t ; t ≥ 0). Let (τ t ; t ≥ 0) be the Lamperti change of time in (9) defined w.r.t. ξ. Since τ is the inverse time change defined in Theorem 3, we need to prove that ( w λ τ t ; t ≥ 0) = (y λ t ; t ≥ 0) in law. The strategy consists in showing that the time-changed process w λ τ t solves the same martingale problem, by applying Itô's formula in the discontinuous case (see [36]). Using the fact that there are at most ℓ 0 lineages remaining at time τ j 0 , we obtain the bound, where in the last line we used (28). It remains to show that the expectation on the RHS is finite.
In order to see that, we note that successive ℓ 0 -events are separated by independent exponential r.v.'s with the same parameter R(ℓ 0 ) > 0. Further, at any of those events, there is a strictly positive probability p to go from n lineages to a single lineage. By a simple coupling argument, one can bound from above the r.v. T ℓ 0 by Σ X i=1 e i , where the e i 's are i.i.d. exponential r.v.'s with parameter R(ℓ 0 ) and X is an independent geometric r.v. with parameter p. Since the upper bound has mean 1/(R(ℓ 0 )p) < ∞, we get E( T ℓ 0 ) < ∞.
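The coupling bound can be sanity-checked by Monte Carlo: with e i ~ Exp(R(ℓ 0 )) i.i.d. and X geometric with parameter p, Wald's identity gives E( Σ X i=1 e i ) = (1/p)(1/R(ℓ 0 )). The parameter values below are illustrative, not taken from the paper.

```python
import random

def geometric_exponential_sum(rng, rate, p):
    """One draw of sum_{i=1}^X e_i, with e_i ~ Exp(rate) i.i.d. and X the
    number of trials until the first success of a coin with parameter p."""
    total = 0.0
    while True:
        total += rng.expovariate(rate)
        if rng.random() < p:  # the event that merges all lineages at once
            return total

rng = random.Random(42)
rate, p, n = 2.0, 0.5, 200_000
mean = sum(geometric_exponential_sum(rng, rate, p) for _ in range(n)) / n
# Wald's identity: E[sum] = E[X] * E[e_1] = (1/p) * (1/rate) = 1/(rate*p)
assert abs(mean - 1.0 / (rate * p)) < 0.05
```

The empirical mean is close to 1/(R(ℓ 0 )p), as the coupling argument predicts; finiteness of this mean is all the proof needs.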
Proof of Proposition 5. We need to show that the stated limits hold for every A > 0. We start with (31). Let K ∈ N. By monotonicity, and since the limiting process (µ t ; t ≥ 0) has no fixed point of discontinuity (see Lemma 9), the RHS converges to the corresponding Riemann sum. Next, for every J, n ∈ N, the process ( Σ J i=1 µ n s (i); s ≥ 0) is also monotone in s, and the exact same argument shows convergence of Σ J i=1 ∫ A 0 µ n s (i)ds to Σ J i=1 ∫ A 0 µ s (i)ds. Finally, the proof of (31) is complete by noting that all the previous convergence statements hold jointly for every J ∈ N.
The right-hand side of the inequality does not depend on λ, and one can check that it has a finite expectation. Since ρ(λ) = E(f ◦ g(λ)) (where the expectation is taken w.r.t. Γ), by a standard differentiation under the integral theorem, ρ(λ) is infinitely differentiable and its derivatives can be computed by differentiating inside the expectation.

A t := inf{s : ∫ s 0 |µ r | β dr > t}, and ξ t := − log |µ A t |. The difference N m − C p ( N )(m) is the net gain or loss of balls of size m in the coagulation operation. This can be computed by taking the difference of the following two quantities: (a) the number of balls of size m falling in an urn where another ball is assigned (regardless of its size); (b) the number of urns with at least two balls and whose sizes add up to m. Suppose first that N m > C p ( N )(m). Then N m − C p ( N )(m) is smaller than (a) alone. Moreover, (a) is smaller than twice the total number of balls that are lost. As an illustrative example, consider the case when the N m balls are assigned in pairs to N m /2 different urns, coagulating into N m /2 balls of size 2m. Then the number of balls of size m that are lost is N m and the total number of balls that are lost is N m /2. Now suppose that N m < C p ( N )(m). In this case C p ( N )(m) − N m is less than (b).
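The accounting in the illustrative example can be checked directly. The sketch below (our own helper, not from the paper) assigns N m = 6 balls of size m in pairs to three urns and verifies that the number of size-m balls lost is at most twice the total number of balls lost.

```python
def coagulate(assignment):
    """assignment: list of (ball_size, urn_id) pairs. Balls sharing an urn
    merge into one ball whose size is the sum; returns {size: count}."""
    urn_sizes = {}
    for size, urn in assignment:
        urn_sizes[urn] = urn_sizes.get(urn, 0) + size
    out = {}
    for s in urn_sizes.values():
        out[s] = out.get(s, 0) + 1
    return out

m, Nm = 3, 6
assignment = [(m, i // 2) for i in range(Nm)]  # pairs: urns 0, 0, 1, 1, 2, 2
after = coagulate(assignment)
assert after == {2 * m: Nm // 2}               # three balls of size 2m remain

lost_of_size_m = Nm - after.get(m, 0)  # all six size-m balls are lost
total_lost = Nm - sum(after.values())  # 6 balls became 3, so 3 are lost
assert lost_of_size_m <= 2 * total_lost
```

Here the loss of size-m balls (6) equals exactly twice the total loss of balls (3), showing the factor 2 in the bound is sharp for this pairing.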

X 1 is distributed as a binomial r.v. with parameters (n|z n |, p (k) 1 ). Let b 1 be the indicator that box 1 contains at least one ball. Similarly, let Y 1 be the random variable such that, conditional on Γ, Y 1 is distributed as a Poisson r.v. with parameter Γn|z n |/k. Let d 1 be the indicator that Y 1 ≥ 1. Define d j (with j = 1, 2) as the analogous indicators. We apply Theorem 12.12.3 in [45]. Let us define the supremum norm: for every x ∈ D([0, T ], Z), ||x|| := sup 0≤t≤T ||x(t)|| = sup 0≤t≤T max i |x t (i)|.
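The binomial/Poisson coupling of b 1 and d 1 can be illustrated numerically: the occupancy probabilities P(X 1 ≥ 1) = 1 − (1 − p) n and P(Y 1 ≥ 1) = 1 − e −np agree as the per-box probability vanishes. The parameter choices below are illustrative.

```python
from math import exp

def occupied_binomial(n_balls, p):
    """P(Binomial(n_balls, p) >= 1): box 1 receives at least one of n balls."""
    return 1.0 - (1.0 - p) ** n_balls

def occupied_poisson(n_balls, p):
    """P(Poisson(n_balls * p) >= 1): the Poissonized occupancy probability."""
    return 1.0 - exp(-n_balls * p)

# the gap between the two indicators' success probabilities shrinks with n
gaps = [abs(occupied_binomial(n, 1.0 / n ** 2) - occupied_poisson(n, 1.0 / n ** 2))
        for n in (10, 100, 1000)]
assert gaps[0] > gaps[1] > gaps[2]
```

This mirrors the standard Poisson approximation of the binomial used in the coupling of (b 1 , d 1 ).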

K −1 Σ i=0 |µ i/K | for every K ∈ N. As a consequence, lim n ∫ A 0 |µ n s |ds is bounded from above by ∫ A 0 |µ s |ds. A similar argument shows the reverse bound. This proves that lim n ∫ A 0 |µ n s |ds = ∫ A 0 |µ s |ds.