Weak convergence of the scaled jump chain and number of mutations of the Kingman coalescent

The Kingman coalescent is a fundamental process in population genetics modelling the ancestry of a sample of individuals backwards in time. In this paper, in a large-sample-size regime, we study asymptotic properties of the coalescent under neutrality and a general finite-alleles mutation scheme, i.e. including both parent independent and parent dependent mutation. In particular, we consider a sequence of Markov chains that is related to the coalescent and consists of block-counting and mutation-counting components. We show that these components, suitably scaled, converge weakly to deterministic components and Poisson processes with varying intensities, respectively. Along the way, we develop a novel approach to generalise the convergence result from the parent independent to the parent dependent mutation setting. This approach is based on a change of measure and provides an alternative way to address problems in the parent dependent mutation setting, in which several crucial quantities are not known explicitly.


Introduction
The Kingman coalescent [16] is a classical stochastic process that models the genealogical tree of a sample of individuals. With the aim of including various genetic forces, generalisations of this pivotal model resulted in the establishment of other well-known models, such as the ancestral selection graph [19,23], the ancestral recombination graph [10,12], the Λ-coalescent [24,26], and the Ξ-coalescent [22,28]. Coalescent theory is also closely related to urn models. In fact, a Pólya-like urn structure can be embedded in the coalescent process by matching lineages and balls [31].
In this paper, we consider the Kingman coalescent with a finite-alleles mutation scheme evolving under neutrality (i.e. no type has a selective advantage), and study its asymptotic behaviour as the size of the sample grows to infinity. Besides the purely mathematical significance of deriving new properties of a classical model, our analysis is inspired by applied statistical problems which have recently drawn attention to large-sample-size regimes in population genetics and are briefly mentioned in the following.
Coalescent models are often used for inference, combined with Monte Carlo methods to approximate the likelihood of an observed gene configuration. In particular, importance sampling algorithms [13,29,5] and sequential Monte Carlo methods [18] have been developed based on simulating the genealogy of the sample backwards (extensions based on generalisations of the coalescent exist [30,3,11,14,4,17]). However, it is known empirically that these methods do not scale well with increasing sample size [15]. Another issue is that the coalescent approximation is only accurate when the sample size is sufficiently smaller than the effective population size. Otherwise, as explained in [1], substantial differences between the original models and the coalescent approximation can arise. Studying large-sample-size asymptotic properties of the coalescent provides a tool both for the theoretical asymptotic analysis of inference methods, which could guide their improvement, and for assessing the appropriateness of the coalescent approximation as sample sizes in modern studies in genetics grow rapidly.
The contribution of the paper is twofold. The main result (Theorem 2.1) is the convergence of a sequence of Markov chains related to the coalescent, which consists of block-counting and mutation-counting components and is described in Section 2. We show that these components, suitably scaled, converge to deterministic components and Poisson processes with varying intensities, respectively. The convergence result is first proved under the assumption of parent independent mutation (PIM), i.e. the type of the mutated offspring does not depend on the type of the parent. We then develop a novel approach, based on a change of measure and presented in Section 3, to remove the PIM assumption and generalise the convergence result. The change-of-measure approach (Theorem 3.1) constitutes the other main result of the paper. In fact, to the best of our knowledge, it provides the first tool in population genetics to transfer results that are obtained under the widespread PIM assumption to the general mutation setting, which is notoriously more difficult and often neglected in the literature because several crucial quantities are not known explicitly in this setting. Other technical challenges for the convergence proofs are addressed in Section 4 by constructing a technical framework that allows us to employ classical Ethier–Kurtz tools [7]. Section 5 contains the convergence proofs.

Outline and main result
The coalescent describes the genealogy of a sample of individuals by describing the evolution of their ancestral lineages, which coalesce and mutate as time evolves backwards from time 0, at which the sample is taken, until one lineage is left, i.e. the most recent common ancestor of the sample is reached. Here, lineages are typed, that is, types are assigned to lineages as the coalescent evolves backwards as in e.g. [13,6,9,8], rather than superimposed afterwards. The block-counting jump chain of the typed coalescent, conditioned on the initial sample, counts the number of ancestral lineages of each type over discrete time. We denote the mutation rate by θ and the mutation probability matrix by P = (P_ij)_{i,j=1}^d, which is assumed to be irreducible.
In a population at equilibrium, we take a sample of the form n y_0^(n) and consider the block-counting jump chain H^(n) of the Kingman coalescent evolving backwards in time from the sample, i.e. conditioned on H^(n)(0) = n y_0^(n), to its most recent common ancestor; the number of lineages of type i after k steps is denoted by H_i^(n)(k). Each step of the chain takes the form H^(n)(k+1) = H^(n)(k) − v, where, denoting the d-dimensional unit vector along the j-th dimension by e_j,
• v = e_i, i = 1, …, d, with probability ρ^(n)(e_i | y), which corresponds to the coalescence of two lineages of type i;
• v = e_i − e_j, i, j = 1, …, d, with probability ρ^(n)(e_i − e_j | y), which corresponds to the mutation of a lineage of type j to type i forwards in time (or, equivalently, of type i to type j backwards in time).
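The backwards dynamics just described can be sketched in code. The following minimal simulator runs the typed block-counting chain H together with the mutation counts M until the most recent common ancestor is reached; the unnormalised step weights (coalescence weight H_i(H_i − 1), mutation weight θ H_i Q_j) are a hypothetical PIM-style stand-in, not the paper's actual transition probabilities ρ^(n) of (5.1) and (A.1).

```python
import numpy as np

def simulate_jump_chain(H0, theta, Q, rng):
    """Run the typed block-counting jump chain backwards in time.

    Each step is either a coalescence of two type-i lineages (H -> H - e_i)
    or a backwards mutation of a type-i lineage to type j (H -> H - e_i + e_j),
    recorded in M[i, j]. The unnormalised step weights below are hypothetical
    PIM-style choices, used only to make the mechanism concrete."""
    d = len(H0)
    H = np.array(H0, dtype=int)
    M = np.zeros((d, d), dtype=int)
    while H.sum() > 1:
        coal = (H * (H - 1)).astype(float)          # weight of v = e_i
        mut = theta * np.outer(H, Q).astype(float)  # weight of v = e_i - e_j
        np.fill_diagonal(mut, 0.0)
        w = np.concatenate([coal, mut.ravel()])
        k = rng.choice(w.size, p=w / w.sum())
        if k < d:                                   # coalescence of type k
            H[k] -= 1
        else:                                       # backwards mutation i -> j
            i, j = divmod(k - d, d)
            H[i] -= 1
            H[j] += 1
            M[i, j] += 1
    return H, M
```

Running the chain from a large sample and rescaling k to k/n gives a Monte Carlo picture of the deterministic limit of Y^(n) and of the Poissonian behaviour of the mutation counts.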
The backwards transition probabilities ρ^(n) are generally not known in an explicit form, unless mutations are parent independent, see e.g. [13,29,5,11] for a more detailed description. Implicit and explicit expressions for ρ^(n), in the general and PIM case respectively, are provided in (A.1) in Appendix A.1 and in (5.1) in Section 5.1.
Furthermore, along with Y^(n), consider the process M^(n) which counts the number of mutations of each type occurring in the genealogy. That is, M_ij^(n)(k) is the cumulative number of mutations from type i to type j that have occurred in the genealogy during the first k steps, i.e. M_ij^(n)(k) = Σ_{l=0}^{k−1} I{H^(n)(l) − H^(n)(l+1) = e_j − e_i}.
In this paper, we study the asymptotic behaviour of the sequence Z^(n) = (Y^(n), M^(n)), as n → ∞. Note that the sequence M^(n) is not only interesting for the purpose of studying an additional asymptotic property of the coalescent, but it is also crucial because it naturally appears in our change-of-measure argument, as explained in the next sections. The main convergence result is stated in the following theorem.

Theorem 2.1. Assume that y_0^(n) → y_0 as n → ∞. Then, for all t ∈ [0, ‖y_0‖_1), as n → ∞, the sequence of processes {Z^(n)(s)}_{s∈[0,t]} converges weakly to {Z(s)}_{s∈[0,t]} = {(Y(s), M(s))}_{s∈[0,t]}, where Y is a deterministic process and M is a vector of Poisson processes with varying intensities; the limiting process is described explicitly in Section 5.2.
Recall that, by definition, converging weakly means converging in distribution in the Skorokhod space. We now provide an intuitive explanation of Theorem 2.1 and illustrate the two main challenges that need to be addressed to obtain a rigorous proof. Define the operator A^(n), associated to the Markov chain Z^(n), as in (2.1), where e_ij is the unit matrix having 1 in position ij and 0 everywhere else. Recent results on the asymptotic behaviour of ρ^(n) [9] provide the pointwise limits (2.2) of the scaled transition probabilities. Therefore, by evaluating (2.1) at y^(n), and letting y^(n) → y ∈ R_+^d as n → ∞, it is straightforward to see that the sequence of operators A^(n) converges, in some sense to be properly defined, to the operator A given in (2.3).
The operator A, defined in (2.3) above, can be proven to be, as the intuition suggests, the infinitesimal generator of the limiting process of Theorem 2.1, see Appendix A.3. In order to make the convergence of generators sketched above rigorous, and to then prove convergence of the corresponding processes, we need to define an appropriate space of functions, in the domain of the generator A, for which the convergence holds. Two main challenges arise.
The first challenge is the lack of an explicit expression for the transition probabilities ρ^(n) in the parent dependent mutation case. Even though the pointwise asymptotic behaviour (2.2) of the transition probabilities is known, this is not enough to prove convergence of generators, as is evident in Section 5.1, because additional information concerning uniform convergence would be needed. We address this issue by first proving the convergence in the PIM case, for which explicit expressions are available, and by then employing a novel approach based on a change of measure to generalise the result from the PIM setting to the general setting, as explained in detail in Section 3. It is important to point out that the mutation-counting sequence M^(n) would naturally appear through the change of measure, even if only the block-counting sequence Y^(n) were considered to begin with (equation (3.1) in Section 3). This strongly motivates the study of the asymptotic behaviour of M^(n), without which the generalisation to the parent dependent setting would not be possible.
The second challenge, which arises already in the simpler PIM setting, is that, if the components of y are allowed to be arbitrarily close to zero, the scaled transition probabilities of mutation in (2.2) explode, see (5.1) in Section 5.1. As more precisely described in Section 4, this issue can be addressed by constructing an appropriate technical framework, which relies on the definition of a suitable metric to keep components from being arbitrarily close to zero and allows us to employ classical Ethier–Kurtz tools [7].
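Each mutation-counting component of the limit is an inhomogeneous Poisson process whose intensity blows up as Y approaches Ω_0; such a process can be sampled by Lewis–Shedler thinning, sketched below. The intensity function passed in is the caller's choice, and any concrete intensity used with this sketch is purely illustrative, not the paper's λ_ij.

```python
import random

def thinning(lam, lam_max, t_end, rng):
    """Jump times of an inhomogeneous Poisson process on [0, t_end)
    with intensity lam(t) <= lam_max, by Lewis-Shedler thinning:
    candidates arrive at rate lam_max and each candidate at time t
    is kept with probability lam(t) / lam_max."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(lam_max)   # next candidate arrival
        if t >= t_end:
            return times
        if rng.random() < lam(t) / lam_max:
            times.append(t)
```

For instance, lam = lambda t: 1.0 / (1.0 - t) on [0, 0.99) with lam_max = 100.0 mimics an intensity exploding at a finite horizon, as happens here when time approaches ‖y_0‖_1.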
Sections 3 and 4 are dedicated, respectively, to the change-of-measure argument and the technical framework, while Section 5 contains the convergence proofs leading to, and including, the proof of Theorem 2.1.

From parent independent to parent dependent mutations: a novel change-of-measure approach
Suppose that the convergence of Theorem 2.1 holds when mutations are parent independent, that is, when the mutation probability matrix has identical rows, denoted by Q = (Q_i)_{i=1}^d. The goal of this section is to illustrate how to generalise the convergence result to a general setting, with mutation probability matrix P = (P_ij)_{i,j=1}^d. More precisely, using subscripts Q and P to indicate the underlying mutation probability matrices, and fixing a bounded continuous function g, we assume that the convergence holds for the processes driven by Q and we want to prove it for the processes driven by P. Our approach consists of rewriting the expectations with respect to Z_P^(n) and Z_P as expectations with respect to Z_Q^(n) and Z_Q, respectively. That is, we apply a change of measure to transform the parent dependent mutation setting into a parent independent mutation setting. We present the core idea in the following.
Consider the probability of a realisation H(0), …, H(K), K ∈ N, of H_P(0), …, H_P(K), which can be written in terms of the sampling probabilities and the forward transition probabilities: p_P(H(k)) denotes the probability of observing a sample H(k) under the unconditioned coalescent, i.e. the sampling probability, and p_P(H(k) | H(k+1)) denotes the explicitly-known forward transition probability. More precisely, p_P, which can be expressed explicitly only when mutations are parent independent, can be defined either through a recursion formula or as a multinomial draw from the stationary distribution of the corresponding Wright–Fisher diffusion, see [13] for details. The stationary distribution, which is Dirichlet in the PIM case, is also not known explicitly in the general case, but it exists as long as the mutation probability matrix is irreducible [27]. The boundary condition is assumed to be p_P(e_i) = π_{P,i}, i = 1, …, d, where π_P = (π_{P,i})_{i=1}^d is the invariant distribution of the mutation probability matrix. Despite the implicit formulation, a large-sample-size asymptotic formula for the sampling probabilities is available in general [9], which makes this rewriting useful for the asymptotic analysis. The forward transition probabilities, instead, are always explicitly known [29].
Given a configuration H_P(k+1) = H(k+1), a lineage is chosen uniformly at random, i.e. of type i with probability H_i(k+1)/‖H(k+1)‖_1. The chosen lineage is either split, corresponding to the step v = e_i, or mutated to type j, corresponding to the step v = e_j − e_i, with probabilities determined by θ and P. Note that this description clarifies the equivalence between the coalescent and the Pólya-like urn model mentioned in the introduction: according to the probabilities above, typed balls, which correspond to typed lineages, are duplicated and replaced, instead of split and mutated respectively [31].
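A one-step sketch of this forward urn makes the duplicate-or-replace mechanism explicit. The split probability (m − 1)/(m − 1 + θ), with m the current number of balls, is a standard assumption adopted here purely for illustration; the paper's exact forward transition probabilities are those referenced from [29].

```python
import random

def urn_step(counts, theta, P, rng):
    """One forward step of the Polya-like urn embedded in the coalescent:
    a ball (lineage) of colour i is drawn uniformly at random, then either
    duplicated (split of a lineage) or replaced by a ball of colour j drawn
    from row i of the mutation matrix P (mutation i -> j).
    The split probability (m - 1)/(m - 1 + theta) is an assumed form."""
    d = len(counts)
    m = sum(counts)
    i = rng.choices(range(d), weights=counts)[0]   # uniform ball, colour i
    if rng.random() < (m - 1) / (m - 1 + theta):
        counts[i] += 1                             # duplication (split)
    else:
        j = rng.choices(range(d), weights=P[i])[0]
        counts[i] -= 1                             # replacement (mutation)
        counts[j] += 1
    return counts
```

Iterating this step grows the urn forwards in time, which is the time-reversal of the coalescing dynamics of Section 2.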
It is now possible to derive the likelihood ratio, i.e. the Radon–Nikodym derivative, of the change of measure from parent dependent mutation with mutation probability matrix P to parent independent mutation with mutation probability matrix Q; we assume, without loss of generality, that Q is positive. Note that the likelihood ratio does not depend on the whole history, but only on the initial and final configurations, H(0) and H(K), and on the total number of mutations from type i to type j, Σ_{k=0}^{K−1} I{H(k) − H(k+1) = e_j − e_i}, for i, j = 1, …, d. This is the key observation for the change-of-measure approach, and it is also the reason why it is crucial for the asymptotic analysis to consider, along with the block-counting sequence, the mutation-counting sequence, which naturally appears in the likelihood ratio.
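The factorisation through the mutation counts can be verified mechanically: accumulating a per-step weight along a path gives the same value as raising the per-pair weight to the power of the corresponding mutation count. The per-step weight P_ij/Q_j for a mutation step is a schematic stand-in; the type-independent factors and the endpoint-dependent factors of the actual likelihood ratio are omitted.

```python
import numpy as np

def stepwise_ratio(path, P, Q):
    """Accumulate the mutation part of a schematic likelihood ratio
    step by step along a path H(0), ..., H(K)."""
    ratio = 1.0
    for H, Hn in zip(path, path[1:]):
        v = np.subtract(H, Hn)        # v = e_j - e_i marks a mutation i -> j
        if v.min() == -1:             # coalescence steps have v >= 0
            i, j = int(np.argmin(v)), int(np.argmax(v))
            ratio *= P[i][j] / Q[j]
    return ratio

def counted_ratio(path, P, Q):
    """The same quantity computed from the mutation-count matrix alone."""
    d = len(path[0])
    N = np.zeros((d, d), dtype=int)
    for H, Hn in zip(path, path[1:]):
        v = np.subtract(H, Hn)
        if v.min() == -1:
            N[int(np.argmin(v)), int(np.argmax(v))] += 1
    return float(np.prod([(P[i][j] / Q[j]) ** N[i, j]
                          for i in range(d) for j in range(d)]))
```

The two functions agree on every path, illustrating that the path enters the ratio only through its mutation counts.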
In the next theorem, we tailor the previous calculations to the change of measure from Z_P^(n) to Z_Q^(n), and derive similar calculations for the change of measure from Z_P to Z_Q. Note that an analogous result can be obtained for any process that is adapted to the block-counting jump chain of the Kingman coalescent. The theorem expresses the expectations under P as reweighted expectations under Q, see (3.2), and, for all t ∈ [0, ‖y_0‖_1), provides an analogous identity (3.3) for the limiting processes. Proof. By definition, {Z_P^(n)(s)}_{s∈[0,t]} is a function of n and H_P(0), …, H_P(⌊tn⌋). Thus, we can simply use (3.1) with K = ⌊tn⌋ to obtain (3.2).
Since the first components of Z_P = (Y, M_P) and Z_Q = (Y, M_Q) coincide and do not depend on the mutation probability matrix, it is sufficient to focus on the second components. Using the form of the limiting Poisson intensities, c_{P,Q}(M_Q(t)) can be interpreted as the exponential martingale associated to the change of measure that transforms the distribution of the Poisson processes {M_P(s)}_{s∈[0,t]} into the distribution of the Poisson processes {M_Q(s)}_{s∈[0,t]}, see e.g. [20]. This proves (3.3) and completes the proof.

Handling the explosion: technical framework
In this section, we construct a technical framework that allows us to address the problem caused by the scaled transition probabilities of mutation in (2.2) exploding near the boundary Ω_0 := {y = (y_1, …, y_d) : y_i = 0 for some i}. In such a framework, we can rigorously state the convergence of generators (Proposition 4.1 below) and we can show the weak convergence of the sequence Z^(n) to the process Z by relying on classical results [7], as explained briefly below and proved in detail in the next section.
The natural state space of the limiting process Z is R_+^d × N^{d^2}, equipped with the Euclidean metric. Consider instead E_1 = (0, ∞]^d, equipped with the metric ψ_1(y, y′) := ‖y^{-1} − (y′)^{-1}‖_2, where the inversion is component-wise and the inverse of ∞ is by convention 0; and consider E = E_1 × N^{d^2}, equipped with the product metric ψ := ψ_1 ⊕ ‖·‖_2.
In the new state space E_1, the roles of 0 and ∞ are reversed, whilst, in the rest of the space, the metric ψ_1 is equivalent to the Euclidean metric. In particular, compact sets are bounded away from Ω_0, and thus functions with compact support are equal to zero near Ω_0. See Appendix A.2 for more details on the metric space (E, ψ).
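A direct implementation of ψ_1, assuming (as indicated above) that it is the Euclidean distance between the component-wise inverses with 1/∞ = 0, makes the reversal of the roles of 0 and ∞ visible: points near the boundary Ω_0 are sent far away, while ∞ behaves as an ordinary point.

```python
import numpy as np

def psi1(y1, y2):
    """psi_1(y1, y2) = || y1^{-1} - y2^{-1} ||_2 on E_1 = (0, inf]^d,
    with component-wise inversion and 1/inf = 0 by convention."""
    def inv(y):
        y = np.asarray(y, dtype=float)
        return np.where(np.isinf(y), 0.0, 1.0 / y)
    return float(np.linalg.norm(inv(y1) - inv(y2)))
```

In this metric, a sequence with a component tending to 0 eventually leaves every ball, which is why compact sets are bounded away from Ω_0.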
Let C_c^∞(E) be the space of real-valued continuous functions on (E, ψ) with compact support, equipped with the supremum norm, which is analysed in detail in Appendix A.2. This is precisely the space of functions that we need for the convergence of generators, which is stated in the next proposition. Let E^(n) be the state space of Z^(n) and let η_n map any function on E to its restriction on E^(n), with value zero on Ω_0 × N^{d^2}.
Proposition 4.1. Assume parent independent mutations. Let A^(n) be the operator, defined in (2.1), associated to the Markov chain Z^(n), and let A be the infinitesimal generator, defined in (2.3), associated to the Markov process Z. Then, for all f ∈ C_c^∞(E), the convergence of generators holds, in the sense that ‖A^(n) η_n f − η_n A f‖ → 0 as n → ∞. Proposition 4.1, which is proved in Subsection 5.1, allows us to derive the convergence of the corresponding semigroups, which is proved in Section 5.2. In order to then prove weak convergence (Theorem 2.1), another technical issue needs to be addressed. In fact, as time approaches ‖y_0‖_1, Y approaches the origin, the intensities of the Poisson processes M explode, and the process Z exits its state space E in finite time. This is a classical problem, and, thanks to the technical framework constructed in this section, we can rely on classical results [7] to address the issue and finally prove weak convergence. Following [7], a new point ∆, called the cemetery point, is introduced in the state space. Let E^∆ = E ∪ {∆} be the one-point compactification of E, which is itself a metric space, see Appendix A.2 and [21] for more details, and let Z^∆ be the extension of Z, corresponding to Z before time ‖y_0‖_1, and being equal to ∆ from time ‖y_0‖_1 on. Furthermore, let ξ_n^∆ be the inclusion map from E^(n) into E^∆, sending Ω_0 × N^{d^2} to ∆.
Instead of directly proving the weak convergence of {Z^(n)(s)}_{s∈[0,t]} to {Z(s)}_{s∈[0,t]} for a fixed t ∈ [0, ‖y_0‖_1), we prove an extended convergence (4.1) of the processes on the extended state space E^∆, which then implies Theorem 2.1, as shown in Subsection 5.3.

Convergence proofs
In this section, we prove Theorem 2.1, i.e. the weak convergence of the sequence {Z^(n)(s)}_{s∈[0,t]} to the process {Z(s)}_{s∈[0,t]}. The plan for the proof is the following.
First, under the assumption of parent independent mutations,
• [Sect. 5.1] the convergence of generators (Proposition 4.1) is proved;
• [Sect. 5.2] the convergence of the corresponding semigroups is proved;
• [Sect. 5.3] the weak convergence (Theorem 2.1) is proved by showing the weak convergence of the extended processes on the extended state space (4.1), which relies on the technical framework of Section 4.
Then, in the general case of possibly parent dependent mutations,
• [Sect. 5.4] we use the change-of-measure approach (Theorem 3.1) to prove the weak convergence (Theorem 2.1).

Convergence of generators (PIM)
In this subsection, we prove Proposition 4.1. Assume the mutation probability matrix has identical rows, equal to Q. Then, the backwards transition probabilities are explicitly known, as given in (5.1), see e.g. [5] and Appendix A.1. Given f ∈ C_c^∞(E), let K = {(y, m) ∈ E : y_j ≥ δ, m_ij ≤ M, ∀ i, j = 1, …, d} be a compact set that contains the support of f, see Remark A.4 in Appendix A.2. Note that f and all of its derivatives are Lipschitz continuous with respect to ψ and bounded.
If (y, m) ∈ E^(n) ∩ K^c, then f = 0 in a neighborhood of (y, m), and the points (y, m + e_ij), (y − (1/n) e_j, m) and (y − (1/n) e_j + (1/n) e_i, m + e_ij) belong to K^c, for all n ∈ N_{>0}. Therefore, both A^(n) η_n f and η_n A f vanish at (y, m), and it remains to consider points in K. On K, it is to be shown that the difference between A^(n) η_n f and η_n A f is bounded by some function of n, which does not depend on y and m, and vanishes as n → ∞. The bound is constructed by splitting the difference into the four terms of (5.2), and a bound for each term of the sums on the right-hand side of (5.2) can be obtained as follows. For the first term, applying the mean value theorem, for some a ∈ (0, 1), and using that ρ^(n)(e_j | y) ≤ 1 and that the derivatives of f are bounded, yields a bound which vanishes as n → ∞. For the second term in (5.2), the explicit expression of ρ^(n)(e_j | y) in (5.1) yields an analogous bound involving ∂f(y, m)/∂y_j. For the third term in (5.2), since ρ^(n)(e_j − e_i | y) ≤ θQ_j/(nδ) and f is Lipschitz continuous, the term vanishes uniformly as n → ∞. The last term in (5.2) is bounded in the same way.
Since the bound for (5.2) does not depend on (y, m) and vanishes as n goes to infinity, the proof is complete.

Convergence of semigroups (PIM)
In this subsection, we prove the convergence of the semigroups associated to the generators of the previous subsection, under the assumption of parent independent mutations. Take f ∈ Ĉ(E), the space of continuous real-valued functions on (E, ψ) vanishing at infinity, see Remark A.4 for a characterisation. Let T^(n) be the semigroup associated to the Markov chain Z^(n), and let T be the semigroup associated to the Markov process Z, described in the following. Let Ω_∞ := {y = (y_1, …, y_d) : y_i = ∞ for some i}. The semigroup T moves the deterministic component along Y and samples the mutation components from Poisson distributions with marginals

P(M_ij(t) = w_ij) = (Λ_ij(t, y)^{w_ij} / w_ij!) e^{−Λ_ij(t, y)},    (5.5)

with Λ_ij(t, y) the corresponding integrated intensity. We now prove that, for all f ∈ Ĉ(E) and all t ≥ 0, the convergence of semigroups (5.6) holds. First, note that T^(n) is a linear contraction and that, as shown in Appendix A.3, {T(t)}_{t≥0} is a strongly continuous contraction semigroup associated to the generator A. Furthermore, C_c^∞(E) is a core for the generator A. This holds by Proposition 3.3 in [7, Ch. 1], since C_c^∞(E) is dense in Ĉ(E)
and A is the generator of a strongly continuous contraction semigroup. Therefore, by Theorem 6.5 in [7, Ch. 1], the convergence of semigroups (5.6) is equivalent to the convergence of generators in Proposition 4.1, which concludes the proof.
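The marginal law (5.5) above is the Poisson probability mass function evaluated at the integrated intensity; a direct transcription, with lam standing for Λ_ij(t, y), is:

```python
import math

def poisson_pmf(w, lam):
    """P(M_ij(t) = w) = lam^w e^{-lam} / w!, the marginal law (5.5),
    with lam the integrated intensity Lambda_ij(t, y)."""
    return lam ** w * math.exp(-lam) / math.factorial(w)
```

For any fixed lam, the masses sum to one over w ∈ N and the mean equals lam.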

Weak convergence (PIM)
In this subsection, we prove Theorem 2.1 under the assumption of parent independent mutations. We first prove the result for the extended processes (4.1), following [7].
The semigroup T, associated to the Markov process Z, is extended to a conservative semigroup T^∆, as in e.g. [7, Ch. 4], which is associated to the extended Markov process Z^∆, by letting (5.7) hold for f ∈ Ĉ(E^∆) = C(E^∆). In the expression above it is implied that T(t) is applied to the restriction of f to E, and that the extension takes the value f(∆) at ∆. By Lemma 2.3 in [7, Ch. 4], the extended semigroup {T^∆(t)}_{t≥0} inherits all the properties of {T(t)}_{t≥0} and additionally it is conservative. Hence, it is a Feller semigroup.
By defining the bounded linear operator η_n accordingly, the convergence of semigroups of the previous subsection can be naturally extended to all f ∈ C(E^∆) and t ≥ 0. Therefore, by Theorem 2.12 in [7, Ch. 4], there exists a Markov process Z^∆, associated to the semigroup T^∆, with sample paths in D_{E^∆}[0, ∞). Since T^∆ is the extension of the semigroup T associated to the process Z, Z^∆ coincides with Z on E. This easily implies Theorem 2.1, as shown in the following. Since Y(0) = y_0 and t < ‖y_0‖_1, the limiting process at time t is not at the cemetery point, that is, Z^∆(t) = Z(t) ∈ E. Letting for example δ = (1/2) min_{i=1,…,d} Y_i(t), which is positive, the convergence of the extended processes (4.1) implies that the probability of all components of {Y^(n)(s)}_{0≤s≤t} being larger than δ converges to 1. The paths up to time t being bounded away from Ω_0, the convergence of the extended processes (4.1), restricted to the paths up to time t, is equivalent to the convergence of the non-extended processes written in terms of the Euclidean metric.

Weak convergence (general mutation)
In this subsection, we prove Theorem 2.1 for a general mutation matrix P, knowing from the previous subsection that the theorem holds for a PIM mutation matrix Q, which we assume positive, and using the change-of-measure approach, i.e. Theorem 3.1.
First, we prove the following lemma on the asymptotic properties of the sequences that constitute the Radon–Nikodym derivative of the change of measure from P to Q appearing in Theorem 3.1. We use Theorem 2.1 for the sequence Z_Q^(n) and the asymptotic results in [9] for the sampling probabilities. Remark 5.1. For the following proofs it is useful to recall that, for a sequence of random variables that converges in distribution, convergence of the L^1-norms occurs if and only if the sequence is uniformly integrable.
To prove (iii), first recall that, by [9, Thm 4.3], n^{d−1} p_P(n y^(n)) → p̃_P(y), as y^(n) → y, where p̃_P is the stationary density of the Wright–Fisher diffusion that is dual to the Kingman coalescent; the same holds with Q in place of P. By combining the two convergences above and using the characterisation of convergence in probability in terms of almost surely converging subsequences, it is straightforward to conclude that r^(n) converges in probability. Proving (iv) is rather straightforward. In fact, since c_{P,Q} is a continuous function and since, by Theorem 2.1, M_Q^(n) converges weakly, the continuous mapping theorem applies. Combining this with (iii) and using Cramér–Slutsky's theorem proves (iv).
Finally, (v) holds since the sequence c_{P,Q}(M_Q^(n)(t)) converges in distribution to c_{P,Q}(M_Q(t)), as n → ∞, by (iv), and the expectations are constantly equal to 1, by (i) and (ii); thus, by Remark 5.1, the sequence is uniformly integrable.
We now prove Theorem 2.1 for a general mutation matrix P , using properties (iii) and (v) of Lemma 5.2, Theorem 3.1 and the PIM version of Theorem 2.1 (applied to Q).
Fix a bounded continuous function g. By Theorem 3.1, the expectation of g under P can be rewritten as a reweighted expectation under Q, see (5.8). Since the weights converge to 1 in probability, as n → ∞, by Lemma 5.2 (iii), Cramér–Slutsky's theorem implies the convergence in distribution. Furthermore, the sequence {G^(n)(t) r^(n)} is uniformly integrable by Lemma 5.2 (v), since G^(n)(t) is bounded, which completes the proof of Theorem 2.1.

A.1 Backwards transition probabilities

The implicit expression (A.1) for the backwards transition probabilities holds in general, see e.g. [5]. The expression can be obtained from the expression of the forward transition probabilities by employing Bayes' rule. In the PIM case, it is known that π can be written explicitly, which gives an explicit expression also for the backwards transition probabilities, given in (5.1).

A.2 Metric state-spaces and spaces of continuous functions
The goal of this section is to analyse the metric space (E_1, ψ_1), in order to show that compact sets are bounded away from the boundary Ω_0 (Remark A.2), and to characterise continuous functions on the metric space (E, ψ) (Remark A.3), together with the properties of compact support and vanishing at infinity (Remark A.4). Characterisations in terms of the Euclidean metric are provided.

Remark A.2 (Compact sets).
A compact set in (E_1, ψ_1) is a closed set which is bounded away from Ω_0. More precisely, a set K ⊂ (E_1, ψ_1) is compact if and only if K is closed in (E_1, ψ_1) and there exists δ > 0 such that y_j > δ for all y ∈ K and j = 1, …, d.
Furthermore, real-valued continuous functions on (E 1 , ψ 1 ) are simply continuous functions on (E 1 \ Ω ∞ , • 2 ) that can be continuously extended on Ω ∞ , as specified in the following.

Remark A.3 (Continuous functions).
A function f : (E_1, ψ_1) → R is continuous if and only if the two following conditions are satisfied: f is continuous on (E_1 \ Ω_∞, ‖·‖_2), and f can be continuously extended on Ω_∞. Since N^{d^2} is a discrete space, the above characterisations can be easily extended to (E, ψ). We focus now on spaces of continuous functions on (E, ψ). In particular, a continuous function on (E, ψ) is simply a function which is continuous with respect to its component in E_1 in the sense of Remark A.3. We say that a function vanishes at infinity if, given any ε > 0, there exists a compact set K_ε ⊂ E such that |f(z)| < ε for all z ∈ K_ε^c. Since the component in (E_1, ψ_1) of a compact set is bounded away from Ω_0 and the component in N^{d^2} is bounded, a corresponding characterisation in terms of the Euclidean metric follows (Remark A.4). The characterisations of this section provide a description of the spaces of continuous functions on (E, ψ) vanishing at infinity, Ĉ(E), and of smooth functions on (E, ψ) with compact support, C_c^∞(E). Furthermore, note that C_c^∞(E) is dense in Ĉ(E), which is dense in the space of bounded continuous functions on (E, ψ). Moreover, all functions in C_c^∞(E) and Ĉ(E) are uniformly continuous, while functions in C_c^∞(E), and all of their derivatives, are Lipschitz continuous.
Finally, we characterise continuous functions on the extended state space E^∆ = E ∪ {∆}, which is the one-point, or Alexandroff, compactification of E, equipped with the standard topology inherited from the topology of E. Since E is a separable, locally compact metric space, its one-point compactification is metrizable, see e.g. [21]. Let (E^∆, ψ^∆) be the corresponding metric space.
Note that, while using the metrics ψ and ψ ∆ allows for a more compact formulation, all the statements concerning continuous functions on E or E ∆ , can be expressed in terms of the Euclidean metric by using the remarks of this section.

A.3 Infinitesimal generator and semigroup of the limiting process (PIM)
In this section, under the assumption of parent independent mutations, we show that the semigroup T, defined in (5.3), is a positive, strongly continuous contraction semigroup, and that the infinitesimal generator A, associated to the semigroup T, is given by (2.3). If (y, m) ∈ K_ε, consider the inequality bounding the increment of the semigroup, which is bounded by ε if t < η″_ε, for some η″_ε > 0 which only depends on δ_ε, not on y. Furthermore, using (A.3), and again that f is bounded, yields a quantity which is bounded by ε if t < η‴_ε, for some η‴_ε > 0 which only depends on δ_ε, not on y. Therefore, if t < min{η′_ε, η″_ε, η‴_ε}, the supremum over (y, m) is bounded by a multiple of ε, which proves the strong continuity of T.

Infinitesimal generator
Let A be the operator defined as in (2.3): for f ∈ C_c^∞(E), A f(y, m) is the sum of a drift term, involving −∇_y f(y, m) paired with the drift of the deterministic component, and of the jump terms Σ_{i,j=1}^d [f(y, m + e_ij) − f(y, m)] λ_ij(y), where λ_ij denote the intensities of the mutation components.

This allows us to characterise closed sets and compact sets of (E_1, ψ_1) in terms of the Euclidean metric. (Closed sets.) A set C ⊂ (E_1, ψ_1) is closed if and only if ω(C) = {x ∈ S_1 : x = 1/y for some y ∈ C} is closed in (S_1, ‖·‖_2). Equivalently, C is closed if and only if one of the following conditions is satisfied: