Scaling Limit of the Fleming-Viot Multi-Colour Process

We consider the $N$-particle Fleming-Viot process associated to a normally reflected diffusion with soft catalyst killing. The Fleming-Viot multi-colour process is obtained by attaching genetic information to the particles in the Fleming-Viot process. We establish that, after rescaling time by $t\mapsto Nt$, this genetic information converges to the (very different) Fleming-Viot process from population genetics, as $N\rightarrow\infty$. An extension is provided to dynamics given by Brownian motion with hard catalyst killing at the boundary of its domain.


Introduction and main result
In this paper we study the behaviour of a system of interacting diffusion processes, known as a Fleming-Viot particle system, first introduced by Burdzy, Hołyst and March in [16]. We will establish that if one attaches genetic information to the Fleming-Viot particle system and rescales time by $t\mapsto Nt$, this genetic information evolves for large $N$ like the (very different) Fleming-Viot process from population genetics, which we refer to in this article as a Wright-Fisher process to avoid confusion. This is our main theorem, Theorem 1.4. We emphasise that, despite the shared name, no link had previously been established between the Fleming-Viot particle system (or any similar particle system) and the Wright-Fisher process.
Throughout this paper, $(X_t)_{0\leq t<\tau_\partial}$ will be a diffusion process evolving in the closure $\bar{D}$ of an open, connected, bounded domain $D\subseteq\mathbb{R}^d$, normally reflected at the $C^\infty$ boundary $\partial D$, and killed at position-dependent rate $\kappa(X_t)$ (soft killing). That is, prior to the killing time $\tau_\partial$, $X_t$ evolves according to the SDE

$$dX_t=b(X_t)\,dt+\sigma(X_t)\,dW_t+n(X_t)\,d\xi_t, \tag{1.1}$$

driven by an $m$-dimensional Brownian motion $W_t$, whereby $\xi_t$ is the boundary local time of $X_t$ at $\partial D$ and $n(x)$ is the unit interior normal at $x\in\partial D$. A precise definition of such processes is given in Appendix A. We assume throughout that $\kappa\in C^\infty(\mathbb{R}^d;\mathbb{R}_{\geq 0})$ and is strictly positive somewhere on $D$. We also assume that $b\in C^\infty(\mathbb{R}^d;\mathbb{R}^d)$ and $\sigma\in C^\infty(\mathbb{R}^d;\mathbb{R}^{d\times m})$, with $\sigma\sigma^T$ uniformly positive definite. The Fleming-Viot particle system is defined as follows.
Definition 1.1 (Fleming-Viot particle system). The Fleming-Viot particle system $(\vec{X}^N_t)_{t\geq 0}$ consists of $N\geq 2$ particles $\vec{X}^N_t=(X^{N,1}_t,\ldots,X^{N,N}_t)$, $t\geq 0$, evolving independently in the domain $\bar{D}$ according to (1.1). When a particle is killed we relocate it to the position of a different particle chosen independently and uniformly at random.
In general, it is not clear that the Fleming-Viot particle system is well-posed, due to the possibility of infinitely many jumps in finite time. In the present setting, however, this is not an issue as the killing rate is bounded. The Fleming-Viot particle system was introduced by Burdzy, Hołyst and March [16] in the case of Brownian dynamics with instantaneous killing at the boundary (hard killing), where it was shown to provide an approximation method both for the heat equation with Dirichlet boundary conditions and for the principal eigenfunction of the Dirichlet Laplacian. The Fleming-Viot particle system with soft killing was considered by Grigorescu in [27]. The Fleming-Viot particle system has been shown to provide a general approximation method for absorbed strong Markov processes by Villemonais [47], and has been shown to provide an approximation method for quasi-stationary distributions (QSDs) in a variety of settings [16, 2, 3, 45]. When a killed Markov process is Feller, quasi-stationary distributions correspond to left eigenmeasures of its infinitesimal generator [37, Proposition 4].

The Fleming-Viot multi-colour process
We attach genetic information ("colours") to the Fleming-Viot particle system, resulting in the Fleming-Viot multi-colour process, which was introduced by Grigorescu and Kang in [28, Section 5.1]. Whereas the colours in the construction of [28, Section 5.1] are assumed to belong to a finite space, the present article extends this by allowing the colours to belong to a complete, separable metric space. This space is referred to as the "colour space" and is denoted by $K$. The colour $\eta^i_t\in K$ gives the genetic information of the particle $X^i_t$, for $i=1,\ldots,N$. A precise definition of the Fleming-Viot multi-colour process is given by the following.

Definition 1.2 (Fleming-Viot multi-colour process). We take $(K,d)$ to be an arbitrary complete separable metric space, which we call the colour space. We define $(\vec{X}^N_t,\vec{\eta}^N_t)_{0\leq t<\infty}=\{(X^{N,i}_t,\eta^{N,i}_t)_{0\leq t<\infty}:i=1,\ldots,N\}$ as follows:

(i) Initial condition: $((X^{N,1}_0,\eta^{N,1}_0),\ldots,(X^{N,N}_0,\eta^{N,N}_0))\sim\upsilon^N\in\mathcal{P}((\bar{D}\times K)^N)$.
(ii) For $t\in[0,\infty)$ and between killing times the particles $(X^{N,i}_t,\eta^{N,i}_t)$ evolve and are killed independently according to (1.1) in the first variable, and are constant in the second variable.
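(iii) We write $\tau^i_k$ for the death times of particle $(X^{N,i},\eta^{N,i})$ (with $\tau^i_0:=0$). When particle $(X^{N,i},\eta^{N,i})$ is killed at time $\tau^i_k$ it jumps to the location of particle $(X^{N,j},\eta^{N,j})$, with $j=U^i_k\in\{1,\ldots,N\}\setminus\{i\}$ chosen independently and uniformly at random, at which time we set

$$(X^{N,i}_{\tau^i_k},\eta^{N,i}_{\tau^i_k}):=(X^{N,j}_{\tau^i_k-},\eta^{N,j}_{\tau^i_k-}). \tag{1.2}$$

Moreover we write $\tau_n$ for the $n$-th time at which any particle is killed (with $\tau_0:=0$). We then define

$$J^N_t:=\frac{1}{N}\sup\{n\geq 0:\tau_n\leq t\} \tag{1.3}$$

to be the number of deaths up to time $t$ normalised by $\frac{1}{N}$, and define the empirical measures

$$m^N_t:=\frac{1}{N}\sum_{i=1}^N\delta_{X^{N,i}_t},\qquad \chi^N_t:=\frac{1}{N}\sum_{i=1}^N\delta_{\eta^{N,i}_t}. \tag{1.4}$$

We will obtain a scaling limit for the colours as $N\to\infty$ and time is rescaled according to $t\mapsto Nt$. We now describe the scaling limit we will obtain.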
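The construction in Definition 1.2 can be illustrated numerically. The following is a minimal sketch, not the construction used in the proofs: it assumes $D=(0,1)$, $b\equiv 0$, $\sigma\equiv 1$, a hypothetical killing rate `kappa`, an Euler time step `dt` with folding at the boundary in place of exact normal reflection, and initial colours equal to the particle labels. All function names are illustrative.

```python
import math
import random


def kappa(x: float) -> float:
    """Hypothetical smooth, bounded, non-negative killing rate on [0, 1]."""
    return 1.0 + math.cos(math.pi * x)


def reflect(x: float) -> float:
    """Fold a point back into [0, 1] (a crude stand-in for normal reflection)."""
    x = abs(x) % 2.0
    return 2.0 - x if x > 1.0 else x


def simulate_multicolour(n_particles, t_end, dt=1e-3, seed=0):
    """Euler sketch of reflected Brownian particles on [0, 1] with soft killing.

    Returns the final positions, the final colours and the total number of deaths.
    """
    rng = random.Random(seed)
    positions = [rng.random() for _ in range(n_particles)]
    colours = list(range(n_particles))  # assumption: particle i starts with colour i
    deaths = 0
    for _ in range(int(round(t_end / dt))):
        for i in range(n_particles):
            # Spatial motion only: colours are constant between killing times.
            step = math.sqrt(dt) * rng.gauss(0.0, 1.0)
            positions[i] = reflect(positions[i] + step)
        for i in range(n_particles):
            # Kill with probability approximately kappa * dt ...
            if rng.random() < kappa(positions[i]) * dt:
                # ... and jump onto a different particle chosen uniformly at random.
                j = rng.randrange(n_particles - 1)
                if j >= i:
                    j += 1
                positions[i], colours[i] = positions[j], colours[j]
                deaths += 1
    return positions, colours, deaths
```

For instance, `simulate_multicolour(50, 5.0)` returns the final configuration together with the death count; dividing the latter by `n_particles` gives the number of deaths normalised by $\frac{1}{N}$, and counting the entries of `colours` and dividing by $N$ gives the empirical measure of the colours. The set of surviving colours can only shrink over time, since a killed particle always inherits a colour that is already present.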

The Wright-Fisher process
Given a gene with two neutral alleles, $a$ and $A$, the SDE $dp_t=\sqrt{p_t(1-p_t)}\,dW_t$ models the evolution of the proportion $p_t\in[0,1]$ of individuals carrying the $a$-allele in a large population. This is the classical Wright-Fisher diffusion. Generalising this to $n$ alleles, the driftless $n$-type Wright-Fisher diffusion process of rate $\theta>0$ takes values in the simplex $\Delta_n:=\{p=(p_1,\ldots,p_n)\in\mathbb{R}^n_{\geq 0}:\sum_j p_j=1\}$ and is characterised by the generator

and the unique right eigenfunction of eigenvalue $-\lambda$, up to rescaling. Throughout we normalise $\phi$ so that $\langle\pi,\phi\rangle=1$. We may therefore define the constant $\Theta$ by (1.7). We define the tilted empirical measure of the colours, denoted as $(Y^N_t)_{0\leq t<\infty}$, by

$$Y^N_t:=\frac{\frac{1}{N}\sum_{i=1}^N\phi(X^{N,i}_t)\,\delta_{\eta^{N,i}_t}}{\frac{1}{N}\sum_{i=1}^N\phi(X^{N,i}_t)}\in\mathcal{P}(K). \tag{1.8}$$

Whereas consideration of this quantity shall play a crucial role in our proof, for the purposes of our theorem statement its role is to provide the initial condition of our scaling limit. To the authors' knowledge, this process is original. The proof of Theorem 1.4 shall be outlined in Subsection 1.7, at which point we shall explain the role of $Y^N_t$ in the proof.

Convergence will be stated in terms of the weak atomic metric on $K$, denoted as $W_a$. The space of probability measures on $K$ equipped with the weak atomic metric is denoted by $\mathcal{P}_{W_a}(K)$. This metric was introduced by Ethier and Kurtz [23] in the context of population genetics. Convergence in the weak atomic metric is equivalent to having both weak convergence of measures and convergence of the sizes and locations of the atoms. We provide a definition of the weak atomic metric in Appendix C.2.
Our main theorem is then the following.
Theorem 1.4. We take some deterministic initial profile $\nu^0\in\mathcal{P}(K)$ and fix a Wright-Fisher process on $\mathcal{P}(K)$ of rate $\Theta$ and initial condition $\nu_0=\nu^0$, which we denote as $(\nu_t)_{0\leq t<\infty}$. We consider a sequence of Fleming-Viot multi-colour processes, denoted by $((\vec{X}^N_t,\vec{\eta}^N_t))_{t\geq 0}$, satisfying (1.9). We now rescale time according to $t\mapsto Nt$. Then $(\chi^N_{Nt})_{t>0}$ converges to $(\nu_t)_{t>0}$ in finite-dimensional distributions, in the following sense. We fix arbitrary $n<\infty$ and $\mathbf{t}=(t_1,\ldots,t_n)\in[0,\infty)^n$ such that $t_1\leq\ldots\leq t_n$. We consider arbitrary sequences of times subject, in particular, to the requirement that $Nt^N_1\to\infty$ as $N\to\infty$. We then have that (1.10) holds.

Remark 1.5. If we take constant killing rate $\kappa\equiv 1$ and consider the corresponding Fleming-Viot multi-colour process, we recover the classical Moran model. This is well-known to converge to the Wright-Fisher process of rate 2 [22, (4.12)]. On the other hand, we can check that $\Theta=2$ when $\kappa\equiv 1$.
Remark 1.6. We observe that, unless $\phi$ is constant (which only happens if $\kappa$ is constant on $D$), the empirical measures $\chi^N_0$ will, in general, not converge to the same limit as the tilted empirical measures $Y^N_0$. We therefore no longer have (1.10) if we drop the requirement that $Nt^N_1\to\infty$ as $N\to\infty$. This represents the following separation of timescales phenomenon.
We will establish in the proof of Theorem 1.4 that the tilted empirical measure $Y^N_t$ evolves slowly over an $O(N)$ timescale, with $(Y^N_{Nt})_{t\geq 0}$ converging to the Wright-Fisher process. We further establish that the empirical measure $\chi^N_t$ converges on a shorter $O(1)$ timescale to the tilted empirical measure $Y^N_t$. Theorem 1.4 then follows by combining these two facts. Therefore for large $N$, the empirical measure $\chi^N_t$ quickly approaches $\nu^0$ over an $O(1)$ timescale, before evolving like the Wright-Fisher process over the longer $O(N)$ timescale.

Background and related results
A similar separation of timescales has been obtained by Méléard and Tran in [36]. They considered the evolution of traits in a population of individuals, where the individuals give birth (passing on their trait) and die in an age-dependent manner, and interact with each other through the effect of the common empirical measure of their traits upon their death rates (representing competition for resources). There the age component plays a similar role to spatial position in the present article. They found that the age component converges to a deterministic equilibrium (which depends upon the traits) on a fast timescale, whilst the trait distribution evolves on a slow timescale, converging to a certain superprocess over the slow timescale as the population size tends to infinity.
Aside from obtaining a different limiting process, they also employ a different proof strategy. In their setup, individuals give birth and are killed at rates which ensure that the slow component does not have large drift terms on the fast timescale, whereas it does in the present setup. This necessitates the different proof strategy. In Subsection 1.7 we shall outline the proof strategy of Theorem 1.4, at which point we shall elaborate on the difference between this proof and the proof in [36].
The ancestral paths of both the Fleming-Viot particle system and similar particle systems have been considered by a number of authors, for instance by Méléard and Tran [36], Grigorescu and Kang [28] and Burdzy et al. [7, 17, 18, 15]. None of these make a link with the Wright-Fisher process. In a sequel to the present paper, we shall use Theorem 1.4 to link the ancestral paths of the Fleming-Viot particle system with a Wright-Fisher process on $\mathcal{P}(C([0,T];\bar{D}))$. This link was included in the original preprint version of this paper [43], and earlier in the author's PhD thesis [42, Chapter 4].
In [28], Grigorescu and Kang constructed the immortal particle, also known as the spine, of the Fleming-Viot particle system: the unique ancestral path from time $0$ to time $\infty$. They introduced the Fleming-Viot multi-colour process, with the colours belonging to a finite set, in order to construct this process. The construction of the spine of the Fleming-Viot particle system was later extended to a very general setting by Bieniek and Burdzy [7, Theorem 3.1]. Bieniek and Burdzy [7, Section 5] established that, when the state space is finite, the distribution of the spine of the Fleming-Viot particle system converges as $N\to\infty$ to that of the driving Markov process $(X_t)_{0\leq t<\tau_\partial}$ conditioned never to be killed, referred to in the literature as the $Q$-process [19, Section 3]. They conjectured that this is also true for general state spaces [7, p.3752]. Since then, Burdzy, Kołodziejek and Tadić in [17, 18] have established a law of the iterated logarithm [18, Theorem 7.1] which, as they explain, hints that the conjecture of Bieniek and Burdzy should hold in the setting they consider. None of these articles draw a link with the Wright-Fisher process.
In a sequel to the present article, we shall prove Bieniek and Burdzy's conjecture [7, p.3752] in the setting of the present paper. This proof was included in the original preprint version of this paper [43], and earlier in the author's PhD thesis [42, Chapter 4]. This was the first proof of the conjecture outside of the finite state space setting. Subsequent to [42, Chapter 4] and [43], Burdzy and Engländer have established this conjecture in [15], when the driving process is Brownian motion killed at the boundary of its bounded domain. We emphasise that the proof strategy due to Burdzy et al. in [7, 15] is completely different to the proof due to the present author in [42, Chapter 4] and [43], with no connection being made between the Fleming-Viot particle system and the Wright-Fisher process in [7, 15].

Bieniek and Burdzy's proof when the state space is finite [7, Section 5] used the finiteness of the state space in a seemingly essential way: they used the fact that if two particles are at the same location they must have the same probability of being the spine, and moreover the particles can only be at a finite number of possible locations. Burdzy and Engländer were able to use the same argument in [15] when the driving process is Brownian motion killed at the boundary of its domain, by dividing the domain up into cubes and using the form of the multidimensional Gaussian distribution to argue that any two particles in the same cube must have almost the same probability of being the spine. On the other hand, the proof appearing in [42, Chapter 4] and [43], which will appear in a sequel to the present article, instead leverages the connection between the Fleming-Viot particle system and the Wright-Fisher process established in Theorem 1.4.
The $N$-branching Brownian motion ($N$-BBM) consists of $N$ particles evolving in between killing times as independent Brownian motions. At rate $N$, one kills the particle minimising or maximising a given fixed function. At the same time, as with the Fleming-Viot particle system, another particle chosen uniformly at random branches, so that the number of particles remains fixed. Clearly this particle system is similar to the Fleming-Viot particle system. Particle systems of this form were first introduced by Brunet and Derrida in [10]. Such particle systems have been studied, for instance, by Brunet and Derrida [11], Durrett and Remenik [21], Maillard [34], and Berestycki, Brunet, Nolen and Penington [6]. The genealogy of these particle systems has received particular attention; see also the work of Brunet, Derrida, Mueller and Munier [12, 13], Mallein [35] and Penington, Roberts and Talyigás [39].
For the $N$-BBM studied in [34], the particles are in 1 dimension with the leftmost particle being killed at each killing time. It is a hard open problem to show that the genealogy of this particle system is given by a Bolthausen-Sznitman coalescent [34, p.1066], so we should not expect a Wright-Fisher process scaling limit. This conjecture has been proven for the similar near-critical branching Brownian motion by Berestycki, Berestycki and Schweinsberg in [5]. On the other hand, in the "Brownian bees" particle system considered in [6], it is the particle furthest away from $0$ which is killed. In contrast to the $N$-BBM, we should expect this particle system to have a Wright-Fisher process limit after rescaling time by $t\mapsto Nt$ as in Theorem 1.4, in the opinion of the present author. The key distinction between these two Brunet-Derrida-type particle systems is that the killing mechanism in the latter has the effect of constraining the mass of particles. However, the genealogy of the Brownian bees particle system has not yet been addressed, nor has a Wright-Fisher process limit previously been established for any variant of this particle system.
A scaling limit for the genealogy of a sequential Markov chain Monte Carlo algorithm was established by Brown, Jenkins, Johansen and Koskela in [9, Theorem 3.2]. This captures the phenomenon of ancestral degeneracy, which has a substantial impact on the performance of the algorithm. They established that the genealogy of an $n$-particle sample converges to Kingman's $n$-coalescent as the number of particles goes to infinity and time is suitably rescaled. This is suggestive of a Wright-Fisher process, since Kingman's coalescent is dual to the Wright-Fisher process (see [31, Appendix A]), but no such connection is made.
In the engineering literature, Mulatier, Dumonteil, Rosso and Zoia [38] considered a particle system whereby $N$ Brownian particles branch at a rate $\lambda$, at which point another particle chosen uniformly at random is removed, conserving the number of particles. Clearly this is very similar to the Fleming-Viot particle system, with the difference being that here particle births trigger another particle chosen uniformly at random to be killed, rather than vice versa. This is used as a toy model for neutrons in a nuclear reactor and their Monte Carlo simulation. They investigated the phenomenon of "clustering", in which particles cluster together in Monte Carlo simulations of nuclear reactors, which has a substantial impact on the accuracy of these simulations. They explained this phenomenon as occurring when particle ancestries coalesce more quickly than particles are able to explore the space. They argued that this should occur on a timescale of $\frac{N}{\lambda}$. However, it is unknown how quickly ancestries coalesce for such systems (when the branching rate is non-constant), even at the level of a conjecture. It should be straightforward to replicate the proof in the present paper for these systems, thereby quantifying how quickly ancestries coalesce via an analogue of Theorem 1.4. This would indicate how large $N$ should be to avoid clustering. We will see in the following subsection that ancestral coalescence occurs more quickly when $\phi$ is non-constant (but $N$ and $\lambda$ are the same), so that a larger $N$ would be needed to avoid clustering.

Effective population size
In population genetics, variance effective population size refers to the size of an idealised, spatially unstructured population with the same genetic drift per generation. For a variety of reasons, this effective population size is generally observed to be considerably less than the census population size [25].
We recall that $(\pi,-\lambda,\phi)$ is the principal eigentriple of the infinitesimal generator $L$. We obtained in Theorem 1.4 that, after rescaling time by $t\mapsto Nt$, the Fleming-Viot multi-colour process converges to a Wright-Fisher process of rate $\Theta$, the constant defined in (1.7). It is straightforward to combine Theorem 1.8 with Theorem A.1 to establish that individuals in the Fleming-Viot multi-colour process die, on average, $\lambda$ times per unit time. If we remove space, and instead assume that each individual is killed at fixed Poisson rate $\kappa\equiv\lambda$, we obtain the classical static Moran model. We therefore define the variance effective population size here to be the size of an equivalent static Moran model.
The Wright-Fisher process is well-known to arise as the limit of suitably rescaled Moran models [22, (4.12)]. If we let $(\vec{\eta}^{\mathrm{Moran},N}_t)_{t\geq 0}$ be this Moran model with $N$ individuals, then for the appropriate constant $c$ we have that $\vec{\eta}^{\mathrm{Moran},\lfloor cN\rfloor}_{Nt}$ converges to a Wright-Fisher process of rate $\Theta$, which gives the effective population size $N_{\mathrm{eff}}$ of (1.11). We observe that $N_{\mathrm{eff}}\leq N$, with equality if and only if $\phi$ is constant on $D$, which is equivalent to $\kappa$ being constant on $D$.

We offer the following heuristic interpretation of (1.11). We have from Theorem A.1 that

On the other hand, the profile of the particles in the Fleming-Viot particle system will settle upon a close approximation of $\pi$. Therefore if $\|\phi\|_{L^2(\pi)}$ is much larger than $\|\phi\|_{L^1(\pi)}$, then a small subset of individuals at any given time should be expected to subsequently survive for much longer than the average. These individuals will therefore have far more children than the average, having the effect of speeding up the coalescence time, hence reducing the effective population size.
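As a finite-state illustration of this heuristic, the following sketch computes the principal eigentriple $(\pi,-\lambda,\phi)$ of a killed continuous-time Markov chain by power iteration and evaluates the ratio $\|\phi\|^2_{L^1(\pi)}/\|\phi\|^2_{L^2(\pi)}$, which is at most $1$ by the Cauchy-Schwarz inequality, with equality when $\phi$ is constant. The three-state chain used in the usage note, the step `h` and the function names are assumptions made for illustration, and reading this ratio as $N_{\mathrm{eff}}/N$ is an assumption of the sketch suggested by the comparison of the two norms, not a formula quoted from the text.

```python
def principal_eigentriple(q, kappa, h=0.01, n_iter=20000):
    """Power iteration for the sub-Markov kernel P = I + h * (Q - diag(kappa)).

    q:     conservative generator of the unkilled chain (rows sum to zero)
    kappa: killing rate in each state
    h:     step size, assumed small enough that every entry of P is non-negative
    Returns (pi, lam, phi) with pi a probability vector and <pi, phi> = 1.
    """
    n = len(q)
    p = [[h * q[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        p[i][i] += 1.0 - h * kappa[i]

    pi = [1.0 / n] * n
    phi = [1.0] * n
    growth = 1.0
    for _ in range(n_iter):
        new_pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]    # pi P
        new_phi = [sum(p[i][j] * phi[j] for j in range(n)) for i in range(n)]  # P phi
        growth = sum(new_pi)  # surviving mass per step, approximately 1 - h * lam
        pi = [v / growth for v in new_pi]
        scale = max(new_phi)
        phi = [v / scale for v in new_phi]

    lam = (1.0 - growth) / h
    pairing = sum(a * b for a, b in zip(pi, phi))
    phi = [v / pairing for v in phi]  # normalise so that <pi, phi> = 1
    return pi, lam, phi


def norm_ratio(pi, phi):
    """||phi||_{L^1(pi)}^2 / ||phi||_{L^2(pi)}^2, at most 1 by Cauchy-Schwarz."""
    l1 = sum(w * abs(v) for w, v in zip(pi, phi))
    l2_squared = sum(w * v * v for w, v in zip(pi, phi))
    return l1 * l1 / l2_squared
```

With the hypothetical generator `q = [[-1, 1, 0], [1, -2, 1], [0, 1, -1]]`, a constant killing rate gives a constant $\phi$ and ratio $1$, whereas killing concentrated in the third state, `kappa = [0, 0, 3]`, gives a ratio strictly below $1$.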

A hydrodynamic limit theorem for the Fleming-Viot multi-colour process
Both the proof of Theorem 1.4 and our heuristic explanation of it will make use of the following hydrodynamic limit theorem for the Fleming-Viot multi-colour process. The hydrodynamic limit we obtain is given by the laws of the following killed Markov process.

Definition 1.7. We define a $\bar{D}\times K$-valued killed strong Markov process, denoted by $((X_t,\eta_t))_{0\leq t<\tau_\partial}$, as follows. The process evolves in the first variable like the killed normally-reflected diffusion $(X_t)_{0\leq t<\tau_\partial}$ defined in (1.1), with the killing time of $((X_t,\eta_t))_{0\leq t<\tau_\partial}$ being the same as the killing time of $(X_t)_{0\leq t<\tau_\partial}$. In the second variable $\eta_t$ is a constant element of $K$ up to the killing time $\tau_\partial$, so that $\eta_t=\eta_0$ for all $0\leq t<\tau_\partial$. After the killing time the process is sent to a fixed cemetery state.
Theorem 1.8. We consider the Fleming-Viot multi-colour process $((\vec{X}^N_t,\vec{\eta}^N_t))_{t\geq 0}$ for $N\geq 2$. Then there exist constants $C_{T,N}$ for $0\leq T<\infty$ and $N\geq 2$ such that $C_{T,N}\to 0$ as $N\to\infty$, and such that for any initial condition $(\vec{X}^N_0,\vec{\eta}^N_0)$ and any $f\in\mathcal{B}_b(\bar{D}\times K;\mathbb{R})$, we have that

Proof of Theorem 1.8. We take the Fleming-Viot particle system associated to the killed strong Markov process $((X_t,\eta_t))_{0\leq t<\tau_\partial}$ defined in Definition 1.7 (which is well-defined since the killing rate is bounded). We observe that its dynamics are identical to those of the Fleming-Viot multi-colour process $(\vec{X}^N_t,\vec{\eta}^N_t)_{t\geq 0}$ associated to $(X_t)_{0\leq t<\tau_\partial}$. We are therefore able to apply [47, Theorem 2.2] to the Fleming-Viot multi-colour process.
We prove in the appendix that $((X_t,\eta_t))_{0\leq t<\tau_\partial}$ has the following large-time limit.

Proposition 1.9. For arbitrary sequences $(x_i,\eta_i)_{1\leq i\leq n}$ in $\bar{D}\times K$ we consider the process $(X_t,\eta_t)_{0\leq t<\tau_\partial}$ with initial distribution given by the empirical measure (1.14)

Heuristics for the proof of Theorem 1.4
The principal difficulty to be addressed
Méléard and Tran considered the ancestries of a similar particle system in [36]. There the individuals in the population have a trait and an age, with the individuals giving birth (passing on their trait) and dying in an age-dependent manner. The age component plays a similar role to spatial position in the present article. However, aside from obtaining a different scaling limit, they also employed a different proof strategy. The proof of Méléard and Tran in [36] extended to the particle system setting the strategy of Kurtz [30] and Ball, Kurtz, Popovic and Rempala [4], which concerned diffusions. In contrast, the proof in the present article extends to the particle system setting techniques of Katzenberger [29], which also concerned diffusion processes (the author is not aware of this technique previously having been extended to the particle system setting). This is necessitated by the following qualitative difference between the two particle systems.
In [36], individuals have a trait $x$ (the slow variable) and an age $a$ (the fast variable). The speed-up of the timescale is given by the parameter $n$. On the fast timescale, they give birth at rate $nr(x,a)+b(x,a)$, whilst dying at rate $nr(x,a)+d(x,a)$. We observe that the fast term, $nr(x,a)$, is the same in both rates. Consequently, when they formulate the corresponding martingale problem, the slow variable does not have a large drift term on the fast timescale. The terms $b(x,a)$ and $d(x,a)$ may change quickly due to the fast evolution of the age term $a$ (this is dealt with via averaging), but they remain $O(1)$ on the fast timescale.
We may contrast this with the Fleming-Viot multi-colour process. We recall that the time change is $t\mapsto Nt$. We consider a test function $f\in C_b(K)$ and observe that on the fast timescale the empirical measure of the colours evaluated against $f$, $\chi^N_{Nt}(f)$, satisfies

We see that the drift term is of $O(N)$ on the fast timescale. In particular the change in position of an individual particle has an $O(1)$ effect on the drift. A large deviations principle for the Fleming-Viot multi-colour process would provide controls on the drift valid over a sufficiently large timescale (an LDP for the Fleming-Viot particle system driven by Brownian motion with soft killing was established by Grigorescu in [27]), but would only control the drift on a fast timescale to $O(N)$. Since microscopic fluctuations in the position of individual particles have an $O(1)$ effect on the drift, there would not seem to be any hope of obtaining adequate controls on the drift term in order to apply a compactness-uniqueness argument (in which one characterises the martingale problem solved by subsequential limits).
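The key idea allowing us to deal with these large drift terms will be to consider the tilted empirical measure $Y^N_t$, which we recall was given in (1.8) as

$$Y^N_t:=\frac{\frac{1}{N}\sum_{i=1}^N\phi(X^{N,i}_t)\,\delta_{\eta^{N,i}_t}}{\frac{1}{N}\sum_{i=1}^N\phi(X^{N,i}_t)}.$$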

Motivation for choosing $Y^N_t$
We take inspiration from Katzenberger's approach in [29]. Consider a dynamical system in Euclidean space, $\dot{x}_t=b(x_t)$, with an attractive manifold of equilibria $M$ and flow map $\varphi(x,s)$. Katzenberger [29] established (under reasonable conditions) that the long-term dynamics of the randomly perturbed dynamical system,

$$dx^\epsilon_t=b(x^\epsilon_t)\,dt+\epsilon\,dW_t,$$

can be obtained by considering the following nonlinear projection onto the manifold of equilibria,

$$\varpi(x):=\lim_{s\to\infty}\varphi(x,s). \tag{1.17}$$

We summarise Katzenberger's idea as follows. Since $\nabla\varpi\cdot b\equiv 0$, the Stratonovich chain rule implies that $d\varpi(x^\epsilon_t)=\epsilon\nabla\varpi(x^\epsilon_t)\circ dW_t$. In particular, the large drift term has been eliminated from the above expression. We may rescale time to see that $\varpi(x^\epsilon_{t/\epsilon^2})$ satisfies $d\varpi(x^\epsilon_{t/\epsilon^2})=\nabla\varpi(x^\epsilon_{t/\epsilon^2})\circ d\tilde{W}_t$, whereby $\tilde{W}_t$ is the Brownian motion $\tilde{W}_t:=\epsilon W_{t/\epsilon^2}$. Since the dynamical system will be pushed towards the attractive manifold of equilibria on a fast timescale, one can then argue that

$$x^\epsilon_{t/\epsilon^2}\approx\varpi(x^\epsilon_{t/\epsilon^2}). \tag{1.18}$$

We can therefore obtain a scaling limit for $x^\epsilon_{t/\epsilon^2}$. This scaling limit is a diffusion on $M$.
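A toy instance of this projection may help fix ideas. The sketch below assumes the planar drift $b(x)=x(1-|x|^2)$, which is our choice for illustration and not an example taken from [29]: every non-zero point flows to the unit circle, the long-time limit of the flow is the radial projection $x\mapsto x/|x|$, and this projection does not change along the drift, which is the cancellation $\nabla\varpi\cdot b\equiv 0$ that removes the large drift term.

```python
import math


def drift(x, y):
    """Assumed drift b(x) = x * (1 - |x|^2): the unit circle attracts every non-zero point."""
    r2 = x * x + y * y
    return x * (1.0 - r2), y * (1.0 - r2)


def flow(x, y, s, dt=1e-3):
    """Explicit Euler approximation of the flow map at time s started from (x, y)."""
    for _ in range(int(round(s / dt))):
        bx, by = drift(x, y)
        x, y = x + dt * bx, y + dt * by
    return x, y


def projection(x, y):
    """Closed form of the long-time limit of the flow: radial projection onto the circle."""
    r = math.hypot(x, y)
    return x / r, y / r
```

Here `flow(0.3, 0.4, 20.0)` agrees with `projection(0.3, 0.4)` up to a small numerical error, and moving a point a short distance along `drift` leaves its projection unchanged, since the drift is radial.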
Whilst Katzenberger's results in [29] were restricted to finite dimensions, we may ask what the analogue of $\varpi(x^\epsilon_t)$ is in the present setting. We will see that $Y^N_t$ can be thought of as being analogous to the quantity $\varpi(x^\epsilon_t)$ considered by Katzenberger. We denote by $((X_t,\eta_t))_{0\leq t<\tau_\partial}$ the killed Markov process defined in Definition 1.7. It follows from Theorem 1.8 that we can think of the Fleming-Viot multi-colour process as a random perturbation of the dynamical system with flow map

Proposition 1.9 provides for the large-time limits of this flow. We therefore see from Proposition 1.9 that the analogue of $\varpi(x^\epsilon_t)$ is given by

We discard $\pi$ since it is constant, leaving only $Y^N_t$.

There is a second heuristic reason for examining $Y^N_t$. If $x(t)$ and $y(t)$ satisfy the ODEs $\dot{x}=c(t)x$ and $\dot{y}=c(t)y$ for the same $c(t)$, then $\frac{y(t)}{x(t)}$ is constant. If we now instead consider the SDEs $dX_t=c_tX_t\,dt+\epsilon\,dW_t$ and $dY_t=c_tY_t\,dt+\epsilon\,dW_t$, then $\frac{Y_t}{X_t}$ will satisfy an SDE with only $O(\epsilon^2)$ drift terms, since the $O(1)$ terms will cancel out as in the deterministic case (one can check this using Itô's lemma).
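The deterministic half of this second heuristic is easy to check numerically. The sketch below assumes a hypothetical rate function `example_rate` and an explicit Euler discretisation; both coordinates are multiplied by the same factor at each step, so their ratio is preserved up to rounding error.

```python
import math


def example_rate(t):
    """Hypothetical time-dependent rate c(t), chosen only for illustration."""
    return math.sin(5.0 * t) - 2.0


def ratio_along_flow(x0, y0, c, t_end, dt=1e-3):
    """Euler steps for x' = c(t) x and y' = c(t) y; returns the ratios y/x at each step."""
    x, y = x0, y0
    ratios = [y / x]
    for k in range(int(round(t_end / dt))):
        growth = 1.0 + dt * c(k * dt)  # common multiplicative factor for x and y
        x, y = x * growth, y * growth
        ratios.append(y / x)
    return ratios
```

For example, every entry of `ratio_along_flow(2.0, 3.0, example_rate, 3.0)` equals $1.5$ up to floating-point error, even though $x$ and $y$ themselves decay substantially over the same time interval.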
We now define for $E\in\mathcal{B}(K)$ the following, which shall be used throughout the proof of Theorem 1.4,

The important point is that, to leading order, both $P^{N,E}$ and $Q^N$ evolve with drift terms proportional to themselves, with the same constant of proportionality. Indeed on the slow timescale the killed process $X_t$ satisfies $d\phi(X_t)=L\phi(X_t)\,dt+\text{martingale terms}=-\lambda\phi(X_t)\,dt+\text{martingale terms}$.
Therefore between jumps, and including the process of killing the particles, the quantities $P^{N,E}_t$ and $Q^N_t$ evolve with drift terms $-\lambda P^{N,E}_t\,dt$ and $-\lambda Q^N_t\,dt$ respectively. Furthermore if particle $(X^{N,i},\eta^{N,i})$

This occurs at Poisson rate $\kappa(X^i_t)$. Thus, after the time-change $t\mapsto Nt$, we can write

(1.21)

In particular, on the fast timescale given by $t\mapsto Nt$, $P^{N,E}_{Nt}$ and $Q^N_{Nt}$ both evolve with drift proportional to themselves, with the same constant of proportionality given by

We observe that the change in position of an individual particle has an $O(1)$ effect on the constant of proportionality (1.22). However these large effects cancel out by placing the normalisation at microscopic scale in the denominator, as the constant of proportionality in both the numerator and denominator must be the same.
From these considerations we see that, having rescaled time by $t\mapsto Nt$, $Y^N_{Nt}$ should satisfy an SDE with $O(1)$ drift terms. It is straightforward to see that the martingale terms will have $O(1)$ quadratic variation on this timescale. It follows that $Y^N_{Nt}$ should be susceptible to a compactness-uniqueness argument, in which we establish tightness before uniquely characterising subsequential limits by characterising their drift and quadratic variation. We shall thereby obtain a scaling limit for $Y^N_{Nt}$. We note that since the leading order terms in (1.21) cancel out, we shall need to calculate the "$O(1)\,dt$" higher order terms, a task which is responsible for much of the computational complexity in the proof of Theorem 1.4.

The relationship between $\chi^N_{Nt}$ and $Y^N_{Nt}$
The above will allow us to characterise the limit in distribution of $(Y^N_{Nt})_{t\geq 0}$. Our goal, however, is to characterise the limit in distribution of $(\chi^N_{Nt})_{t\geq 0}$. We would therefore like to relate $\chi^N_{Nt}$ and $Y^N_{Nt}$. The key observation here is that, on the original slow timescale, the colour of a particle and its spatial position become "independent" after an $O(1)$ time. To be more precise, for any given $A\subseteq K$, the spatial profile of particles whose colours belong to $A$ converges over an $O(1)$ timescale to the quasi-stationary distribution $\pi$, a deterministic profile. Thus for different subsets $A,B\subseteq K$, the number of particles with colours belonging to $A$ and $B$ may well be different, but the spatial profiles of the two sets of particles will be the same for large $N$. Since the particles corresponding to different colours have the same spatial profile, weighting the empirical measure of the colours according to the right eigenfunction evaluated at the corresponding spatial positions will have no effect. It follows that $\chi^N_t$ and $Y^N_t$ will be close after an $O(1)$ time.
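The comparison between the two measures can be made concrete for a finite configuration. The sketch below reads the tilted empirical measure as the empirical measure of the colours reweighted by $\phi$ at the particle positions and then renormalised; this reading, the weight function `phi` passed in, and the sample configurations in the usage note are assumptions for illustration.

```python
from collections import defaultdict


def empirical_measure(colours):
    """Empirical measure of the colours: each particle carries mass 1/N."""
    n = len(colours)
    chi = defaultdict(float)
    for c in colours:
        chi[c] += 1.0 / n
    return dict(chi)


def tilted_empirical_measure(positions, colours, phi):
    """Tilted measure: particle i carries mass phi(x_i) / sum_j phi(x_j) on its colour."""
    weights = [phi(x) for x in positions]
    total = sum(weights)
    tilted = defaultdict(float)
    for w, c in zip(weights, colours):
        tilted[c] += w / total
    return dict(tilted)
```

With the hypothetical weight `phi = lambda x: 1.0 + x`, the configuration with positions `[0.2, 0.2, 0.8, 0.8]` and colours `["a", "b", "a", "b"]` gives both colours the same spatial profile, so the tilted and untilted measures coincide. With positions `[0.2, 0.8]` and colours `["a", "b"]` the profiles differ, and the tilted measure puts mass $0.6$ rather than $0.5$ on `"b"`.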
On the fast timescale, $\chi^N_{Nt}$ will therefore be close to $Y^N_{Nt}$. This is analogous to the second step in Katzenberger's approach in [29], described above in (1.18).
The proof of Theorem 1.4 will therefore follow by establishing that $(Y^N_{Nt})_{t\geq 0}$ converges to the Wright-Fisher process, and showing that $\chi^N_{Nt}$ is close to $Y^N_{Nt}$.

Why is the limit a Wright-Fisher process?
It follows from the above heuristic that $\chi^N_t$ should evolve over an $O(N)$ timescale, and that $\chi^N_{Nt}$ should converge to some $\mathcal{P}(K)$-valued process (at least on subsequences). In the proof of Theorem 1.4, we will calculate that the limit is a Wright-Fisher process. However, it is not readily apparent from this why the limit should necessarily be a Wright-Fisher process. We offer here a heuristic argument for why we should expect the limit to be a Wright-Fisher process.
We let $(\nu_t)_{t\geq 0}$ be the limit to be determined of $(\chi^N_{Nt})_{t\geq 0}$ (perhaps along a subsequence). By the separation of timescales phenomenon described above, this will be a $\mathcal{P}(K)$-valued process, with the dependence on the spatial component "averaged out". We consider an arbitrary measurable map $\iota:K\to K$. We can think of $\iota$ as relabelling the colours. The key observation is that $\{(X^{N,i}_t,\iota(\eta^{N,i}_t)):1\leq i\leq N\}$ is itself a Fleming-Viot multi-colour process: the Fleming-Viot multi-colour process remains one after relabelling the colours. It follows that whatever dynamics $(\nu_t)_{t\geq 0}$ has, $(\iota_\#\nu_t)_{t\geq 0}$ must have the same dynamics. This allows us both to exchange colours and to relabel different colours as the same colour.
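The relabelling step can be checked directly on a finite configuration: relabelling each particle's colour by $\iota$ and then forming the empirical measure agrees with pushing the empirical measure forward under $\iota$. The sketch uses exact rational arithmetic; the colour names and the map `iota` in the usage note are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def empirical_measure(colours):
    """Empirical measure of the colours as a dict colour -> exact mass."""
    n = len(colours)
    measure = defaultdict(Fraction)
    for c in colours:
        measure[c] += Fraction(1, n)
    return dict(measure)


def pushforward(measure, iota):
    """Pushforward of a finitely supported measure under the relabelling map iota."""
    image = defaultdict(Fraction)
    for c, mass in measure.items():
        image[iota(c)] += mass
    return dict(image)
```

For instance, with colours `["red", "green", "blue", "red", "green", "red"]` and a map `iota` sending `"red"` and `"green"` to `"warm"` and `"blue"` to `"cold"`, both routes give mass $5/6$ to `"warm"`: two colours of masses $1/2$ and $1/3$ have been relabelled as a single colour whose mass is their sum.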
It follows that there should exist continuous functions $b,\sigma_{11}:[0,1]\to\mathbb{R}$ and $\sigma_{12}$ such that the following are continuous martingales for all disjoint

and

Moreover, since a colour of mass $p$ and a colour of mass $q$ can be relabelled to be a single colour of mass $p+q$, it is clear that

Furthermore, the whole colour space $K$ must have total mass $1$, so $\nu_t(K)\equiv 1$. From these considerations, we see that the only possibility is that, for some constant $\theta$,

We recognise the Wright-Fisher diffusion described in Subsection 1.2. In light of Proposition 1.3, it is therefore natural that our unknown limit $(\nu_t)_{t\geq 0}$ should be a Wright-Fisher process.

Hard catalyst killing
The setting of the present paper, in which the Fleming-Viot particle system is driven by diffusions with soft killing, has been chosen to establish the connection between the Fleming-Viot process and the Wright-Fisher process with a minimum of technical difficulties. Nevertheless, in Section 5 we will extend this connection to the original setting considered by Burdzy, Hołyst and March [16], in which the Fleming-Viot particle system is driven by Brownian motion with instantaneous killing at the boundary (hard killing). To avoid switching back and forth between Fleming-Viot particle systems with different dynamics, we will only consider the case of hard killing in Section 5, the final section prior to the appendix, and in Appendix E. Our results in the case of hard killing are therefore stated and proved in Section 5.
We emphasise that the proof strategy employed in the present paper may be applied to the Fleming-Viot particle system driven by more general killed Markov processes. The principal requirements to apply this proof strategy are that:
1. the driving killed Markov process (X t ) 0≤t<τ ∂ is Feller;
2. its infinitesimal generator has a positive, continuous and bounded principal right eigenfunction φ;
3. L µ (X t |τ ∂ > t) converges to a unique quasi-stationary distribution for any initial condition µ;
4. we can constrain the empirical measure of the spatial positions of the particles m N t to a tight set of measures over any O(N ) timescale, precluding in particular the mass from accumulating at the boundary.
In the case of hard killing at the boundary in a bounded domain, the main additional difficulty is to establish Requirement 4. We will obtain such controls for the Fleming-Viot particle system driven by Brownian motion with hard killing in Section 5. With these controls in hand, the extension of our results to this setting proceeds by essentially the same proof.
When the domain is unbounded, the situation is much more delicate. For diffusions on the positive real line R >0 with hard killing at 0, one could probably establish similar results for Ornstein-Uhlenbeck dynamics, using the strong negative drift to control the particles far away from 0 over an O(N ) timescale. For this process the principal right eigenfunction is unbounded (it is given by φ(x) = x), so strong controls on the mass of particles far away from 0 (where φ is large) over an O(N ) timescale would be required to replace the boundedness of φ. To be more precise, we would need to show, for any T < ∞, that sup 0≤t≤N T 2 is bounded by some uniform constant with probability arbitrarily close to 1, uniformly in N . On the other hand, we should not expect the Fleming-Viot particle system driven by Brownian motion with drift −1 to have a Wright-Fisher process scaling limit, this drift being too weak to adequately control the particles. Indeed, it is a hard open problem to show that the genealogy of the very similar N -BBM is given by a Bolthausen-Sznitman coalescent [34, p.1066].
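The claim that φ(x) = x is the principal right eigenfunction for Ornstein-Uhlenbeck dynamics can be verified directly. The check below assumes the normalisation dX t = −X t dt + dW t on (0, ∞), killed at 0; this normalisation is our choice for illustration and is not fixed by the text.

```latex
% Assumed generator of dX_t = -X_t\,dt + dW_t on (0,\infty), Dirichlet condition at 0:
\begin{align*}
  Lf(x) &= \tfrac{1}{2} f''(x) - x f'(x), \\
  \varphi(x) = x \;\Longrightarrow\;
  L\varphi(x) &= \tfrac{1}{2}\cdot 0 - x \cdot 1 = -\varphi(x).
\end{align*}
% Hence \varphi is a positive eigenfunction with eigenvalue -\lambda = -1,
% it vanishes at the killing boundary, \varphi(0) = 0, and it is unbounded as x \to \infty.
```

With a different drift or diffusion coefficient the eigenvalue changes by a constant factor, but the eigenfunction remains linear and hence unbounded.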

Structure of the paper
A summary of the notation which we shall need for our proof is given in Section 2. The proof of Theorem 1.4 shall rely on a number of calculations of the quantity Y N t , defined in (1.8).To avoid obscuring our proof with calculations, we will carry out these calculations in Section 3. We shall then prove Theorem 1.4 in Section 4. We will extend our results to the Fleming-Viot multi-colour process driven by Brownian motion with instantaneous killing at the boundary in Section 5. We conclude with the appendix.

Notation for the proof of Theorem 1.4
We recall from (1.20) that we define for E ∈ B(K), We recall the definition of m N t and χ N t from (1.4), and further define m N,E t for E ∈ B(K), We recall from Appendix A that the infinitesimal generator of the reflected diffusion with (respectively without) soft killing is denoted by L (respectively L 0 ). We further recall that the carré du champ operator of the latter is denoted by Γ 0 , and is defined on the algebra A. This algebra contains the principal right eigenfunction φ of L, by Theorem A.1. We further define (2.24)

O Notation
The following notation shall significantly simplify our calculations.
For any finite variation process (X t ) 0≤t<∞ we write V t (X) for the total variation Moreover for all càdlàg processes (X t ) 0≤t<∞ we write Given some family of random variables {X N : N ∈ N} and non-negative random variables {Y N : N ∈ N}, we say that X N = O(Y N ) if there exists a uniform constant C < ∞ such that |X N | ≤ CY N . Note that we shall abuse notation by using an equals sign, rather than an inclusion sign.
We now define the notion of process sequence class. Given sequences of processes {(X N t ) t≥0 : N ∈ N} and {(Y N t ) t≥0 : N ∈ N}, we say that: 1.
) if for all N ≥ N 0 (for some N 0 < ∞) and for some C < ∞, X N t is a finite variation process whose total variation is such that t , as process sequence classes. Note that as with sequences of random variables, we abuse notation by using an equals sign rather than an inclusion sign.
Suppose that we have constants r N > 0 (N ∈ N).For a given sequence of processes Y N , write as in 1 -3 can be chosen uniformly in α ∈ A. It will be useful to take the sum and intersection of process sequence classes and specific sequences of processes.To be more precise, for any process sequence classes A N t and B N t , and the sequence of processes F N t , we say that: 1.
) means that for some 0 < C < ∞ there exists for all N large enough martingales G N t and finite-variation processes H N t such that: and

Characterisation of Y N t
In the proof of Theorem 1.4, we will obtain a scaling limit for the tilted empirical measure of the colours on a fast timescale, (Y N N t ) t≥0 .This will rely on various calculations characterising its drift and quadratic variation.To avoid obscuring the proof of Theorem 1.4 with calculations, we perform these calculations here.
In this section, we write (Ω, G, (G t ) t≥0 , P) for the underlying filtered probability space.
Remark 3.1. In the present section, all statements as to processes belonging to various process sequence classes should be interpreted as being uniform over all choices E, F ∈ B(K) (or over all sequences of G 0 -measurable random E N , F N ∈ B(K), in the case of Part 4 of Theorem 3.2).
We recall that .
In this section, we prove the following theorem.
Theorem 3.2. We have the following, uniformly over all choices of E, F ∈ B(K):

There exist martingales K
for E ∈ B(K) and such that for all E, F ∈ B(K). (3.31) ). (3.32) 4. Parts 1-3 remain true if E and F are replaced with a sequence of σ 0 -measurable random sets E N and F N .

Proof of Theorem 3.2
We firstly introduce some definitions. We define We write H = H( r) for the Hessian and calculate and H( r We have the key property We further define We shall firstly establish the following proposition, which characterises P N,E . Proposition 3.3. We have for all E ∈ B(K) that whereby M N,E are martingales which satisfy for all E, F ∈ B(K), (3.38) We write M N t for M N,K t .
We will then establish Part 1 of Theorem 3.2, followed by the following version of Ito's lemma.
Using the boundedness of φ and the fact that there are no simultaneous killing events, we obtain (3.32) from (3.30).
Since in parts 1-3 of Theorem 3.2 the statements of processes belonging to various process sequence classes are uniform over all choices E, F ∈ B(K), Part 4 is immediate.
It remains to prove Proposition 3.3, Part 1 of Theorem 3.2, and Lemma 3.4.

Proof of Proposition 3.3
Since N is fixed throughout this proof, we omit the N superscript for the sake of notation, where this would not create confusion. We recall that τ i n represents the n th killing time of particle (X i , η i ) (τ i 0 := 0), τ n is the n th killing time of any particle (τ 0 := 0), and J N t := (1/N ) sup{n : τ n ≤ t} is the number of killing times up to time t, renormalised by N . We denote We define for E ∈ B(K) the processes and (3.41) We will firstly establish that A E t , B E t and C E t are martingales, so that is a martingale. We therefore have (3.37). We will then establish (3.38) by establishing it for E = F and for E, F disjoint. A E t is a martingale We have that if particle X i dies at time t, then each j ≠ i is selected with probability 1/(N − 1), so that the expected value of φ E (X i t , η i t ) is given by: 1 Therefore summing over τ i n ≤ t, we see that 1 t is a martingale Since Lφ = −λφ, we see that the following is a martingale, , φ is a Lipschitz process. By considering separately the quadratic variation of the continuous motion between jumps and at the jumps, it follows that whereby we define To characterise H N,E,F t we split into the cases that E and F are disjoint, and that respectively. Therefore the expected value of φ E (X i Then using the killing rate to characterise the rate at which killing events happen, we see that for some martingale M N,E t . It is straightforward to then see that for all N sufficiently large (which ). We therefore obtain (3.38) in the case that The expected value of this at time τ i n − is then t (P N,E P N,F ). We have therefore obtained (3.38) with E ∩ F = ∅.
Having established (3.38) both in the case that E = F and the case that E ∩ F = ∅, the case of arbitrary E, F follows by linearity.

Proof of Part 1 of Theorem 3.2
We recall that F , H and R were defined in (3.33), (3.34) and (3.36) as We decompose Then by Ito's lemma we have We can therefore calculate for all E, F ∈ B(K). Combining (3.44) with (3.45) we have for all E, F ∈ B(K) disjoint. We also have that Since Q N t is bounded below away from 0, by bounding the partial derivatives of F we can calculate for all E, F ∈ B(K) disjoint that for all E, F ∈ B(K) disjoint. Combining (3.46) with (3.47) we have Part 1 of Theorem 3.2.

Proof of Lemma 3.4
We take 0 ≤ t 0 ≤ t 1 ≤ t and write We may calculate Thus by Taylor's theorem, (3.48), the fact that almost surely there are no simultaneous killing events, and the fact that Q N t is bounded above and below away from 0, we have Since κ and φ are bounded, it is straightforward to then see that

Proof of Theorem 1.4
With the calculations of Section 3 in hand, we now prove Theorem 1.4. We shall make use of the Wasserstein distance W and the weak atomic metric W a , which are defined in Appendix C. We shall firstly prove the following proposition.
In particular, taking f = 1, we have for any E ∈ B(K) that Heuristically, this says that over an O(1) timescale, the number of particles whose colour belongs to E is given by Y N,E t , and the spatial distribution of these particles is given by π, for any E ∈ B(K). Using Proposition 4.1 and the calculations of Section 3, we will then establish that (Y N N t ) 0≤t≤T converges in distribution to the Wright-Fisher process of rate Θ. Proposition 4.2. We take some deterministic initial profile ν 0 ∈ P(K) and define (ν t ) 0≤t<∞ to be a Wright-Fisher process of rate Θ and initial condition ν 0 := ν 0 . We then consider a sequence of Fleming-Viot multi-colour processes ( X N t , η N t ) 0≤t<∞ . We assume that Y N 0 → ν 0 in W a in probability.
We fix T < ∞ and rescale time by t → N t. We then have the convergence We recall in particular that (ν t ) 0≤t≤T ∈ C([0, T ]; P W (K)) almost surely, by Theorem D.2. We now take a sequence ( t N ) 2≤N <∞ = ((t N 1 , . . ., t N n )) 2≤t≤N converging to t = (t 1 , . . ., t n ) as in the statement of Theorem 1.4. It follows that Recalling the positivity and boundedness of φ from Theorem A.1, we observe that ) N ≥1 must also be a tight sequence of random measures. It therefore follows from We have left only to strengthen the notion of convergence to convergence in the weak atomic metric. After proving Propositions 4.1 and 4.2, we shall establish the following proposition. Proposition 4.3. We recall that Ψ(u) := (1 − u) ∨ 0 is the function used to define the W a metric in Appendix C.2. For all δ > 0 there exists ε > 0 such that Note that the above sum is well-defined as the terms are non-zero only for k, ℓ ∈ supp(χ N 0 ).
We may therefore apply the compact containment condition, Lemma C.5, to conclude that {L(χ N N t N k )} is tight in P(P W a (K)) for all 1 ≤ k ≤ n, so that we have Theorem 1.4. We have left to prove Propositions 4.1, 4.2 and 4.3.

Proof of Proposition 4.1
We fix E ∈ B(K) and f ∈ C b ( D).We write We take the D × K-valued killed strong Markov process ((X t , η t )) 0≤t<τ ∂ defined in Definition 1.7.
It follows from Theorem 1.8 that there exists c t → 0 as t → ∞ such that, for all N < ∞ and initial conditions ( X N 0 , η N 0 ) we have that It follows from Theorem 1.8 that for all t < ∞ and N ≥ 2 there exists C t,N < ∞ such that for any initial condition ( X N 0 , η N 0 ), with C t,N → 0 as N → ∞ for fixed t < ∞.On the other hand we observe that Therefore combining (4.54) with (4.55) we obtain that Proposition 4.1 then follows by applying (3.32).

Proof of Proposition 4.2
Our proof proceeds in the following two steps: 1. We fix ε > 0 and take {k 1 , k 2 , . ..} to be a dense subset of K. Then for all i we can find ) in distribution to a Wright-Fisher diffusion of rate Θ and initial condition (ν 0 (A 0 ), . . ., ν 0 (A n )).

We then use this to prove that
Step 1 We recall that the martingale K N,E t was defined in Theorem 3.2, whilst Λ N,E t and Λ N t were defined in (2.24) to be given by We further define ( )) 0≤t≤T .
We will now verify that {L(( ], for the purpose of checking [1, Condition (A)].In particular we have by (3.32) that for some and hence {L(( Then applying (4.50) and Fubini's theorem to (3.30), we obtain We consider a subsequential limit in distribution of {( Y N N t ) 0≤t≤T }, 0≤t≤T , which by Part 3 of Theorem 3.2 must have continuous paths.Using (4.58) we conclude that (K N,A 0 N t , . . ., K N,An N t ) 0≤t≤T converges in D([0, T ]; R n+1 ) in distribution along this subsequence to 0≤t≤T is a martingale with respect to its natural filtration σ t .We then obtain from (3.31) that for all 0 ≤ i, j ≤ n, is a martingale for all N , so that by (4.57) and (4.58), We have that in probability.Thus each subsequential limit ( Y t ) 0≤t≤T must be a solution of the n+1-type Wright-Fisher diffusion of rate Θ with initial condition (ν 0 (A 0 ), . . ., ν 0 (A n )), which is unique in law.Therefore we have convergence of the whole sequence in D([0, T ]; R n+1 ) in distribution to this Wright-Fisher diffusion.

Proof of Proposition 4.3
It follows from (4.53) that it suffices to verify the following condition.
Condition 4.4. For every δ > 0, there exists ε > 0 such that lim sup Therefore, using Gronwall's inequality, there exists a uniform C < ∞ such that is a supermartingale for all N large enough. Therefore we have for all N large enough that We have assumed that the initial conditions Y N 0 converge in the weak atomic metric, so Lemma C.5 implies that sup We have therefore verified Condition 4.4 and hence established Proposition 4.3. This concludes the proof of Theorem 1.4.
We write L for its infinitesimal generator, which is just half the Dirichlet Laplacian. We write φ ∈ C 0 (D; R >0 ) ∩ C ∞ (D) for the unique principal right eigenfunction of L, of eigenvalue −λ < 0. In general, quasi-stationary distributions correspond to left eigenmeasures of the infinitesimal generator [37, Proposition 4], which in this case corresponds to the normalised right eigenfunction φ. Therefore the unique QSD of (X t ) 0≤t<τ ∂ , denoted as π, is given by As in the soft killing case, the rate of the limiting Wright-Fisher process is given by . (5.65) We again define the tilted empirical measure of the colours by ∈ P(K). (5.66) We prove the following analogue of Theorem 1.4.
Theorem 5.2. We take some deterministic initial profile ν 0 ∈ P(K) and fix a Wright-Fisher process on P(K) of rate Θ and initial condition ν 0 = ν 0 , which we denote as (ν t ) 0≤t<∞ . We consider a sequence of Fleming-Viot multi-colour processes, denoted by ((( (5.67) We further require the following condition, lim sup (5.68) We now rescale time according to t → N t. Then (χ N N t ) t>0 converges to (ν t ) t>0 in finite-dimensional distributions, in the following sense. We fix arbitrary n < ∞ and t = (t 1 , . . ., t n ) ∈ [0, ∞) n such that t 1 ≤ . . .≤ t n . We consider arbitrary sequences ( t N ) 2≤N <∞ := ((t N 1 , . . ., t N n )) 2≤N ≤∞ such that: We then have that (5.69) We observe that the only difference with Theorem 1.4 is the condition (5.68), which is necessitated by the fact that the domain is no longer compact. Indeed when we considered reflected diffusions with soft killing, the domain D was compact, with the principal eigenfunction φ being bounded away from 0. However in the case of hard killing, the domain D is non-compact, with φ vanishing at the boundary. As a consequence of this, we must establish controls on the mass near the boundary. In order to obtain a hydrodynamic limit theorem over a fixed time horizon for the Fleming-Viot All that remains is to deal with the initial time [0, T 0 ], which can be addressed with a crude bound. We observe that, for any T 0 , δ 1 > 0, in order for a given particle to enter B(∂D, δ 1 ), it either has to start within B(∂D, 2δ 1 ), or else travel at least a distance δ 1 in time T 0 (note that killing only occurs at the boundary). The former possibility can be controlled by (5.68), the latter by controlling the distance travelled by Brownian motion in time T 0 . We obtain that for all ε > 0 there exist T 0 > 0 and δ 1 > 0 such that lim inf (5.82) Therefore, for given ε > 0, we choose δ 1 , T 0 > 0 for which we have (5.82). For this same T 0 , ε > 0, we then obtain δ 0 > 0 such that we have (5.81). Taking δ := δ 0 ∧ δ 1 , we obtain (5.73).
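As a concrete instance of the hard-killing setting, one may consider Brownian motion on the interval D = (0, 1), killed on hitting 0 or 1. This one-dimensional example is chosen here only for illustration; it shows explicitly how the principal eigenfunction vanishes at the boundary, which is why control of the mass near ∂D is needed.

```latex
% Illustrative example: D = (0,1), L = \tfrac{1}{2}\,\frac{d^2}{dx^2} with Dirichlet boundary conditions.
\begin{align*}
  \varphi(x) &= \sin(\pi x), &
  L\varphi(x) &= -\tfrac{\pi^2}{2}\,\varphi(x), &
  \lambda &= \tfrac{\pi^2}{2}, \\
  \int_0^1 \sin(\pi x)\,dx &= \tfrac{2}{\pi}, &
  \pi(dx) &= \tfrac{\pi}{2}\,\sin(\pi x)\,dx . &&
\end{align*}
% \varphi > 0 on D but \varphi(0) = \varphi(1) = 0, so \varphi is not bounded away from 0.
```

In this example a particle close to the boundary carries almost no weight under φ, so the tilted empirical measure is sensitive to mass accumulating there.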

Analogue of the calculations of Section 3
We obtain the following analogue of Theorem 3.2.
Theorem 5.4. We fix arbitrary ε > 0, and localise up to the stopping time τ N ε defined in (5.74). None of the following statements should be understood to be uniform in ε, but rather should be understood as statements for arbitrary fixed ε > 0. We have the following, uniformly over all choices of E, F ∈ B(K):

There exist martingales K
for E ∈ B(K), and such that ). (5.85) It is straightforward to obtain the following analogue of Proposition 3.3, by examining the martingale Proposition 5.5. We have for all E ∈ B(K) that (5.86) whereby M N,E are martingales which satisfy for all E, F ∈ B(K) (5.87) Using that Q N t is bounded from below away from 0 for t ≤ τ N ε , uniformly in N , we obtain Part 1 of Theorem 5.4 in precisely the same manner that we obtained Part 1 of Theorem 3.2 in Subsection 3.1.2.
Replacing the boundedness of the jump rate with a bound in terms of the number of jumps, we obtain the following analogue of Lemma 3.4.Lemma 5.6 (Ito's Lemma).We have

(5.88)
We then obtain parts 2-4 of Theorem 5.4 as in the proof of Theorem 3.2.

Proof of Theorem 5.2
We firstly prove the following analogue of Proposition 4.1.
Proposition 5.7.For all E ∈ B(K) and f ∈ C b ( D) we have that

(5.89)
In particular, taking f = 1, we have for any It follows from the proof of [46, Proposition 4.10] that there exist M < ∞ and p ∈ (0, 1), dependent upon neither N nor t, such that the number of jumps of any particle between the times t ∧ τ N ε and (t + 1) ∧ τ N ε is stochastically dominated by the sum of M independent geometric random variables. It follows that {J N (t+1)∧τ N ε − J N t∧τ N ε : t ≥ 0, N ≥ 2} is uniformly bounded in L 2 (P), hence uniformly integrable.
We now calculate We see that It follows from the above that Since ε > 0 is arbitrary, Lemma 5.8 follows from (5.75).
Proof of Proposition 5.10.We follow the same proof strategy as the proof of Proposition 4.3, replacing Theorem 3.2 with Theorem 5.4, applying Lemma 5.8 in the obvious manner, and replacing the supermartingale (4.62) with for some sufficiently large constant C < ∞.
Having established Proposition 5.10, we may then apply Lemma C.5 along with (5.75) to conclude that {L(χ N N t N k )} is tight in P(P W a (K)) for all 1 ≤ k ≤ n, so that we have Theorem 5.2.

A Reflected diffusions with soft killing A.1 Definition
We consider a normally reflected diffusion (X 0 t ) 0≤t<∞ in the domain D corresponding to a solution of the Skorokhod problem. In particular, for any filtered probability space on which is defined the m-dimensional Brownian motion W t and initial condition x ∈ D, there exists by [32, Theorem 3.1] a pathwise unique strong solution of the Skorokhod problem where W s is a Brownian motion and the local time ξ t is a non-decreasing process with ξ 0 = 0. This corresponds to a solution of the submartingale problem introduced by Stroock and Varadhan [41], and is a Feller process [41, Theorem 5.8, Remark 2] (and hence strong Markov). It is then straightforward (using a separate probability space on which is defined an exponential random variable) to construct an enlarged filtered probability space on which (X 0 , W, ξ) is a solution of the Skorokhod problem and on which there is a stopping time τ ∂ corresponding to the ringing time of a Poisson clock with position dependent rate κ(X 0 t ), from which is constructed the killed process (X t ) 0≤t<τ ∂ . This killed process is a solution to where W s is an m-dimensional Brownian motion and the local time ξ t is a non-decreasing process with ξ 0 = 0. Since X 0 t is Feller, the process X t is therefore also Feller (and hence strong Markov). We write L 0 /L = L 0 − κ for the infinitesimal generators of X 0 and X respectively, having the same domains D(L 0 ) = D(L). We further define the carré du champ operator Γ 0 on the algebra A, We recall that φ is normalised so that ⟨π, φ⟩ = 1. We take (x i , η i ) 1≤i≤n ∈ ( D × K) n and calculate φ(x i ).
We apply this to both the numerator and denominator of the right hand side of (2.105) to obtain (1.14).

C Spaces of measures
For a given topological space S we write B(S) for the Borel σ-algebra on S, and write P(S) for the space of probability measures on B(S), equipped with the topology of weak convergence of measures.We write M(S) for the space of all bounded Borel measurable functions on S.

C.1 The Wasserstein metric
For general separable metric spaces (S, d) we let W denote the Wasserstein-1 metric on P(S) generated by the metric d ∧ 1, which metrises P(S) [26, Theorem 6]. We write P W (S) for the metric space (P(S), W). The following therefore follows from the Skorokhod representation theorem. We similarly obtain the following lemma.

C.2 The Weak Atomic Metric
Convergence in our scaling limit is given in terms of the weak atomic metric, introduced by Ethier and Kurtz in [23].We shall define the weak atomic metric on the colour space (K, d) (which we recall is assumed to be a complete, separable metric space).We write P W a (K) for P(K) equipped with the metric W a .
In [23], Ethier and Kurtz defined the weak atomic metric on the space of all finite, positive, Borel measures, whereas we restrict our attention to probability measures on K. We fix Ψ(u) = (1 − u) ∨ 0 and define the weak atomic metric to be In [23] they used the Lévy-Prokhorov metric instead of the W-metric, and let Ψ be an arbitrary continuous, non-decreasing function such that Ψ(0) = 1 and Ψ(1) = 0. We make the above choices for simplicity (note that W is equivalent to the Lévy-Prokhorov metric [26, Theorem 2]). Convergence in the weak atomic metric is equivalent to weak convergence of measures and convergence of the location and sizes of the atoms.
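To illustrate how the weak atomic metric separates measures that are close in W but have different atoms, the following sketch evaluates it for finitely supported measures on [0, 1]. It assumes that the metric has the Ethier-Kurtz form, namely W(µ, ν) plus the supremum over 0 < ε ≤ 1 of the absolute difference of the double integrals of Ψ(d(x, y)/ε) against µ ⊗ µ and ν ⊗ ν; this form is our reading of [23] with W in place of the Lévy-Prokhorov metric. The supremum is approximated from below on a finite grid of ε values, and the function names are hypothetical.

```python
def psi(u):
    """Psi(u) = (1 - u) v 0, as fixed in the text."""
    return max(1.0 - u, 0.0)


def pair_term(mu, eps):
    """Double sum of Psi(|x - y| / eps) against mu x mu, for mu = {point: mass}."""
    return sum(p * q * psi(abs(x - y) / eps)
               for x, p in mu.items() for y, q in mu.items())


def wasserstein1(mu, nu):
    """Wasserstein-1 distance on the line via the CDF formula.

    For points in [0, 1] the metrics d and d ^ 1 coincide, so this equals W.
    """
    pts = sorted(set(mu) | set(nu))
    total, cdf_gap = 0.0, 0.0
    for a, b in zip(pts, pts[1:]):
        cdf_gap += mu.get(a, 0.0) - nu.get(a, 0.0)
        total += abs(cdf_gap) * (b - a)
    return total


def atomic_gap(mu, nu, eps_grid):
    """Grid approximation (from below) of the supremum over eps."""
    return max(abs(pair_term(mu, e) - pair_term(nu, e)) for e in eps_grid)


def weak_atomic(mu, nu, eps_grid):
    """Assumed form of the weak atomic metric: W plus the atomic term."""
    return wasserstein1(mu, nu) + atomic_gap(mu, nu, eps_grid)


if __name__ == "__main__":
    grid = [10.0 ** (-k) for k in range(7)]
    one_atom = {0.5: 1.0}
    split_atom = {0.5: 0.5, 0.501: 0.5}  # same mass, split into two nearby atoms
    print(wasserstein1(one_atom, split_atom))
    print(weak_atomic(one_atom, split_atom, grid))
```

Splitting a single atom into two nearby atoms barely moves the measure in W, but the atomic term remains of order one because the sizes of the atoms do not converge.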

We have both of the following:
(a) W(µ n , µ) → 0 as n → ∞; (b) there exists an ordering of the atoms {α i δ x i } of µ and the atoms {α n i δ x n i } of µ n so that α 1 ≥ α 2 ≥ . . .and lim n→∞ (α n i , x n i ) = (α i , x i ) for all i.
Remark C.4. Note that (2a) is equivalent to µ n → µ weakly by Proposition C.1. Thus measures are close in the weak atomic metric if and only if they are both close in the Wasserstein-1 metric W and have similar atoms. For instance 1  2 Leb [0,1] + .
We note by [23, p.5] that B(P(K)) = B(P Wa (K)), so that probability measures in P(P Wa (K)) are probability measures in P(P(K)) and vice-versa.It will be useful to be able to characterise tightness in both P(P W a (K)) and P(D([0, T ]; P W a (K))).
Ethier and Kurtz established in [23, Lemma 2.9] the following tightness criterion.
(3.108) Note that the above statement is slightly different from the statement of [23, Lemma 2.9]. It is straightforward to see that the two statements are equivalent for families of probability measures; for our purposes this lemma statement will be easier to use.

t)
0≤t<∞ be the N -individual static Moran model (where each individual dies at Poisson rate λ), and define the constant c = and a family of process sequence classes {A N,α t : α ∈ A}, we say that X N,α t = A N,α t uniformly if the constants C α and N α 0 used to define X N,α t = A N,α t

Proposition 4.1.
For all E ∈ B(K) and f ∈ C b ( D) we have that

W
a (µ, ν) := W(µ, ν) 4. Parts 1-3 remain true if E and F are replaced with a sequence of σ 0 -measurable random sets E N and F N . Proof of Theorem 5.4. It is useful (and simplifies our calculations) to note that since φ vanishes on the boundary and killing only occurs on the boundary, we necessarily have that φ(B .90) We note that τ N ε > 0 implies a uniform positive lower bound on P m N 0 (τ ∂ > T ), where we think of the empirical measure m N 0 as the initial condition of a single killed Brownian motion, killed at time τ ∂ . Using this fact, and using Theorem 5.4, Theorem E.3 and Proposition E.4 in place of Theorem 3.2, Theorem 1.8 and Proposition 1.9 respectively, the proof of Proposition 5.7 is identical to the proof of Proposition 4.1 found in Subsection 4.1. The characterisation of Y N t in the setting of soft killing given in Section 3 does not involve dJ N t terms. This is a consequence of the fact that the jumps occur at a (position dependent) Poisson rate. On the other hand, as a consequence of the hard catalyst killing, the characterisation of Y N t given in Subsection 5.3 does involve such terms. Consequently we shall require the following lemma.