On a Class of Discrete Generation Interacting Particle Systems

The asymptotic behavior of a general class of discrete generation interacting particle systems is discussed. We provide L^p-mean error estimates for their empirical measure on path space and present sufficient conditions for uniform convergence of the particle density profiles with respect to the time parameter. Several examples, including mean field particle models, genetic schemes and McKean's Maxwellian gases, are also given. In the context of Feynman-Kac type limiting distributions we also prove central limit theorems and begin a variance comparison of two generic particle approximating models.

This research was supported in part by VisionSmart through a MITACS centre of excellence entitled "Prediction in Interacting Systems".


Introduction
Interacting particle systems play a prominent role in such diverse areas as nonlinear filtering, population genetics, and statistical mechanics. In these fields, one often employs particle systems to approximate complicated models in order to facilitate simulation, computation, or mathematical derivation. Proper usage of these approximations requires knowledge of the precise conditions under which various types of convergence to the underlying model take place. In this paper, we consider discrete time models, propose a particle system approximation algorithm that is most effective for actual computations, and study the fidelity of this and other approximations through rates of convergence. We illustrate our algorithm through application to nonlinear filtering, Feynman-Kac formulae, mean field particle models, and McKean's Maxwellian gas model.
The recent work of Del Moral and Miclo [5] effectively covers the design, application, and analysis of the most common interacting particle system approximations. We propose and analyse an interacting particle algorithm not studied there that is very practical for computer implementation of classical nonlinear problems. Indeed, according to our analysis in section 5 of this paper, our algorithm provides a lower variance estimator of the underlying model than its predecessors in [5]. Our algorithm fits into the general framework given in Crisan, Del Moral, and Lyons [2] but is not realized as one of their branching corrections. Still, the analysis in [2] applies to our algorithm and is continued herein. Our algorithm and analysis were motivated in part by Blount and Kouritzin [1] and Del Moral and Miclo [5], respectively.
We start section 2 with the basic notation and our new algorithm. This is followed in subsections 2.2 and 2.3 by the existing algorithm and the amalgamated algorithm, which includes both algorithms of subsections 2.1 and 2.2. We also explain the significance of our work through the Feynman-Kac distributions in subsection 2.3. Our main results are summarized in subsection 2.4. Section 3 is devoted to the statement, proof, and application of our consistency results. In particular, the precise statement and proof of our path space mean ergodic theorem is in subsection 3.1, our uniform mean ergodic theorem development is contained in subsection 3.2, and our applications are in subsection 3.3. Central limit theorem behaviour is investigated in section 4. Finally, the relative advantage of the new algorithm over its predecessor is established through the variance comparison of section 5.

Notation and New Algorithm
For any measurable space (E, 𝓔) we denote by M(E) the space of all finite signed measures on E, by M_1(E) ⊂ M(E) the subset of all probability measures, and by B_b(E) the space of bounded real measurable functions equipped with the supremum norm ∥·∥. For any µ ∈ M(E), f ∈ B_b(E) and any integral operators K and K' on B_b(E), we let KK' be the composite operator and K(f), µK be, respectively, the function and the measure on E defined by

K(f)(x) = ∫_E K(x, dy) f(y)   and   (µK)(f) = ∫_E µ(dx) K(x, dy) f(y).
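On a finite state space these conventions reduce to matrix-vector algebra; the following minimal sketch (our own illustration, not taken from the paper) makes the operator notation concrete.

```python
import numpy as np

# On a finite state space E = {0, ..., d-1}, a Markov transition K is a
# row-stochastic d x d matrix, a measure mu is a row vector, and a bounded
# function f is a column vector.
d = 3
rng = np.random.default_rng(0)
K = rng.random((d, d)); K /= K.sum(axis=1, keepdims=True)   # K(x, dy)
L = rng.random((d, d)); L /= L.sum(axis=1, keepdims=True)   # a second kernel
mu = np.array([0.2, 0.5, 0.3])                              # mu in M_1(E)
f = np.array([1.0, -2.0, 0.5])                              # f in B_b(E)

Kf = K @ f     # the function K(f)(x) = sum_y K(x, y) f(y)
muK = mu @ K   # the measure (mu K)(dy) = sum_x mu(x) K(x, dy)
KL = K @ L     # the composite operator KL

# Duality mu(K(f)) = (mu K)(f), and mu K is again a probability measure.
assert np.isclose(mu @ (K @ f), (mu @ K) @ f)
assert np.isclose(muK.sum(), 1.0)
```

Matrix multiplication plays the role of both kernel composition and the action of a kernel on functions and measures, which is all the notation above asserts.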
Let {K_{n,η}; n ∈ ℕ, η ∈ M_1(E)} be a collection of Markov transitions on E. Starting from this family, we introduce our main object of study. This is an N-interacting particle system (abbreviated N-IPS) ξ_n = (ξ^1_n, ..., ξ^N_n), n ∈ ℕ, which is a Markov process with state space E^N and transition probability kernels

P(ξ_{n+1} ∈ dy | ξ_n = x) = ∏_{i=1}^N K_{n+1, m(x)}(x^i, dy^i),   m(x) = (1/N) ∑_{i=1}^N δ_{x^i}.   (1)

Here dy = dy^1 × ⋯ × dy^N stands for an infinitesimal neighborhood of the point y = (y^1, ..., y^N) ∈ E^N, and δ_a is the Dirac measure at a ∈ E.
We also introduce the empirical distribution of the N path-particles,

η^N_{[0,n]} = (1/N) ∑_{i=1}^N δ_{(ξ^i_0, ..., ξ^i_n)},   (2)

which is a random measure on E^{n+1}, n ∈ ℕ. When the transition probability kernels K_{n,η} are sufficiently regular we will show that for any n ∈ ℕ the distributions (2) converge weakly, as N → ∞, to the limiting path measure

η_{[0,n]}(d(x_0, ..., x_n)) = η_0(dx_0) K_{1,η_0}(x_0, dx_1) ⋯ K_{n,η_{n-1}}(x_{n-1}, dx_n),   (3)

where the distribution flow {η_n; n ∈ ℕ} is the solution of the measure valued dynamical system

η_n = η_{n-1} K_{n,η_{n-1}},  n ≥ 1.   (4)

Our study concerns the approximation of (3) and (4) via the empirical measures (2) of the N-IPS (1), and was motivated in part by the work of [1]. The advantage of our new approach over the previous algorithm (discussed immediately below) can easily be seen in simulations and is explained in section 5 herein.
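The transition mechanism (1) and the empirical measures (2) can be sketched generically as follows; the kernel sampler interface and the toy mean field dynamics are our own illustrative assumptions, not the paper's.

```python
import numpy as np

def nips_paths(N, n_steps, sample_eta0, sample_K, rng):
    """Simulate the N-interacting particle system of (1): given the current
    configuration xi = (xi^1, ..., xi^N), each particle moves independently
    according to K_{n, m(xi)}(xi^i, .), where m(xi) is the empirical measure
    of the configuration.

    `sample_K(n, cloud, x, rng)` draws one point from K_{n, m(cloud)}(x, .);
    this signature is our own convention.  Returns the array of particle
    paths, of shape (N, n_steps + 1)."""
    paths = np.empty((N, n_steps + 1))
    paths[:, 0] = [sample_eta0(rng) for _ in range(N)]
    for n in range(1, n_steps + 1):
        cloud = paths[:, n - 1].copy()     # the cloud representing m(xi_{n-1})
        for i in range(N):
            paths[i, n] = sample_K(n, cloud, cloud[i], rng)
    return paths

# Toy example: each particle is pulled toward the current empirical mean
# (a mean field interaction chosen purely for illustration).
rng = np.random.default_rng(1)
sample_eta0 = lambda rng: rng.normal()
sample_K = lambda n, cloud, x, rng: 0.5 * x + 0.5 * cloud.mean() + 0.1 * rng.normal()
paths = nips_paths(N=200, n_steps=10, sample_eta0=sample_eta0,
                   sample_K=sample_K, rng=rng)
# eta^N_n(f), for f(x) = x, is the column mean of `paths`; the empirical
# measure (2) on path space is carried by the rows (the particle paths).
print(paths.shape)  # (200, 11)
```

The rows of `paths` are exactly the atoms of the path space measure (2), so any path functional can be averaged over them.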

Existing Algorithm
The measure-valued process (4) and the corresponding N-IPS approximating model (1) can be viewed as a natural extension of discrete time models introduced in a previous work by one of the authors. More precisely, the nonlinear measure valued process introduced in [3] and further studied in [5] is defined as in (4) by replacing the one step transformation η → ηK_{n,η} by an abstract transformation η → Φ_n(η), so that (4) takes the form

η_n = Φ_n(η_{n-1}),  n ≥ 1.   (5)

As noticed in [3], pp. 475-476, under suitable regularity of the mappings Φ_n the desired distribution flow {η_n; n ∈ ℕ} can alternatively be approximated by an N-IPS defined as in (1) by replacing each transition K_{n+1,m(x)}(x^i, dy^i) by Φ_{n+1}(m(x))(dy^i), that is

P(ξ̃_{n+1} ∈ dy | ξ̃_n = x) = ∏_{i=1}^N Φ_{n+1}(m(x))(dy^i).   (6)

It is interesting that in this strategy of previous works the empirical measure on path space (2) converges weakly, as N → ∞, to a distinct limiting measure on E^{n+1}, namely the product measure

η_0 ⊗ η_1 ⊗ ⋯ ⊗ η_n.   (7)

More information on the latter N-IPS model (6) can be found in [5] and references therein.

Amalgamated Algorithm
The mathematical models (5) and (6) can be embedded respectively in the more general and abstract ones (4) and (1), simply by replacing K_{n,η} with the kernel K̃_{n,η}(x, dy) = Φ_n(η)(dy), which does not depend on the state variable x. Then ηK̃_{n,η} = Φ_n(η), and the same particle mechanism can be used for both strategies: with K_{n,η} standing for either kernel, the N-IPS transitions take the form

P(ξ_{n+1} ∈ dy | ξ_n = x) = ∏_{i=1}^N K_{n+1, m(x)}(x^i, dy^i),   (8)

and the corresponding empirical measures are

η^N_{[0,n]} = (1/N) ∑_{i=1}^N δ_{(ξ^i_0, ..., ξ^i_n)}   and   η^N_n = (1/N) ∑_{i=1}^N δ_{ξ^i_n}.   (9)

In both cases we also assume that the initial configuration ξ_0 = (ξ^1_0, ..., ξ^N_0) consists of N independent variables with common law η_0, and we write F = {F_n; n ∈ ℕ} for the canonical filtration associated with the discrete-time Markov process ξ. The aim of this paper is to study the asymptotic behavior of the empirical measures (9) on path space for a general class of N-IPS defined by the transitions (8). We also prove central limit theorems for Feynman-Kac type limiting systems (4), and we begin a comparison of the fluctuation variances associated with the two N-IPS models (6) and (1).
Our algorithm (8), (9) fits into the general framework given in [2]. However, our method was not highlighted as one of their branching selections, the advantages of sub-algorithm (1) over sub-algorithm (6) were not noted, and our L^p-rates of convergence, uniform convergence, and central limit theorem results were not studied there.
To illustrate the significance of our study, we consider the class of Feynman-Kac distributions defined for any f ∈ B_b(E) by

η_n(f) = γ_n(f)/γ_n(1),  with  γ_n(f) = E[ f(X_n) ∏_{0≤p<n} g_p(X_p) ],   (10)

where {X_n; n ∈ ℕ} is a given E-valued time-inhomogeneous Markov chain with one step transition probability kernels {P_{n+1}; n ∈ ℕ} and initial distribution η_0, and {g_n; n ∈ ℕ} is a given sequence of bounded measurable functions on E with g_n ≥ 1. The above Feynman-Kac models naturally arise in biology (modeling genetic and evolutionary processes), physics (distributions of killed Markov particles, spatially homogeneous Boltzmann-type equations, Moran particle systems) and nonlinear filtering (as flows of conditional distributions). The reader who wishes to know more about specific applications is referred to [5] and references therein. Using the Markov property in (10), it is not difficult to check that the distribution flow {η_n; n ∈ ℕ} satisfies (5) with

Φ_{n+1}(η) = Ψ_n(η)P_{n+1},  where  Ψ_n(η)(f) = η(g_n f)/η(g_n).   (11)

Since Ψ_n(η) can also be written as Ψ_n(η) = ηS_{n,η} with the selection transition

S_{n,η}(x, dy) = (1/η(g_n)) δ_x(dy) + (1 − 1/η(g_n)) (g_n(y) − 1) η(dy)/η(g_n − 1),   (12)

one concludes that the desired flow {η_n; n ∈ ℕ} satisfies (4) for both choices

K_{n+1,η} = Φ_{n+1}(η)   and   K_{n+1,η} = S_{n,η}P_{n+1}.   (13)

The trilogy of law of large numbers, central limit theorem and large deviation principles for genetic type N-IPS models corresponding to the first choice K_{n+1,η} = Φ_{n+1}(η) is discussed in [5].
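On a finite state space the flow (10)-(11) can be computed exactly, which gives a way to sanity-check the one step recursion against the path-sum definition; the sketch below is our own illustration (the chain, the potentials with g ≥ 1, and the helper names are all assumed, not from the paper).

```python
import itertools
import numpy as np

# Feynman-Kac flow on a finite state space, following (10)-(11).
# P[n] is the transition matrix used at step n; g[n] >= 1 are the potentials.
d, n_max = 3, 4
rng = np.random.default_rng(2)
P = [None] + [rng.random((d, d)) for _ in range(n_max)]
for M in P[1:]:
    M /= M.sum(axis=1, keepdims=True)
g = [1.0 + rng.random(d) for _ in range(n_max)]   # g_0, ..., g_{n_max-1}
eta0 = np.full(d, 1.0 / d)

def fk_flow(eta0, P, g, n):
    """Recursion (5)/(11): eta <- Psi_p(eta) P_{p+1}, with
    Psi_p(eta) = (eta * g_p) / eta(g_p)."""
    eta = eta0.copy()
    for p in range(n):
        eta = (eta * g[p]) / (eta @ g[p])   # selection Psi_p
        eta = eta @ P[p + 1]                # mutation P_{p+1}
    return eta

def fk_direct(eta0, P, g, n):
    """Brute force from definition (10): sum over all length-(n+1) paths."""
    num, den = np.zeros(d), 0.0
    for path in itertools.product(range(d), repeat=n + 1):
        w = eta0[path[0]]
        for p in range(n):
            w *= g[p][path[p]] * P[p + 1][path[p], path[p + 1]]
        num[path[-1]] += w
        den += w
    return num / den

assert np.allclose(fk_flow(eta0, P, g, n_max), fk_direct(eta0, P, g, n_max))
```

The agreement of the two routines is exactly the statement that the flow (10) satisfies the recursion (5) with the mappings (11).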
In section 4, we propose a unified framework within which we can study central limit theorems for the N-IPS approximating model (1) associated to any transition kernels K_{n,η}(x, dy) such that ηK_{n,η} = Φ_n(η), which includes both cases in (13).

Statement of Main Results
Our first central result holds for both choices of transition kernel K_{n,η} described above.
Theorem 1 Under some regularity conditions on the transition kernels K_{n,η}, for any p ≥ 1 and n ≥ 0 there exists some finite constant c_p(n) such that for any f ∈ B_b(E^{n+1}) with ∥f∥ ≤ 1 and any N ≥ 1,

E( |η^N_{[0,n]}(f) − η_{[0,n]}(f)|^p )^{1/p} ≤ c_p(n)/√N.   (14)

In addition, if the evolution semigroup associated to the limiting dynamical system (4) is sufficiently regular, then for any p ≥ 1 there exists some finite constant c_p < ∞ such that

sup_{n≥0} sup_{∥f∥≤1} E( |η^N_n(f) − η_n(f)|^p )^{1/p} ≤ c_p/√N.   (15)

The precise regularity conditions and the proof of Theorem 1 are given in section 3: in subsection 3.1 we prove (14), and the uniform estimate (15) is discussed in subsection 3.2. In subsection 3.3 we present several examples which can be treated within our framework, including Feynman-Kac type measure valued processes, mean field particle models and McKean's Maxwellian gas models.
We now recall the definition of η^N_n from (9) and give a central limit theorem for the general algorithm.

Theorem 2 The sequence of random fields W^N_n(f) = √N (η^N_n(f) − η_n(f)), f ∈ B_b(E), converges in law, as N → ∞, to a centered Gaussian field {W_n(f); f ∈ B_b(E)}.
The precise statement and the proof of Theorem 2 are given in subsection 4.2. In section 5 we end the paper with a comparison of the fluctuation variances for the two N-IPS approximating models corresponding to (1) and (6).

General discrete time models
In this section we investigate the weak law of large numbers for the N-IPS (1) associated to an abstract measure-valued process of the form (4). We always assume that the transitions K_{n,η} satisfy a Lipschitz-type regularity condition in the measure variable η, referred to below as condition (K). The regularity condition (K) is validated for a number of examples in subsection 3.3, and it is strongly related to the conditions used in Theorem 2, p. 451, of [3] and in Theorem 3.2, p. 305, of [2]. In fact the N-IPS model (1) can be regarded as an example of the branching particle models discussed in [2], and the N-IPS model (6) coincides with the one presented in [3]. In these two works L²-estimates for the particle density profiles η^N_n are discussed, but no satisfactory analysis is done on the asymptotic behavior of the empirical measure on path space defined in (2). In subsection 3.1 we prove an L^p-mean error estimate for η^N_{[0,n]}. In subsection 3.2 we present a sufficient condition on the evolution semigroup associated to the limiting measure valued process (4) under which the particle density profiles η^N_n converge to the desired distribution η_n uniformly with respect to the time parameter. Typical examples that fit into our framework are discussed in subsection 3.3.

Mean Ergodic Theorems
Proof: We first prove that for any n ∈ ℕ a bound of the form (18) holds for some finite constant c_p(n) which depends only on the parameters n and p. Since ξ_0 consists of N independent random variables with common law η_0, a simple application of the Marcinkiewicz-Zygmund inequality (see for instance [10], p. 498) yields the case n = 0 with some universal constant c_p(0). Moreover, the definition of ξ_n and another application of the Marcinkiewicz-Zygmund inequality give almost sure estimates with some universal constant c_p whose value depends only on the parameter p. Suppose the estimates (18) hold with (n − 1) in place of n. Then the regularity condition (K) and the induction hypothesis give a corresponding bound with some finite constant c_p(n − 1). Using this decomposition and the previous estimates, one easily concludes the desired L^p-bounds at rank n, and the inductive proof of (18) is complete. Let us show (17) by induction on the time parameter n. The first case n = 0 is clear. Assume, as induction hypothesis, that the estimate (17) has been proved at rank (n − 1). For any f ∈ B_b(E^{n+1}) we use a decomposition in which K̄_{n,η_{n−1}} stands for the corresponding Markov integral operator. Since, conditionally on the σ-algebra F_{n−1}, the summands form a sequence of independent, centered and bounded random variables, the Marcinkiewicz-Zygmund inequality applies with some universal constant c_p. Arguing as in (19), we use the regularity assumption (K) and the estimates (18) to obtain a bound with some finite constant c_p(n − 1). Finally, the induction hypothesis at rank (n − 1) yields the claim with another finite constant c_p(n − 1), and the rest of the proof is now straightforward.

A Uniform Convergence Theorem
Suppose that the transitions K_{n,η} satisfy (K). Then the one step mappings Φ_n(η) = ηK_{n,η} have a Lipschitz type regularity, referred to below as condition (Φ). Let {Φ_{q,n}; 0 ≤ q ≤ n} be the nonlinear semigroup associated to the nonlinear measure-valued process (4), defined by the composite mappings

Φ_{q,n} = Φ_n ∘ Φ_{n−1} ∘ ⋯ ∘ Φ_{q+1},   Φ_{n,n} = Id.

If the one step mappings Φ_n satisfy condition (Φ), then a corresponding Lipschitz estimate follows for Φ_{q,n}, for any 0 ≤ q ≤ n and every f ∈ B_b(E).
Theorem 3.2 Suppose the semigroup {Φ_{q,n}; 0 ≤ q ≤ n} associated to the measure valued process (4) satisfies (21) and (22). Then, for any f ∈ B_b(E), p ≥ 1 and n ≥ 0, a uniform bound holds for some universal constant b_p, and therefore the uniform estimate (15) follows.

Remark: Equations (21), (23) and (26) collectively impose a contraction property on the semigroup Φ_{q,n} that yields our uniform estimates.

Proof: We first use the decomposition, with the convention Φ_0(η^N_{−1}) = η_0, to obtain a telescoping identity for any bounded test function f. An application of the Marcinkiewicz-Zygmund inequality then gives, for any p ≥ 1 and q ≥ 0, an almost sure estimate with some universal constant b_p and with the convention F_{−1} = {∅, Ω}. Under our assumptions the desired bound follows from the above, and the rest of the proof is now straightforward.

Applications
In this section we present a selection of examples for which condition (K) and/or the assumptions of Theorem 3.2 are met. Let us start with a very simple situation in which the transitions K_{n,η} do not depend on the distribution η, that is K_{n,η} = K_n and Φ_n(η) = ηK_n. In this case one can check that (K) trivially holds. Note also that in this situation the E^N-valued chain (1) consists of N independent Markov chains with the same transitions K_n, and clearly the limiting measure (3) coincides with the distribution of such a chain from the origin up to time n, namely

η_0(dx_0) K_1(x_0, dx_1) ⋯ K_n(x_{n−1}, dx_n).

In contrast, the E^N-valued chain (6) is a genuine N-IPS, its transitions being given by sampling from Φ_{n+1}(m(ξ_n)) = m(ξ_n)K_{n+1}. It is also worth observing that the limiting measure (7) is now the tensor product measure η_0 ⊗ η_1 ⊗ ⋯ ⊗ η_n with η_n = η_0K_1 ⋯ K_n. In this quite trivial example the measure-valued process (4) and the corresponding semigroup are linear, and the transformations Φ_{q,n} are simply given by Φ_{q,n}(η) = ηK_{q+1}K_{q+2} ⋯ K_n.
We also find that (21) is satisfied with c_{p,n}(η) = 1. A convenient tool for the analysis of the long term behavior of the N-IPS approximating models is the Dobrushin ergodic coefficient α(K) ∈ [0, 1] of a Markov transition K on a measurable space (E, 𝓔), defined by

α(K) = 1 − sup_{x,x'∈E} ∥K(x, ·) − K(x', ·)∥_tv,   (27)

where ∥·∥_tv stands for the total variation norm (see Dobrushin [6], [7]). We also recall that α(K) is a natural measure of contraction of the total variation distance of probability measures induced by K, in the sense that

∥µ_1K − µ_2K∥_tv ≤ (1 − α(K)) ∥µ_1 − µ_2∥_tv,   (28)

where the inequality holds for all µ_1, µ_2 ∈ M_1(E). From (27) and (28) one obtains corresponding estimates for composite transitions. In time homogeneous settings (that is, K_n = K for every n) we have 1 − α(K^{n−q}) ≤ (1 − α(K))^{n−q}, and therefore the semigroup Φ_{q,n}(η) = ηK^{n−q} is exponentially contracting as soon as α(K) > 0.
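On a finite state space the Dobrushin coefficient and the contraction estimate (28) can be verified numerically; the sketch below is our own illustration of the definitions recalled above.

```python
import numpy as np

def dobrushin_alpha(K):
    """Dobrushin ergodic coefficient of a finite Markov transition K:
    alpha(K) = 1 - sup_{x, x'} || K(x, .) - K(x', .) ||_tv,
    with ||mu||_tv = (1/2) * sum_y |mu(y)| in the finite setting."""
    d = K.shape[0]
    beta = max(0.5 * np.abs(K[x] - K[y]).sum()
               for x in range(d) for y in range(d))
    return 1.0 - beta

rng = np.random.default_rng(3)
K = rng.random((4, 4)) + 0.2          # strictly positive => alpha(K) > 0
K /= K.sum(axis=1, keepdims=True)
alpha = dobrushin_alpha(K)

# Contraction (28): ||mu1 K - mu2 K||_tv <= (1 - alpha(K)) ||mu1 - mu2||_tv.
mu1 = np.array([1.0, 0.0, 0.0, 0.0])
mu2 = np.array([0.0, 0.0, 0.0, 1.0])
tv = lambda m: 0.5 * np.abs(m).sum()
assert tv(mu1 @ K - mu2 @ K) <= (1 - alpha) * tv(mu1 - mu2) + 1e-12
# Submultiplicativity: iterating K contracts at a geometric rate.
assert 1 - dobrushin_alpha(np.linalg.matrix_power(K, 3)) <= (1 - alpha) ** 3 + 1e-12
```

The last assertion is the mechanism behind the time homogeneous remark above: the coefficient 1 − α is submultiplicative under composition, so α(K) > 0 forces exponential contraction of the semigroup.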

Feynman-Kac's distributions
In this section we discuss condition (K) and the long time behavior of the N-IPS approximating models (1) and (6) of the Feynman-Kac distributions defined in (10). We examine the two situations with Φ_n(η) and K_{n,η} defined respectively in (11) and in (12). In the first situation we use a decomposition of Φ_n(µ) − Φ_n(η) and conclude that (K) is satisfied. In the second situation we recall that K_{n+1,η} = S_{n,η}P_{n+1}; using the corresponding decomposition of S_{n,µ} − S_{n,η} together with (11) and the fact that µ(g_n) ≥ 1, one easily deduces that condition (K) is again met. To see that condition (21) is satisfied for the Feynman-Kac type semigroup associated to the distribution flow (10), we use the following technical lemma, in which the functions {g_{p,n}; 0 ≤ p ≤ n} and the Markov transitions {P_{p,n}; 0 ≤ p ≤ n} satisfy the backward formulae

g_{p,n} = g_p P_{p+1}(g_{p+1,n}),   P_{p,n}(f) = P_{p+1}(g_{p+1,n} P_{p+1,n}(f)) / P_{p+1}(g_{p+1,n}),   (32)

with the conventions g_{n,n} = 1 and P_{n,n} = Id.
From the backward recursions (32) it follows that the ratios g_{p,n}(x)/g_{p,n}(y) are uniformly controlled, and from this lemma one can also check that condition (21) holds with c_{p,n}(η) = a²_{p,n}, where

a_{p,n} := sup_{x,y∈E} g_{p,n}(x)/g_{p,n}(y).

Using Lemma 2.3, one can also prove corresponding estimates for the normalized transitions, and using (27) and (28) we obtain contraction bounds for them. To estimate the Dobrushin coefficient of S^{(n)}_q we will use the following mixing type condition. It is not hard to construct {P_n} satisfying this mixing condition when E ⊂ ℝ^d is compact.

(M)

For any n ≥ 1 and x, x' ∈ E we have P_n(x, ·) ∼ P_n(x', ·). In addition, there exists some strictly positive constant ε such that for any x, x', y ∈ E,

dP_n(x, ·)/dP_n(x', ·)(y) ≥ ε.

Under (M) we first notice, for any x, x' ∈ E and A ∈ 𝓔, that P_n(x, A) ≥ ε P_n(x', A). Using this again, we also obtain corresponding lower bounds for the normalized transitions. In summary, we have proved in (34)-(37) that if (M) is satisfied and the oscillations of the functions g_n are uniformly bounded, then conditions (23) and (24) of Theorem 3.2 hold, where we have used (29), (30), and (36) to bound the relevant Dobrushin coefficients. Therefore, for the two N-IPS approximating models corresponding to the two situations (31), we have for any p ≥ 1 and f ∈ B_b(E) with ∥f∥ ≤ 1 a uniform estimate of the form (15) for some universal constant b_p.

Mean field particle models
Let us suppose that E = ℝ^d for some d ≥ 1 and that the transition probability kernel K_{n,η} has a Gaussian density with an η-dependent drift, where dy stands for the Lebesgue measure on ℝ^d, y = (y^1, ..., y^d) and x ∈ ℝ^d.
This example is instructive because the limiting measure (3) can be regarded as the probability distribution of a path sequence (X_0, ..., X_n) of random variables defined by the recursion

X_n = a_n(X_{n−1}, η_{n−1}) + W_n,  n ≥ 1,

where X_0 is a random variable with distribution η_0, independent of the sequence {W_n; n ≥ 1} of Gaussian d-vectors, and for each n ≥ 0, η_n is the law of X_n.
Note also that for any n and η the transformation Φ_n(η) = ηK_{n,η} can be computed by a direct calculation, and using (38) we obtain a representation in which W_n stands for a Gaussian d-vector with distribution γ_n.
Suppose next that the drift function a_n = (a^1_n, ..., a^d_n) takes a finite mean field form indexed by a finite set J of cardinality |J|, with coefficients α^i_{n,j}. In this situation one can check directly that condition (K) holds.
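For a linear mean field drift the limiting flow η_n stays Gaussian, so the N-particle approximation of the recursion above can be checked against the exact mean and variance dynamics; all numerical values below are illustrative choices of our own, not taken from the paper.

```python
import numpy as np

# Scalar mean field recursion X_n = alpha*X_{n-1} + beta*eta_{n-1}(x) + sigma*W_n,
# approximated by replacing the law eta_{n-1} with the empirical measure of
# the particle cloud (the N-IPS of (1) for this kernel).
alpha, beta, sigma = 0.5, 0.5, 0.2
N, n_steps = 20000, 10
rng = np.random.default_rng(4)

X = rng.normal(size=N)          # X_0 ~ eta_0 = N(0, 1), i.i.d. particles
m_exact, v_exact = 0.0, 1.0     # mean and variance of the exact flow eta_n
for _ in range(n_steps):
    X = alpha * X + beta * X.mean() + sigma * rng.normal(size=N)
    m_exact = (alpha + beta) * m_exact        # E[X_n]
    # In the limit the mean field term is deterministic, so only alpha*X
    # contributes to the variance recursion.
    v_exact = alpha**2 * v_exact + sigma**2   # Var(X_n)

assert abs(X.mean() - m_exact) < 0.05
assert abs(X.var() - v_exact) < 0.05
```

The two assertions witness the propagation-of-chaos effect this section quantifies: for large N the particle density profile tracks the exact Gaussian flow.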

McKean's Maxwellian gases
We examine the discrete time version of McKean's 2-velocity model for Maxwellian gases presented in section 3 of [9]. In this situation the state space is given by E = {−1, +1} and, for any η ∈ M_1({−1, +1}), the transition probability kernel K_{n,η} can be expressed in terms of the function h(x) = 1_{{1}}(x) (= 1 if x = 1 and 0 otherwise). One concludes that (K) holds. Note also that (39) can be rewritten as

K_{n,η}(x, dy) = ∫_E η(dx') K_n((x, x'), dy),

where K_n((x, x'), dy) is a transition probability kernel from E² into E. Therefore, the corresponding limiting measure-valued process (4) can be formulated in terms of a discrete-time Boltzmann equation.

Feynman-Kac formulas

Description of Models, Statement of Some Results
Throughout this section {η_n; n ∈ ℕ} denotes the flow of Feynman-Kac distributions defined in (10). In this framework the one step mappings {Φ_n; n ≥ 1} are defined by (11). The Markov transitions {K_{n,η}; η ∈ M_1(E), n ≥ 1} denote any transition probability kernels satisfying the regularity condition (K) presented in section 3 and such that for any n and η, ηK_{n,η} = Φ_n(η). A more detailed and precise description of the two N-IPS models corresponding to (1) and (6) can be found in section 5. The precise description of the variance in the forthcoming central limit theorems is related to the dynamical structure of the flows of the unnormalized and normalized measures {γ_n; n ∈ ℕ} and {η_n; n ∈ ℕ}. Next, we describe the corresponding evolution semigroups (more details can be found in [5]).
The Markov property in (10) also gives γ_n = γ_p Q_{p,n} for any 0 ≤ p ≤ n, where we set

Q_{p,n} = Q_{p+1} Q_{p+2} ⋯ Q_n,   Q_{n+1}(x, dy) = g_n(x) P_{n+1}(x, dy),

with the convention Q_{n,n} = Id, so that γ_n(f) = γ_p(Q_{p,n}(f)). It is also convenient to introduce the "normalized" semigroup {Q̄_{p,n}; 0 ≤ p ≤ n} given for any 0 ≤ p ≤ n and f ∈ B_b(E) by

Q̄_{p,n}(f) = Q_{p,n}(f) / η_p(Q_{p,n}(1)),

so that η_p Q̄_{p,n} = η_n.
On the basis of the definition of the distributions {η_n, γ_n; n ∈ ℕ} we have

γ_n(1) = ∏_{0≤p<n} η_p(g_p).

Therefore, for any f ∈ B_b(E) and n ∈ ℕ (with the convention that an empty product equals 1),

γ_n(f) = η_n(f) ∏_{0≤p<n} η_p(g_p).

Taking into consideration the above formula, the natural particle approximating measures η^N_n and γ^N_n for the "normalized" and "unnormalized" distributions η_n and γ_n are simply given by η^N_n and

γ^N_n(f) = η^N_n(f) ∏_{0≤p<n} η^N_p(g_p).

One of the main tools for the analysis of the asymptotic behavior is the ℝ^d-valued F-martingale defined in (40), with the convention K_{0,m(ξ_{−1})} = Φ_0(η^N_{−1}) = η_0, and where f : E → ℝ^d is a bounded measurable function; here we write µK(f) = (µK(f_1), ..., µK(f_d)), and K[(f_1 − Kf_1)(f_2 − Kf_2)] for the corresponding conditional covariance function on E. In accordance with the definition of the unnormalized measures γ^N_n and the semigroup Q_{p,n}, we have the decomposition (41). Comparing (40), (41), and (42), one finds that the ℝ^d-valued process (43) is an F-martingale (transform) whose bracket, for any 1 ≤ u, v ≤ d and 0 ≤ p ≤ n, is given by (44). Many asymptotic convergence results, including L^p-rates, exponential estimates and fluctuations, can be profitably studied in terms of the above F-martingale. We do not give all the details since the results are essentially the same as those exposed in [5] for the case K_{n,η} = Φ_n(η). To give a flavor, we notice that (44) clearly implies, for any f ∈ B_b(E) and n ∈ ℕ, a bound with some finite constant c(n) which depends only on the time parameter n. Then, using the first part of (45), the Cauchy-Schwarz inequality and the second part of (45), one gets a further estimate with some finite constant c(n) whose value depends only on the time parameter n. As a direct consequence one concludes the corresponding convergence, the limiting expressions being written in terms of Q̄_{p,n}(f).
Therefore, the variance of the limiting random field {W_n(f); f ∈ B_b(E)} can also be described, for any test functions f and f', through the semigroup Q̄_{p,n} as follows.
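The normalized and unnormalized particle measures above can be sketched concretely for the first kernel choice K_{n,η} = Φ_n(η) on a finite state space; the data layout and the resampling routine are our own illustrative choices.

```python
import numpy as np

# Particle approximation of the Feynman-Kac flow for the first kernel choice:
# select by sampling from the weighted empirical measure, then mutate by P.
# The unnormalized estimate gamma^N_n(1) = prod_{p<n} eta^N_p(g_p) is then
# compared with the exact gamma_n(1) = prod_{p<n} eta_p(g_p).
d, N, n_steps = 2, 50000, 3
rng = np.random.default_rng(7)
P = np.array([[0.7, 0.3], [0.4, 0.6]])     # time homogeneous mutation kernel
g = np.array([2.0, 1.2])                   # potential function, g >= 1

xi = rng.integers(0, d, size=N)            # xi_0 i.i.d. ~ eta_0 = uniform
gamma_N = 1.0                              # running product of eta^N_p(g)
eta, gamma_exact = np.full(d, 0.5), 1.0    # exact flow, for comparison
for _ in range(n_steps):
    w = g[xi]
    gamma_N *= w.mean()                    # eta^N_p(g)
    gamma_exact *= eta @ g                 # eta_p(g)
    # selection: sample N points from sum_j g(xi^j) delta_{xi^j} / sum_k g(xi^k)
    xi = xi[rng.choice(N, size=N, p=w / w.sum())]
    # mutation: xi^i -> P(xi^i, .)  (vectorized for the two-state chain)
    xi = (rng.random(N) < P[xi, 1]).astype(np.int64)
    eta = ((eta * g) / (eta @ g)) @ P      # exact one step map (11)

assert abs(gamma_N / gamma_exact - 1.0) < 0.05
assert np.abs(np.bincount(xi, minlength=d) / N - eta).max() < 0.05
```

The two final assertions check, within Monte Carlo error, that both the unnormalized estimate γ^N_n(1) and the particle density profile η^N_n track their limiting counterparts.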

Variance comparison
To ease the notation, we define m(x) = (1/N) ∑_{i=1}^N δ_{x^i} for x = (x^1, ..., x^N) ∈ E^N. We now compare the asymptotic behavior of the N-IPS approximating models associated to the choices 1) K_{n,η}(x, ·) = Φ_n(η) and 2) the kernel K_{n,η} built from the selection transition (12), where Φ_n is defined in (11). From their definitions it is clear that the one step transition of the corresponding N-IPS decomposes into two mechanisms, a selection stage followed by a mutation stage; during the mutation stage each particle evolves randomly according to the transition probability kernel P_{n+1}, that is, for each 1 ≤ i ≤ N, ξ^i_{n+1} is a random variable with law P_{n+1}(X^i_n, ·), where X^i_n denotes the particle produced by the selection stage.

• In the first case the selection stage consists in sampling, for each 1 ≤ i ≤ N, a random variable X^i_n with distribution Ψ_n(m(ξ_n)).
• In the second case the selection stage consists in sampling, for each 1 ≤ i ≤ N, a random variable X̃^i_n with distribution S_{n,m(ξ̃_n)}(ξ̃^i_n, ·), given in (48). In words, with probability 1/m(ξ̃_n)(g_n) the i-th particle does not move, so that X̃^i_n = ξ̃^i_n, and with probability 1 − 1/m(ξ̃_n)(g_n), X̃^i_n is sampled according to the discrete distribution (49). We remark that in both cases (13) the selected particles have the same conditional mean Ψ_n(m(ξ_n))(f). Note also that in the first situation, and given ξ_n, the N-tuple of selected particles is conditionally independent and identically distributed; in the second situation (and given ξ̃_n) the selected particles are still conditionally independent. In both cases (50) and (51) we have a "local" L²-estimate of order 1/√N, and one therefore concludes that the resulting N-IPS models also fit into the model framework considered in [2].
For comparison, we notice that the selection distribution (48) can be reformulated in mixture form. Hence, in the first case, a binomial number N̂_n (≤ N) of particles, with parameters (N, 1/m(ξ_n)(g_n)), are sampled randomly according to the empirical measure m(ξ_n), and the remaining N − N̂_n particles are chosen randomly with distribution (49). In contrast, in the second case, a binomial number N̂_n of particles with the same parameters remains at the same locations, and again N − N̂_n particles are chosen randomly with distribution (49). In the second situation, the closer the function g_n is to the constant function 1, the more likely it is that particles do not move during selection and remain at the same sites. Conversely, in the first case, particles are sampled randomly and uniformly from the current population when g_n ≡ 1.
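The two selection stages can be sketched side by side; the code below is our own implementation of the description above (with potentials g ≥ 1), not the authors' code.

```python
import numpy as np

def select_case1(X, g, rng):
    """Case 1 (K_{n,eta} = Phi_n(eta)): sample N i.i.d. particles from the
    weighted empirical measure sum_j g(xi^j) delta_{xi^j} / sum_k g(xi^k)."""
    w = g(X)
    return X[rng.choice(len(X), size=len(X), p=w / w.sum())]

def select_case2(X, g, rng):
    """Case 2 (selection transition (12)): with probability 1/m(g) a particle
    stays put; otherwise it jumps to a site drawn with weights proportional
    to g - 1, as in (48)-(49)."""
    gX = g(X)
    stay = rng.random(len(X)) < 1.0 / gX.mean()
    w = gX - 1.0
    if w.sum() > 0:
        jumps = X[rng.choice(len(X), size=len(X), p=w / w.sum())]
    else:                      # g == 1: nobody has any reason to move
        jumps = X.copy()
    return np.where(stay, X, jumps)

# Both stages are unbiased for Psi_n(m(xi))(f) = m(fg)/m(g); when g == 1,
# case 2 leaves the configuration unchanged while case 1 still resamples.
rng = np.random.default_rng(6)
X = rng.normal(size=50000)
g = lambda x: 1.0 + x**2
target = (X * g(X)).mean() / g(X).mean()
assert abs(select_case1(X.copy(), g, rng).mean() - target) < 0.05
assert abs(select_case2(X.copy(), g, rng).mean() - target) < 0.05
one = lambda x: np.ones_like(x)
assert np.array_equal(select_case2(X.copy(), one, rng), X)
```

The last assertion is the qualitative point of the comparison: for flat potentials the second scheme introduces no selection noise at all.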
To explain this excess of randomness in mathematical terms, we compare the fluctuation variances, and we finish this subsection with an example where this excess randomness induces a variance that degenerates with respect to the time parameter. In essence our comparison is based on the formula

ηK[(f − ηK(f))²] = ηK[(f − K(f))²] + η([K(f) − ηK(f)]²),   (52)

valid for any distribution η ∈ M_1(E), any test function f ∈ B_b(E) and any Markov transition K. To highlight the connections between (52) and the "local" variances (50) and (51), we notice that for K_{n,η} = S_{n,η} we have ηK_{n,η} = ηS_{n,η} = Ψ_n(η), so that (52) here takes the form

Ψ_n(η)[(f − Ψ_n(η)(f))²] = ηS_{n,η}[(f − S_{n,η}(f))²] + η([S_{n,η}(f) − Ψ_n(η)(f)]²) ≥ ηS_{n,η}[(f − S_{n,η}(f))²].

In words, the "local" L²-mean error in (51) is not more than the one in (50).
Let us write σ_n(f), respectively σ̃_n(f), for the fluctuation variance (47) of the N-IPS corresponding to the choice K_{n,η} = Φ_n(η), respectively the kernel built from (12). In practice, the excess of randomness in the first N-IPS approximating model becomes disastrous when the functions {g_n; n ∈ ℕ} are "nearly" constant, which corresponds to the case of high observation noise in nonlinear filtering problems. The difference between the two algorithms can be seen by considering the degenerate situation where the functions g_n equal the unit function 1 and K_{p,η} = K_p does not depend on η. In this case we clearly have η_n = η_0K_1 ⋯ K_n, with the convention K_0 = η_0. In the further degenerate situation where K_n = Id for any n ≥ 1, we have η_n = η_0,

σ_n(f) = (n + 1) σ̃_n(f)   and   σ̃_n(f) = η_0((f − η_0(f))²),

so the fluctuation variance of the first model grows linearly in the time parameter while that of the second remains constant.
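The linear growth of the variance in the degenerate case can be sketched by a short calculation; the decomposition below is our own reading of the claim above, under the same conventions (F_{−1} = {∅, Ω}).

```latex
% Take g_n \equiv 1 and K_n = \mathrm{Id}.  In the second algorithm the
% particles never move, so \eta^N_n = \eta^N_0 and
% \sqrt{N}(\eta^N_n(f) - \eta_0(f)) \Rightarrow
%   \mathcal{N}\big(0,\ \hat\sigma_n(f)\big),\qquad
%   \hat\sigma_n(f) = \eta_0\big((f - \eta_0(f))^2\big).
% In the first algorithm each step resamples N i.i.d. points from the current
% empirical measure, adding one conditionally independent sampling error:
\begin{align*}
\eta^N_n(f) - \eta_0(f)
  &= \sum_{p=0}^{n} \Big(\eta^N_p(f)
        - \mathbb{E}\big[\eta^N_p(f)\mid \mathcal{F}_{p-1}\big]\Big),\\
N \operatorname{Var}\!\big(\eta^N_n(f)\big)
  &\xrightarrow[N\to\infty]{} (n+1)\,\eta_0\big((f-\eta_0(f))^2\big)
   = (n+1)\,\hat\sigma_n(f),
\end{align*}
% since each of the n+1 terms (the initial sampling plus the n resampling
% stages) contributes an asymptotic variance \eta_0((f-\eta_0(f))^2).
```

Each resampling stage is an extra layer of conditionally independent multinomial noise, which is precisely the "excess of randomness" the comparison isolates.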
for any η and n where |S| denotes the cardinality of a set S.