Degenerate stochastic differential equations arising from catalytic branching networks

We establish existence and uniqueness for the martingale problem associated with a system of degenerate SDEs representing a catalytic branching network. For example, in the hypercyclic case: $$dX_{t}^{(i)}=b_i(X_t)dt+\sqrt{2\gamma_{i}(X_{t}) X_{t}^{(i+1)}X_{t}^{(i)}}dB_{t}^{i}, \quad X_t^{(i)}\ge 0,\ i=1,\dots, d,$$ where $X^{(d+1)}\equiv X^{(1)}$, existence and uniqueness are proved when $\gamma$ and $b$ are continuous on the positive orthant, $\gamma$ is strictly positive, and $b_i>0$ on $\{x_i=0\}$. The special case $d=2$, $b_i=\theta_i-x_i$ is required in work of Dawson-Greven-den Hollander-Sun-Swart on mean field limits of block averages for 2-type branching models on a hierarchical group. The proofs make use of some new methods, including Cotlar's lemma to establish asymptotic orthogonality of the derivatives of an associated semigroup at different times, and a refined integration by parts technique from Dawson and Perkins. As a by-product of the proof we obtain the strong Feller property of the associated resolvent.

1. Introduction. In this paper we establish well-posedness of the martingale problem for certain degenerate second order elliptic operators. The class of operators we consider arises from models of catalytic branching networks, including catalytic branching, mutually catalytic branching and hypercyclic catalytic branching systems (see [DF] for a survey of these systems). For example, the hypercyclic catalytic branching model is a diffusion on $\mathbb{R}^d_+$, $d \ge 2$, solving the following system of stochastic differential equations:
$$dX_{t}^{(i)}=(\theta_i-X_t^{(i)})\,dt+\sqrt{2\gamma_{i}(X_{t})\,X_{t}^{(i)}X_{t}^{(i+1)}}\,dB_{t}^{i},\qquad i=1,\dots,d. \qquad (1.1)$$
Here $X_t=(X^{(1)}_t,\dots,X^{(d)}_t)$, addition of the superscripts is done cyclically so that $X^{(d+1)}_t=X^{(1)}_t$, $\theta_i>0$, and $\gamma_i>0$. Uniqueness results of this type are proved in [DP1] under Hölder continuity hypotheses on the coefficients. Our main result here is to show that uniqueness continues to hold if this is weakened to continuity. One motivation for this problem is that for $d=2$, (1.1) arises in [DGHSS] as the mean field limit of the block averages of a system of SDEs on a hierarchical group. The system of SDEs models two types of individuals interacting through migration between sites and, at each site, through interactive branching depending on the masses of the types at that particular site. The branching coefficients $\gamma_i$ of the resulting equation for the block averages involve averaging the original branching coefficients at a large time (reflecting the slower time scale of the block averages) and so are given in terms of the equilibrium distribution of the original equation. The authors of [DGHSS] introduce a renormalization map which gives the branching coefficients $\gamma_i$ of the block averages in terms of those of the previous SDE. They wish to iterate this map to study higher order block averages. Continuity is preserved by this map on the interior of $\mathbb{R}^d_+$, and is conjectured to be preserved at the boundary (see Conjecture 2.7 of [DGHSS]). It is not known whether Hölder continuity is preserved (in the interior and on the boundary), which is why the results of [DP1] are not strong enough to carry out this program. The weakened hypotheses also lead to some new methods.
The proofs in this paper are substantially simpler in the two-dimensional setting required for [DGHSS] (see Section 8 below) but as higher dimensional analogues of their results are among the "future challenges" stated there, we thought the higher-dimensional results worth pursuing.
Further motivation for the study of such catalytic branching networks comes from [ES], where a corresponding system of ODEs was proposed as a macromolecular precursor to early forms of life. There have also been a number of mathematical works on mutually catalytic branching ((1.1) with $d=2$ and $\gamma_i$ constant) in spatial settings, where a special duality argument ([M], [DP2]) allows a more detailed analysis, and even on spatial analogues of (1.1) for general $d$, but now with much more restricted results due in part to the lack of any uniqueness result ([DFX], [FX]). See the introduction of [DP1] for more background material on the model. Earlier work in [ABBP] and [BP] shows uniqueness in the martingale problem for the operator $A^{(b,\gamma)}$ on $C^2(\mathbb{R}^d_+)$ defined by
$$A^{(b,\gamma)}f(x)=\sum_{i=1}^d \gamma_i(x)\,x_i\,f_{ii}(x)+\sum_{i=1}^d b_i(x)\,f_i(x).$$
Here $b_i,\gamma_i$, $i=1,\dots,d$, are continuous functions on $\mathbb{R}^d_+$, with $b_i(x)\ge 0$ if $x_i=0$, satisfying some additional regularity or non-degeneracy conditions. If $b_i(x)=\sum_j x_j q_{ji}$ for some $d\times d$ $Q$-matrix $(q_{ji})$, then such diffusions arise as limit points of rescaled systems of critical branching Markov chains in which $(q_{ji})$ governs the spatial motions of particles and $\gamma_i(x)$ is the branching rate at site $i$ in population $x=(x_1,\dots,x_d)$. The methods of these papers do not apply to systems such as (1.1) because now the branching rates $\gamma_i$ may be zero. Although we will still proceed using a Stroock-Varadhan perturbation approach, the process from which we are perturbing will be more involved than the independent squared Bessel processes considered in the above references.
We will formulate our results in terms of catalytic branching networks in which the catalytic reactions are given by a finite directed graph (V, E) with vertex set V = {1, . . . , d} and edge set E = {e 1 , . . . , e k }. This will include (1.1) and all of the two-dimensional systems arising in [DGHSS]. As in [DP1] we assume throughout: Hypothesis 1.1. (i, i) / ∈ E for all i ∈ V and each vertex is the second element of at most one edge.
The restrictive second part of this hypothesis has been removed by Kliem [K] in the Hölder continuous setting of [DP1]. It is of course no restriction if |V | = 2 (as in [DGHSS]), and holds in the cyclic setting of (1.1).
Vertices denote types and an edge (i, j) ∈ E indicates that type i catalyzes the type j branching. Let C denote the set of vertices (catalysts) which appear as the first element of an edge and R denote the set of vertices that appear as the second element (reactants). Let c : R → C be such that for j ∈ R, c j denotes the unique i ∈ C such that (i, j) ∈ E, and for i ∈ C, let R i = {j : (i, j) ∈ E}.
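For example, in the hypercyclic model (1.1), type $i+1$ catalyzes the branching of type $i$ (the diffusion coefficient of $X^{(i)}$ involves $X^{(i+1)}$), so the edge set is $E=\{(i+1,i):i=1,\dots,d\}$ with indices taken cyclically. Thus $C=R=V$, $c_j=j+1$ (with $c_d=1$), and $R_i=\{i-1\}$ (with $R_1=\{d\}$).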
Here are the hypotheses on the coefficients: Hypothesis 1.2. For $i\in V$, $\gamma_i:\mathbb{R}^d_+\to(0,\infty)$ and $b_i:\mathbb{R}^d_+\to\mathbb{R}$ are continuous and such that $|b_i(x)|\le c(1+|x|)$ on $\mathbb{R}^d_+$, and $b_i(x)>0$ if $x_i=0$.
The positivity condition on $b_i$ on $\{x_i=0\}$ is needed to ensure that the solutions remain in the first orthant.
If $D\subset\mathbb{R}^d$, $C^2_b(D)$ denotes the space of twice continuously differentiable bounded functions on $D$ whose first and second order partial derivatives are also bounded. For $f\in C^2_b(\mathbb{R}^d_+)$, and with the above interpretations, the generators we study are
$$Af(x)=\sum_{j\in R}\gamma_j(x)\,x_{c_j}x_j\,f_{jj}(x)+\sum_{j\notin R}\gamma_j(x)\,x_j\,f_{jj}(x)+\sum_{j=1}^d b_j(x)\,f_j(x).$$
(Here and elsewhere we use $f_i$ and $f_{ij}$ for the first and second order partial derivatives of $f$.) Definition 1.2.5.
Let $\Omega=C(\mathbb{R}_+,\mathbb{R}^d_+)$, the space of continuous functions from $\mathbb{R}_+$ to $\mathbb{R}^d_+$. Let $X_t(\omega)=\omega(t)$ for $\omega\in\Omega$, and let $(\mathcal{F}_t)$ be the canonical right continuous filtration generated by $X$. If $\nu$ is a probability on $\mathbb{R}^d_+$, a probability $P$ on $\Omega$ solves the martingale problem $MP(A,\nu)$ if under $P$ the law of $X_0$ is $\nu$ and, for all $f\in C^2_b(\mathbb{R}^d_+)$,
$$f(X_t)-f(X_0)-\int_0^t Af(X_s)\,ds$$
is a local martingale under $P$.
A natural state space for our martingale problem is
$$S=\Big\{x\in\mathbb{R}^d_+:\ \prod_{(i,j)\in E}(x_i+x_j)>0\Big\}.$$
The following result is Lemma 5 of [DP1]; the Hölder continuity assumed there plays no role in the proof.
Lemma 1.3. If P is a solution of M P (A, ν), where ν is a probability on R d + , then X t ∈ S for all t > 0 P-a.s.
Here is our main result.
Theorem 1.4. Assume Hypotheses 1.1 and 1.2 hold. Then for any probability ν on S, there is exactly one solution to MP(A, ν).
The cases required in Theorem 2.2 of [DGHSS] are the three possible directed graphs for $V=\{1,2\}$: (i) $E=\emptyset$; (ii) $E=\{(2,1)\}$ or $E=\{(1,2)\}$; (iii) $E=\{(1,2),(2,1)\}$. The state space here is $S=\mathbb{R}^2_+\setminus\{(0,0)\}$. In addition, [DGHSS] takes $b_i(x)=\theta_i-x_i$ for $\theta_i\ge 0$. As discussed in Remark 1 of [DGHSS], weak uniqueness is trivial if either $\theta_i$ is 0, as that coordinate is then absorbed at 0, so we may assume $\theta_i>0$. In this case Hypotheses 1.1 and 1.2 hold, and Theorem 2.2, stated in [DGHSS] (the present paper is cited for a proof), is immediate from Theorem 1.4 above. See Section 8 below for further discussion of our proof and how it simplifies in this two-dimensional setting. In fact, in Case (i) the result holds for any $\nu$ on all of $\mathbb{R}^2_+$ (as again noted in Theorem 2.2 of [DGHSS]) by Theorem A of [BP].
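To make these cases concrete, and writing out the generator displayed after Hypothesis 1.2 with $b_i(x)=\theta_i-x_i$ (this expansion is ours, for orientation, and is not quoted from [DGHSS]), the three operators read
$$\text{(i)}\quad Af=\gamma_1(x)x_1f_{11}+\gamma_2(x)x_2f_{22}+(\theta_1-x_1)f_1+(\theta_2-x_2)f_2,$$
$$\text{(ii)}\quad Af=\gamma_1(x)x_2x_1f_{11}+\gamma_2(x)x_2f_{22}+(\theta_1-x_1)f_1+(\theta_2-x_2)f_2\qquad(E=\{(2,1)\};\ E=\{(1,2)\}\ \text{swaps the indices}),$$
$$\text{(iii)}\quad Af=\gamma_1(x)x_2x_1f_{11}+\gamma_2(x)x_1x_2f_{22}+(\theta_1-x_1)f_1+(\theta_2-x_2)f_2.$$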
Although it is not required in [DGHSS], it is of course natural to ask about uniqueness in cases (ii) and (iii) if ν = δ (0,0) . We do have some partial results when the process starts at the corner (0, 0), but the regularity hypotheses are stronger than in Hypothesis 1.2 and the techniques are quite different than those used in this paper, so we do not pursue this here.
Our proof of Theorem 1.4 actually proves a stronger result. We do not require that the $\gamma_i$ be continuous, but only that their oscillation not be too large. More precisely, we prove that there exists $\varepsilon_0>0$ such that if (1.2) below holds, then there is exactly one solution of $MP(A,\nu)$. The condition needed is that for each $i=1,\dots,d$ and each $x\in\mathbb{R}^d_+$ there exists a neighborhood $N_x$ on which the oscillation of $\gamma_i$ is small relative to $\varepsilon_0$, in the sense made precise in (1.2). Our proof of Theorem 1.4 is an $L^2$ perturbation argument, and some of our argument follows along the lines of [ABBP]. The operators from which we are perturbing are now different, and the method of [ABBP] for obtaining $L^2$ estimates no longer applies. This leads to some new methodologies.
The analytic tool we use is Cotlar's lemma, Lemma 2.13, which is also at the heart of the famous T1 theorem of harmonic analysis. For a simple illustration of how Cotlar's lemma can be used, see [Fe], pp. 103-104.
We consider certain operators $T_t$ (defined in (2.18) below) and show that they are bounded on $L^2$ with norm of order $t^{-1}$ (see (1.3)). We require $L^2$ bounds on $\int_0^\infty e^{-\lambda t}T_t\,dt$, and (1.3) is not sufficient to give these. This is where Cotlar's lemma comes in: we prove $L^2$ bounds on $T_tT_s^*$ and $T_t^*T_s$, and these together with Cotlar's lemma yield the desired bounds on $\int_0^\infty e^{-\lambda t}T_t\,dt$. The application of Cotlar's lemma to operators arising from a decomposition of the time axis is perhaps noteworthy. In all other applications of Cotlar's lemma that we are aware of, the corresponding operators arise from a decomposition of the space variable. The $L^2$ bounds on $T_tT_s^*$ and $T_t^*T_s$ are the hardest and lengthiest parts of the paper. At the heart of these bounds is an integration by parts formula which refines a result used in [DP1] (see the proof of Proposition 17 there) and is discussed in the next section.
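Schematically, and on our reading of the proof of Proposition 2.2 below (the precise definitions appear there), the resolvent is decomposed dyadically in time,
$$\int_0^\infty e^{-\lambda t}T_t\,dt=\sum_{k\ge 0}U_k,\qquad U_k=\int_{2^{-k-1}}^{2^{-k}}e^{-\lambda t}T_t\,dt\quad(k\ge 1),$$
with $U_0$ collecting the contribution from $t\ge 1/2$, and Cotlar's lemma is then applied to the family $\{U_k\}$.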
In Section 2 we give a proof of Theorem 1.4. The proofs of all the hard steps are, however, deferred to later sections. A brief outline of the rest of the paper is given at the end of Section 2.
Acknowledgment. We would like to thank Frank den Hollander and Rongfeng Sun for helpful conversations on their related work.

2. Structure of the proof.
We first reduce Theorem 1.4 to a local uniqueness result (Theorem 2.1 below). Many details are suppressed, as this argument is a minor modification of the proof of Theorem 4 in [DP1]. By the localization argument in Section 6.6 of [SV] it suffices to fix $x^0\in S$ and show that for some $r_0=r_0(x^0)>0$, there are coefficients which agree with $\gamma_i$, $b_i$ on $B(x^0,r_0)$, the open ball of radius $r_0$ centered at $x^0$, and for which the associated martingale problem has a unique solution for all initial distributions. Following [DP1], note that $\gamma^0_j\equiv\gamma_j(x^0)>0$ for all $j$ because $x^0\in S$. We may now decompose the coefficients, introducing modified coefficients $\tilde b_j$, $\tilde\gamma_j$ and the index sets $N_1$, $N_2$ and $Z$. Since $b_j(x^0)\le 0$ is possible for $j\in N_2\cap Z^c$ (and so $\tilde b_j$ may differ from $b_j$ here), a simple Girsanov argument will allow us to assume that $b_j(x^0)\ge\delta$ for $j\in N_2\cap Z^c$ (see the proof below), and so $\tilde b_j=b_j$ near $x^0$. With this reduction we see that, by Hypothesis 1.2 and the choice of $\delta$, $\tilde b_j(x)=b_j(x)$ for $x$ near $x^0$. By changing $\tilde b$ and $\tilde\gamma$ outside a small ball centered at $x^0$ we may assume that $\tilde\gamma_j>0$ for all $j$, $\tilde b_j>0$ for $j\notin N_1$, the $\tilde\gamma_j$, $\tilde b_j$ are all bounded, continuous and constant outside a compact set, and the quantity appearing in (2.1) below is small. For these modified coefficients introduce the operator $\tilde A$, and also define a constant coefficient operator $A^0$. As $b^0_j\le 0$ and $\tilde b_j\le 0$ on $\{x_j=0\}$ is possible for $j\in N_1$ (recall we have modified $\tilde b_j$), the natural state space for the above generators is the larger set $S_0$. When modifying $\tilde\gamma_j$ and $\tilde b_j$ it is easy to extend them to this larger space, still ensuring all of the above properties of $\tilde b_j$ and $\tilde\gamma_j$. If $\nu_0$ is a probability on $S_0$, a solution to the martingale problem $MP(\tilde A,\nu_0)$ is a probability $P$ on $C(\mathbb{R}_+,S_0)$ satisfying the obvious analogue of the definition given for $MP(A,\nu)$. As we have $Af(x)=\tilde Af(x)$ for $x$ near $x^0$, the localization in [SV] shows that Theorem 1.4 follows from:

Theorem 2.1. Under the above assumptions on $\tilde\gamma_j$ and $\tilde b_j$, for any probability $\nu$ on $S(x^0)$, there is a unique solution to $MP(\tilde A,\nu)$.
Proof of reduction of Theorem 1.4 to Theorem 2.1. This proceeds as in the proof of Theorem 4 in [DP1]. The only change is that in Theorem 2.1 we are now assuming $\tilde b_j>0$ and $b^0_j>0$ for all $j\notin N_1$, not just $\tilde b_j\ge 0$ on $\{x_j=0\}$ for $j\notin N_1$ and $b^0_j>0$ for $j\in Z\cap(R\cup C)$ with $b^0_j\ge 0$ for other values of $j\notin N_1$. If $b_j(x^0)>0$ for all $j\in N_2$, then the proof of Theorem 4 in [DP1] in Case 1 applies without change, and so we need only modify the argument in Case 2 of the proof of Theorem 4 in [DP1] so that it applies if $b_j(x^0)\le 0$ for some $j\in N_2$. This means $x^0_j>0$ by our (stronger) Hypothesis 1.2, and the Girsanov argument given there now allows us to locally modify $b_j$ so that $b_j(x^0)>0$. The rest of the argument now goes through as before.
Turning to the proof of Theorem 2.1, existence is proved as in Theorem 1.1 of [ABBP]; instead of the comparison argument given there, one can use Tanaka's formula and (2.4) to see that solutions must remain in $S(x^0)$.
We focus on uniqueness from here on. The operator $A^2_j$ is the generator of a Feller branching diffusion with immigration. We denote its semigroup by $Q^j_t$. It will be easy to give an explicit representation for the semigroup $P^i_t$ associated with $A^1_i$ (see (3.2) below). An elementary argument shows that the martingale problem associated with $A^0$ is well-posed, and the associated diffusion has semigroup $P_t$ and resolvent $R_\lambda=\int_0^\infty e^{-\lambda t}P_t\,dt$. We also define a reference measure $\mu$ on $S_0$; the norm on $L^2\equiv L^2(S_0,\mu)$ is denoted by $\|\cdot\|_2$.
The key analytic bound we will need to carry out the Stroock-Varadhan perturbation analysis is the following:

Proposition 2.2. There is a dense subspace $D_0\subset L^2$ and a constant $K(M_0)$ for which the $L^2$ perturbation bound used in the proof of Theorem 2.1 below holds.

Here are the other two ingredients needed to complete the proof of Theorem 2.1.

Proposition 2.3. Let $P$ be a solution of $MP(\tilde A,\nu)$, where $d\nu=\rho\,d\mu$ for some $\rho\in L^2$ with compact support, and set $S_\lambda f=E\big(\int_0^\infty e^{-\lambda t}f(X_t)\,dt\big)$. If the smallness condition (2.7) holds, then for all $\lambda\ge 1$, $S_\lambda f$ is controlled in terms of $\|f\|_2$.

Proposition 2.4. Let $(P^x,X_t)$, $x\in S_0$, be a Borel strong Markov family of solutions of $MP(\tilde A,\delta_x)$ with resolvent $S_\lambda$. Then for any bounded measurable function $f$ on $S_0$ and any $\lambda>0$, $S_\lambda f$ is continuous on $S_0$.

Remark 2.5. Our proof of Proposition 2.4 will also show the strong Feller property of the resolvent for solutions to the original $MP(A,\nu)$ in Theorem 1.4; see Remark 6.2.
Assuming Propositions 2.2-2.4 the proof of Theorem 2.1 is then standard and quite similar to the proof of Proposition 2.1 in Section 7 of [ABBP]. Unlike [ABBP] the state space here is not compact, so we present the proof for completeness. Proof of Theorem 2.1. Let Q k , k = 1, 2, be solutions to M P (A, ν) where ν is as in Proposition 2.3 and define S k (2.8) Note that for t > 0, where the finiteness follows by considering the associated SDE for X j and using the boundedness ofb j . This shows that M f is a martingale under Q k . Let g ∈ D 0 . Multiply (2.8) by λe −λt integrate over t, take expectations (just as in (7.3) of [ABBP]), and set Taking the difference of this equation when k = 1, 2, we obtain where we have used the definition of ε 0 (in (2.1)) and Proposition 2.
Proposition 2.3 implies the above terms are finite for λ ≥ 1 and so we have To prove uniqueness we first use Krylov selection (Theorem 12.2.4 of [SV]) to see that it suffices to consider Borel strong Markov processes ((Q x k ) x∈S 0 , X t ), k = 1, 2, where Q x k solves M P (Ã, δ x ), and to show that Q x 1 = Q x 2 for all x ∈ S 0 (see the argument in the proof of Proposition 2.1 of [ABBP], but the situation here is a bit simpler as there is no killing). If S k λ are the resolvent operators associated with Q k , then (2.9) implies that for all f ∈ L 2 , compactly supported ρ ∈ L 2 , and λ ≥ 1.
For f and λ as above this implies S 1  [ABBP], and are proved in Sections 5 and 6, respectively. There are some additional complications in the present setting. Most of the work, however, will go into the proof of Proposition 2.2 where a different approach than those in [ABBP] or [DP1] is followed. In [DP1] a canonical measure formula (Proposition 14 of that work) is used to represent and bound derivatives of the semigroups P i t f (x) in (2.5) (see Lemma 3.8 below). This approach will be refined (see, e.g., Lemmas 3.11 and 7.1 below) to give good estimates on the derivatives of the the actual transition densities using an integration by parts formula. The formula will convert spatial derivatives on the semigroup or density into differences involving Poisson random variables which can be used to represent the process with semigroup P t from which we are perturbing. The construction is described in Lemma 3.4 below. The integration by parts formula underlies the proof of Lemma 7.1 and is explicitly stated in the simpler setting of first order derivatives in Proposition 8.1.
In [ABBP] we differentiate an explicit eigenfunction expansion for the resolvent of a killed squared Bessel process to get an asymptotically orthogonal expansion. We have less explicit information about the semigroup P t of A 0 and so instead use Cotlar's Lemma (Lemma 2.13 below), to get a different asymptotically orthogonal expansion for the derivatives of the resolvent R λ -see the proof of Proposition 2.2 later in this section.
Convention 2.7. All constants appearing in statements of results concerning the semigroup P t and its associated process may depend on d and the constants {b 0 j , γ 0 j : j ≤ d}, but, if M 0 is as in (2.4), these constants will be uniformly bounded for M 0 ≤ M for any M > 0.
We state an easy result on transition densities which will be proved in Section 3.
Proposition 2.8. The semigroup (P t , t ≥ 0), has a jointly continuous transition density separately, and satisfies the following: (a) p t (y, x) =p t (x, y), wherep t is the transition density associated withÂ 0 with parame- In particular (2.10) (b) If D n x is any nth order partial differential operator in x and 0 ≤ n ≤ 3, then [1 + (y j /t) 1/2 ] for all y ∈ S 0 , (2.11) and sup y |D n , and for n ≤ 2 and D n x as in (b), and (2.15) Notation 2.9. ThroughoutD x will denote one of the following first or second order differential operators: A deeper result is the following bound which sharpens Proposition 2.8.
Proposition 2.10. ForD x as above and all t > 0, This is proved in Section 4 below. The caseD x = x j D 2 x j x j for j ∈ Z ∩ C will be the most delicate.
For $\tilde D$ as in Notation 2.9 and $t>0$, define an integral operator $T_t=T_t(\tilde D)$ by the kernel formula (2.18). By (d) above, $T_t$ is a bounded operator on $L^\infty$, but we will study these operators on $L^2(S_0,\mu)$. We will use the following well known elementary lemma; see [Ba], Theorem IV.5.1, for example, for a proof.
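The lemma in question is presumably the standard Schur test, which we record here for the reader's convenience: if $K(x,y)$ is a jointly measurable kernel on $S_0\times S_0$ with $\sup_x\int|K(x,y)|\,\mu(dy)\le C_1$ and $\sup_y\int|K(x,y)|\,\mu(dx)\le C_2$, then $Tf(x)=\int K(x,y)f(y)\,\mu(dy)$ defines a bounded operator on $L^2(\mu)$ with $\|T\|\le\sqrt{C_1C_2}$.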
We have used (a) in the last line.
Proposition 2.10 allows us to apply Lemma 2.11 to $T_t$ and conclude that $T_t$ is a bounded operator on $L^2$ with norm $\|T_t\|\le c_{2.10}\,t^{-1}$. (2.19) Unfortunately this is not integrable near $t=0$, and so we cannot integrate this bound to prove Proposition 2.2. We must take advantage of some cancellation in the integral over $t$, and this is where we use Cotlar's Lemma: Lemma 2.13 (Cotlar's Lemma). Assume $\{U_j:j\in\mathbb{Z}_+\}$ are bounded operators on $L^2(\mu)$ and $\{a(j):j\in\mathbb{Z}\}$ are non-negative real numbers satisfying the almost-orthogonality conditions below. Proof. See, e.g., Lemma XI.4.1 in [T].
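The standard form of the Cotlar-Stein lemma, which we believe is the statement intended here, reads: if $\|U_j^*U_k\|\le a(j-k)^2$ and $\|U_jU_k^*\|\le a(j-k)^2$ for all $j,k\in\mathbb{Z}_+$, and $A:=\sum_{j\in\mathbb{Z}}a(j)<\infty$, then $\big\|\sum_{j=0}^N U_j\big\|\le A$ for every $N$.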
The subspace D 0 in Proposition 2.2 will be (2.21) As we can take g ∈ C 2 with compact support, denseness of D 0 in L 2 follows from Corollary 2.12(b). To see that D 0 is a subspace, let P 2 −j i g i ∈ D 0 for i = 1, 2 with j 2 ≥ j 1 . If g 1 = P 2 −j 1 −2 −j 2 g 1 , theng 1 is in L 2 by Corollary 2.12 (a) and also in C 2 b (S 0 ) by Proposition 2.8(d). In addition, where we have used Corollary 2.12(a) again. Hence P 2 −j 1 g 1 = P 2 −j 2g1 whereg 1 satisfies the same conditions as g 1 . Therefore P 2 −j 1 g 1 + P 2 −j 2 g 2 = P 2 −j 2 (g 1 + g 2 ) ∈ D 0 .
We show below how Cotlar's Lemma easily reduces Proposition 2.2 to the following result.
Proposition 2.14. There is an η > 0 and c 2.14 so that ifD x is any of the operators in Notation 2.9, then T * s T t f 2 ≤ c 2.14 s −1−η/2 t −1+η/2 f 2 and T s T * t f 2 ≤ c 2.14 s −1−η/2 t −1+η/2 f 2 for any 0 < t ≤ s ≤ 2, and any bounded Borel f ∈ L 2 (µ). (2.22) Assuming this result, we can now give the Proof of Proposition 2.2. Fix a choice ofD x (recall Notation 2.9), let λ ≥ 1, and for k ∈ Z + , define If k = j a similar calculation where the contributions to the integral from {s ≥ t} and {t ≥ s} are evaluated separately shows Cotlar's Lemma therefore shows that N j=0 U j ≤ √ c 2.14 2(1 − 2 −η/4 ) −1 := C(η) for all N. (2.23) In the last line the bound (2.15) allows us to differentiate through the t integral and (2.14) allows us to differentiate through the µ(dy) integral and concludeD x P u h = T u h. A change of variables in the above now gives (2.24) and the result follows.
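For orientation, here is a rough version of the almost-orthogonality estimate behind (2.23), assuming (this is our reading of the proof) that $U_k=\int_{2^{-k-1}}^{2^{-k}}e^{-\lambda t}T_t\,dt$ for $k\ge 1$. For $j<k$, Proposition 2.14 gives
$$\|U_j^*U_k\|\le\int_{2^{-j-1}}^{2^{-j}}\!\!\int_{2^{-k-1}}^{2^{-k}}\|T_s^*T_t\|\,dt\,ds\le c_{2.14}\int_{2^{-j-1}}^{2^{-j}}s^{-1-\eta/2}\,ds\int_{2^{-k-1}}^{2^{-k}}t^{-1+\eta/2}\,dt\le c_{2.14}\,2^{-(k-j)\eta/2},$$
with a similar bound for $\|U_jU_k^*\|$; the case $j=k$ is handled separately in the text. Cotlar's lemma then applies with $a(i)=\sqrt{c_{2.14}}\,2^{-|i|\eta/4}$, whose sum over $i\in\mathbb{Z}$ is at most the constant $C(\eta)$ appearing in (2.23).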
For Proposition 2.14, an easy calculation shows that for 0 < s ≤ t, and K (2) s,t (x, y) = D x p s (x, z)D y p t (y, z)dµ(z). (2.27) A simple refinement of Lemma 2.11 (Lemma 3.16 proved at the end of Section 3) will show that (2.22) follows from for all 0 < t ≤ s ≤ 2 and i = 1, 2.
(2.28) This calculation will reduce fairly easily to the case N 2 empty and Z ∩ C a singleton (see the proof of Proposition 2.14 at the end of Section 4 below). Here there are essentially 4 distinct choices ofD x , making our task one of bounding 8 different 4-fold integrals involving first and second derivatives of the transition density p t (x, y). Fairly explicit formulae (see (4.7)-(4.9)) are available for all the derivatives except those involving the unique index j in Z ∩ C, and as a result Proposition 2.14 is easy to prove for all derivatives but those with respect to j (Proposition 4.3). Even here the first order derivatives are easily handled, leavingD x = x j D x j x j . This is the reason for most of the rather long calculations in Section 7. In the special case d = 2, of paramount importance to [DGHSS], one can avoid this case using the identity A 0 R λ f = λR λ f − f , as is discussed in Section 8. We give a brief outline of the rest of the paper. Section 3 studies the transition density associated with the resolvent in Proposition 2.2 for the key special case when Z ∩ C is a singleton and N 2 = ∅. This includes the canonical measure formulae for these densities (Lemmas 3.4 and 3.11) and the proof of Proposition 2.8. In addition some important formulae for Feller branching processes with immigration, conditional on their value at time t, are proved (see Lemmas 3.2, 3.14 and Corollary 3.15). The section ends with an elementary result (Lemma 3.16) on integral operators on L 2 . In Section 4, the proofs of Propositions 2.14 and 2.10 are reduced to a series of technical bounds on the derivatives of the transition densities (Lemmas 4.5, 4.6 and 4.7). Most of the work here is in the setting of the key special case considered in Section 3, and then at the end of Section 4 we show how the general case of Proposition 2.14 follows fairly easily thanks to the product structure in (2.5). Propositions 2.3 and 2.4 are proved in Sections 5 and 6, respectively. Lemmas 4.5-4.7 are finally proved in Section 7, thus completing the proof of Theorem 2.1 (and 1.4). The key inequality in Section 7 is Lemma 7.1 which comes from the integration by parts identity for the dominant term (see Proposition 8.1 for a simple special case). In Section 8 we describe how all of this becomes considerably simpler in the 2-dimensional setting required in [DGHSS].
3. The basic semigroups. Unless otherwise indicated, in this section we work with the generator in (2.3) where Z ∩ C = {d} and N 2 = ∅. Taking d = m + 1 to help distinguish this special setting, this means we work with the generator Our Convention 2.7 on constants therefore means: Convention 3.1. Constants appearing in statements of results may depend on m and then these constants will be uniformly bounded for M 0 ≤ M for any fixed M > 0. Note that M 0 ≥ 1. It is easy to see that the martingale problem M P (A 1 , ν) is well-posed for any initial law ν on S m . In fact, we now give an explicit formula for P t . Let X t = (X (1) t , . . . , X (m+1) ) be a solution to this martingale problem. By considering the associated SDE, we see that X (m+1) is a Feller branching diffusion (with immigration) with generator and is independent of the driving Brownian motions of the first m coordinates. Let P x m+1 be the law of X (m+1) starting at x m+1 on C(R + , R + ). By conditioning on X m+1 we see that the first m coordinates are then a time-inhomogeneous Brownian motion. Therefore Recall (see, e.g., (2.2) of [BP]) that X (m+1) has a symmetric density q t = q b,γ t (x, y) (x, y ≥ 0) with respect to µ m+1 (dy), given by or more precisely a version of this collection of probability laws which is symmetric in (x m+1 , y m+1 ) and such that (x m+1 , y m+1 ) →r t (x m+1 , y m+1 , dw) is a jointly continuous map with respect to the weak topology on the space of probability measures. The existence of such a version follows from Section IX.3 of [RY]. Indeed, Corollary 4.3 of the above states that if γ = 2, then if xy = 0. (3.4) is the modified Bessel function of the first kind of index ν > −1. The continuity and symmetry of L in (x, y) gives the required continuous and symmetric version ofr 1 (x m+1 , y m+1 ). A scaling argument (see the proof of Lemma 3.2 below) gives the required version ofr t for general γ > 0.
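For the reader's convenience, we record the series expansion we believe is intended above (cf. (2.2) of [BP]). With respect to Lebesgue measure $dy$, the transition density of the Feller branching diffusion with immigration with generator $\gamma x\,\frac{d^2}{dx^2}+b\,\frac{d}{dx}$ is
$$q^{b,\gamma}_t(x,y)=\sum_{n=0}^\infty e^{-x/(\gamma t)}\,\frac{(x/(\gamma t))^n}{n!}\;\frac{e^{-y/(\gamma t)}\,y^{\,n+b/\gamma-1}}{(\gamma t)^{\,n+b/\gamma}\,\Gamma(n+b/\gamma)};$$
with respect to the reference measure $y^{b/\gamma-1}\,dy$ (our reading of $\mu_{m+1}$), the corresponding kernel $\sum_{n\ge 0}e^{-(x+y)/(\gamma t)}(xy)^n/\big((\gamma t)^{2n+b/\gamma}\,n!\,\Gamma(n+b/\gamma)\big)$ is symmetric in $(x,y)$. The modified Bessel function of the first kind appearing in (3.4) is $I_\nu(z)=\sum_{k=0}^\infty\frac{(z/2)^{2k+\nu}}{k!\,\Gamma(k+\nu+1)}$, $\nu>-1$.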
for all x m+1 ≥ 0 and Borel ψ : (3.6) (Weakly continuous means continuity with respect to the weak topology on the space of probability measures.) Combine (3.6) and (3.2) to conclude that X has a transition density with respect to µ(dy) given by The next result is a refinement of Lemma 7(b) of [DP1].
Proof. Assume first γ = 2, t = 1 so that we may use (3.4) to conclude (recall ν = b 2 − 1) A bit of calculus shows with the first inequality being strict if λ > 0. The above series expansion shows that I ν (αz) ≤ α ν I ν (z) for all z ≥ 0, α ∈ [0, 1], and so using (3.9) in the above ratio bound, we get L(λ, x, y) ≤ L(λ, x, 0) for all λ, x, y ≥ 0. (3.10) We have where in the last line c > 0 and we have used (3.9), inf 0≤w≤1 w coth w−1 w 2 = c 1 > 0, and inf w≥1 w coth w−1 w = c 2 > 0. For x ≤ 1 we may bound the above by (recall Convention 3.1) and for x ≥ 1 we may, using (2we −w ) b/2 ≤ 1 for w ≥ 1, bound it by These bounds show that w −pr 1 (x, y, dw) ≤ c(p)(1 + x) −p and so by symmetry in x and y we get For general γ and t,X s = 2 tγ X ts is as above withγ = 2 andb = 2b γ . We have t 0 X s ds = t 2 γ 2 1 0X u du, and so, using the above case, We observe that there exist c 1 , c 2 (recall Convention 3.1 is in force) such that To see this, suppose first b/γ ≡ r ≥ 1 and use Jensen's inequality to obtain Next suppose r ∈ [1, 2] and again use Jensen's inequality to see that These two inequalities imply (3.11) by using the identity Γ(m + r + 1) = (m + r)Γ(m + r) a finite number of times.
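The scaling step used above can be checked directly from the SDE; here is a short verification (ours) that $\tilde X_s=\frac{2}{t\gamma}X_{ts}$ is a Feller branching diffusion with immigration with parameters $\tilde\gamma=2$ and $\tilde b=2b/\gamma$. If $dX_u=b\,du+\sqrt{2\gamma X_u}\,dB_u$ and $c=2/(t\gamma)$, then $\tilde B_s=t^{-1/2}B_{ts}$ is again a Brownian motion and
$$d\tilde X_s=c\,t\,b\,ds+c\sqrt{t}\,\sqrt{2\gamma X_{ts}}\,d\tilde B_s=\frac{2b}{\gamma}\,ds+\sqrt{2\cdot 2\,\tilde X_s}\,d\tilde B_s,$$
since $c^2t\cdot 2\gamma X_{ts}=2(\gamma ct)(cX_{ts})=4\tilde X_s$. Moreover $\int_0^tX_u\,du=\frac{t\gamma}{2}\int_0^t\tilde X_{u/t}\,du=\frac{t^2\gamma}{2}\int_0^1\tilde X_v\,dv$, which is the change of variables used above.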
(3.12) By (3.11) and Stirling's formula we have where in the last line we used an elementary Poisson expectation calculation (for b/γ ≥ 1/2 see Lemma 3.3 of [BP] ). If b/γ ≥ 1/2, the above is bounded and (3.12) is immediate. Assume now that p = 1/2 − b/γ > 0 and x ≥ y. Then the above is at most This proves (3.12) and hence (a). (b) The continuity follows easily from (3.7), the continuity of q t (·, ·), the weak continuity ofr t (·, ·, dw), Lemma 3.2 and dominated convergence. Using Lemma 3.2 and (a) in (3.7), we obtain (recall Convention 3.1) (c) This is well-known, and is easily derived from (3.3). (d) The expectation we need to bound equals On the other hand if x m+1 < t, the above expression for E x m+1 ((X (m+1) ) −p )is trivially at most The result follows from these two bounds.
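The elementary Poisson expectation calculation referred to above is presumably of the following type: if $N(w)$ is a Poisson random variable with mean $w>0$, then
$$E\Big[\frac{1}{N(w)+1}\Big]=\sum_{n=0}^\infty\frac{e^{-w}w^n}{n!\,(n+1)}=\frac{1}{w}\sum_{n=0}^\infty\frac{e^{-w}w^{n+1}}{(n+1)!}=\frac{1-e^{-w}}{w}\le\min(1,w^{-1}).$$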
a simple change of variables shows the quantity we need to bound is Carrying out the differentiation and resulting Gamma integrals, we see the absolute value of the above summation equals

Now let $\{P^0_x:x\ge 0\}$ denote the laws of the Feller branching process with generator $\gamma x\,\frac{d^2}{dx^2}$; see, e.g., Theorem II.7.3 of [P], which can be projected down to the above situation by considering the total mass function. Moreover, for each $t>0$ we have (Theorems II.7.2(iii) and II.7.3(b) of [P]) a bound which allows us to define a probability on $C_{ex}$ by (3.15). The above references in [P] also give the well-known formula (3.16), and so this together with (3.14) implies the identity used below. The representation (3.13) leads to the following decomposition of $X^{(m+1)}$ from Lemma 10 of [DP1]. As it is consistent with the above notation, we will use $X^{(m+1)}$ to denote a Feller branching diffusion (with immigration) starting at $x_{m+1}$ and with generator given by (3.1), under the law $P_{x_{m+1}}$.
In addition, we may assume where Ξ is independent of X ′ 0 and is a Poisson point process on C ex with intensity Remark 3.5. A double application of the decomposition in Lemma 3.4(a), first with general ρ and then ρ = 0 shows we may write , j, k} are independent exponential variables with mean (1/γt), N ′ 0 (t), N ρ (t) are independent Poisson random variables with means ρx m+1 /γt and (1 − ρ)x m+1 /γt, respectively, and (X 0 (t), {e 1 j (t), e 2 k (t)}, N ′ 0 (t), N ρ (t)) are jointly independent. The group of two sums of exponentials in (3.21) may correspond to X 1 (t) in (3.18) and (3.19), and so we may use this as the decomposition in Lemma 3.4(a) with ρ = 0. Therefore we may take N 0 to be N ′ 0 + N ρ , and hence may couple these decompositions so that (3.23) The decomposition in Lemma 3.4 also gives a finer interpretation of the series expansion (3.2) for q b,γ t (x, y), as we now show. Note that the decomposition (from (3.18),(3.19)), where X ′ 0 , N ρ (t) and {e j (t)} satisfy the distributional assumptions in (a), uniquely determines the joint law of (X (m+1) t , N ρ (t)). This can be seen by conditioning on N ρ (t) = n. Both this result and method are used in the following.
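For orientation, the decomposition in Lemma 3.4 and Remark 3.5 is consistent with the following elementary description of the time-$t$ law of the Feller branching diffusion with immigration with generator $\gamma x\,\frac{d^2}{dx^2}+b\,\frac{d}{dx}$ started at $x$ (a standard fact, recorded here for the reader's convenience):
$$X_t\ \overset{d}{=}\ \gamma t\,G+\sum_{j=1}^{N}\xi_j,$$
where $G$ has a Gamma$(b/\gamma)$ distribution, $N$ is Poisson with mean $x/(\gamma t)$, the $\xi_j$ are exponential with mean $\gamma t$, and all of these variables are independent. Equivalently, $E[e^{-\lambda X_t}]=(1+\gamma\lambda t)^{-b/\gamma}\exp\{-\lambda x/(1+\gamma\lambda t)\}$.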
Notation 3.7 Let D n denote any nth order partial differential operator on S m and let D n x i denote the nth partial derivative with respect to Proof. Proposition 14 and Remark 15 of [DP1] show that P t f ∈ C 2 b (S m ) and give (3.24) and (3.25) for n ≤ 2. The proof there shows how to derive the n = 2 case from the n = 1 case and similar reasoning, albeit with more terms to check, allows one to derive the n = 3 case from the n = 2 case.
Recall that Q t is the semigroup of X (m+1) , the squared Bessel diffusion with transition density given by (3.3).
Corollary 3.9. If g : R + → R is a bounded Borel function and t > 0, then Q t g ∈ C 3 b (R + ) and for n ≤ 3, Proof. Apply Lemma 3.8 to f (y) = g(y m+1 ).
Lemma 3.11. (a) For each t > 0 and y ∈ S m , the functions x → p t (x, y) and x → p t (y, x) are in C 3 b (S m ), and if D n x denotes any nth order partial differential operator in the x variable, then where the last line follows by dominated convergence, the uniform bound in (3.28) and the fact that f has compact support. An application of (3.24) implies and the result follows upon letting N → ∞.
by dominated convergence and the uniform bounds in (a). The integrability of D n x p t/2 (x, z) with respect to µ(dz) (from (b)) and the fact that p t/2 (z, ·) ∈ C b (S m ) allow us to deduce the continuity of (3.29) in y from dominated convergence. Now use symmetry, i.e., (3.8), to complete the proof.
f is bounded and continuous in z by Lemma 3.3(a) (with a bound depending on y, δ). Let n ≤ 3. The uniform bounds in (a) and integrability of f y,δ allow us to apply dominated convergence to differentiate through the integral and conclude In the last line we have used (c). Now note that if I, X ≥ 0, then Combine (3.32) and (3.30) to derive (d).
Proof. This is a minor modification of the proofs of Lemma 3.11 (a),(b). Use the bound (from Lemma 3.3(a)) in place of Lemma 3.3(b), and (3.27) in place of (3.24), in the above argument. Of course this is much easier and can also be derived by direct calculation from our series expansion for q t . Expectation under P * t is denoted by E * t and we let e(s), s ≥ 0, denote the canonical excursion process under this probability.
Proof. Let P 0 h denote the law of the diffusion with generator γx d 2 dx 2 starting at h. If f ∈ C b (R + ) has compact support, then Proposition 9 of [DP1] and (3.14) show that Here we extend the notation in (3.3) by letting q 0,γ denote the absolutely continuous part of the transition kernel for E 0 (it also has an atom at 0). We have also extended the convergence in [DP1] slightly as the functional e(s)f (e(t)) is not bounded but this extension is justified by a uniform integrability argument-the approximating functionals are L 2 bounded. By (2.4) of [BP] γs and also γs h q 0,γ s (h, y) ≤ (γs) −1 . Dominated convergence allows us to take the limit in (3.34) through the integral and deduce that By (3.16) we conclude that By inserting the above series expansion for q 0,γ t−s (y, x) and calculating the resulting gamma integrals, the result follows.
Lemma 3.14. Let ν = b/γ − 1, and κ ν (z) = Proof. Write X for X (m+1) and z for z m+1 . The scaling argument used in the proof of Lemma 3.2 allows us to assume t = 1,γ = 2 andb = 2b/γ. Dominated convergence implies that The right side can be calculated explicitly from the two formulae in (3.4) and after some calculus we arrive at where ν =b/2 − 1 = b/γ − 1. This gives the required equality.
Set α = ν + 1 and recall that Taking ratios of the above and setting w = (z/2) 2 , we see that it suffices to show We claim in fact that Assuming this, (3.35) will follow with c 0 (ε) = ε −1/2 by summing (3.36) over n ≥ 1. The proof of (3.36) is an elementary application of the quadratic formula once factors of w n−1/2 are canceled.
Proof. We write X for X (m+1) and z for z m+1 . Recall the decomposition (3.21): where the second integral is independent of ({e j (t), j ∈ N}, N 0 (t) = t 0 1 (ν t >0) Ξ(dν), X ′ 0 ) by elementary properties of Poisson point processes and the independence of X ′ 0 and Ξ. Therefore In the last line we have used the independence of X ′ 0 and Ξ, of N 0 (t) and {e j , j ∈ N}, and the joint independence of the {e j } (see Lemma 3.4). Now use Lemmas 3.13 and 3.14 to bound the last and first terms, respectively, and note the second term is bounded by the mean of t 0 X 1 (s) ds, where X 1 is as in (3.19). This bounds the above by Condition the above on σ(N 0 (t), X t ) to complete the proof.
We now return to the general setting of Propositions 2.2 and 2.8. Proof of Proposition 2.8. From (2.5) we may write where p i t are the transition densities from Lemma 3.11 and q j t are the transition densities from Lemma 3.12. The joint continuity and smoothness in each variable is immediate from these properties for each factor (from Lemma 3.11 and (3.3)). (a) is also immediate from (3.8). The first part of (b) is also clear from the above factorization and the upper bounds in Lemmas 3.11(a) and 3.12(a). The second part of (b) is then immediate from (a). (c) also follows from Lemmas 3.11(b) and 3.12(b) and a short calculation.
(d) is an exercise in differentiating through the integral. As we will be doing a lot of this in the future we risk boring the reader by outlining the proof here and refer to this argument for such manipulations hereafter. Let f be a bounded Borel function on S 0 and 0 ≤ n. Note first that the right-hand side of (2.14), I(x), (finite by (c)) is continuous in x.
To see this, choose a unit vector e j , set x ′ i = x i if i = j and x ′ j variable, and note that for h > 0, We have used (c) in the above. Let f N (y) = f (y)1 (|y|≤N) . By the integrability in (c), the left-hand side of (2.14) equals where the differentiation through the integral over a compact set is justified by the bounds in (b) and dominated convergence. The bound in (c) shows this convergence is uniformly bounded (in x). For definiteness assume n = 2 and D 2 x = D 2 x i x j for i = j. By the above convergence and dominated convergence we get Now differentiate both sides with respect x ′ j and then x ′ i and use the Fundamental Theorem of Calculus and the continuity of D 2 x p t (x, y)f (y)dµ(y), noted above, to obtain (2.14). This shows P t f ∈ C 2 b as continuity in x was established above. Finally (2.15) is now immediate from (2.13) and (2.14).
Proof. Let $K^*K(\cdot,\cdot)$ denote the integral kernel associated with the operator $K^*K$. The hypothesis implies that $K^*K$ satisfies the assumptions of Lemma 2.11. By Lemma 2.11 and the fact that $K^*K$ is symmetric, we obtain the required bound on $\|K\|^2=\|K^*K\|$.

4. Proofs of Propositions 2.14 and 2.10. By Lemma 3.16, Proposition 2.14 will follow from (2.28). We restate this latter inequality explicitly. Recall that $\tilde D_x$ is one of the first or second order partial differential operators listed in Notation 2.9. (2.28) then becomes the two bounds
$$\le c_{2.14}\,s^{-2-\eta}\,t^{-2+\eta}\qquad\text{for all }0<t\le s\le 2,\qquad (4.1)$$
$$\le c_{2.14}\,s^{-2-\eta}\,t^{-2+\eta}\qquad\text{for all }0<t\le s\le 2,\qquad (4.2)$$
We have stated these conditions for bounded times for other potential uses; in our case we will verify (2.28) for all $0<t\le s$. Recall also that our Convention 3.1 for constants applies to $c_{2.14}$ and $\eta$.
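Lemma 3.16 is presumably the following refinement of Lemma 2.11: if $K$ is an integral operator on $L^2(\mu)$ with kernel $K(x,y)$ and $\sup_x\int|K^*K(x,y)|\,\mu(dy)\le C$, then $\|K\|^2=\|K^*K\|\le C$; indeed $K^*K$ is symmetric, so Lemma 2.11 applies with $C_1=C_2=C$. On this reading, the kernel bounds (4.1) and (4.2) yield the operator bounds in (2.22).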
Until otherwise indicated, we continue to work in the setting of the last section and use the notation introduced there. In particular, Convention 3.1 will be in force and the differential operators in Notation 2.9 arẽ In [DP1] a number of bounds were obtained on the derivatives of the semigroup P t f ; (3.24) in the last section was one such bound. Propositions 16 and 17 of [DP1] state there exists a c 4.1 such that for allD x as in Notation 2.9, t > 0 and bounded Borel function f , sup Although these results are stated for m = 1 in [DP1], the same argument works in m + 1 dimensions (see, for example, Proposition 20 of [DP1]). As a simple consequence of this result we get: and Proof. Apply (4.4) and (2.14) to f (y) = sgn (D x p t (x, y)) to obtain (4.5), and to f (y) = sgn (D x m+1 q t (x m+1 , y m+1 )) to obtain (4.6).
One of the ingredients we will need is a bound like (4.5) but with the integral on the left with respect to x instead of y. For derivatives with respect to x i , i ≤ m, this is straightforward as we now show.
By differentiating through the integral in (3.7) we find for i ≤ m, and Integration through the integral is justified by the bounds in Lemma 3.2 and dominated convergence.
Lemma 4.2. For all t > 0, and i ≤ m, and sup z |D z p t (z, y)|µ(dy) ≤ c 4.2 t −1 . (4.11) Proof. (a) By (4.7) the integral in (a) is where in the first inequality we used the symmetry of r t (recall (3.5)) and in the last inequality we have used Lemma 3.2.
(b) Integrate the inequality in (a) to bound the integral in (b) by In the last line we used Lemma 3.3(e). (c) ForD z = D z i (4.10) follows from (b) upon taking p = 0. The other cases are similarly proved, now using (4.8) for the second order derivatives. ((4.11) is also immediate from Lemma 4.1.) Consider first (2.28) forD x = D x i or x m+1 D 2 x i for some i ≤ m.
Proposition 4.3. IfD x = D x i or x m+1 D 2 x i for some i ≤ m, then (2.28) holds with η = 1/2.
Proof. Consider (4.1) forD x = x m+1 D 2 x i . We may as well take i = 1. Assume 0 < t ≤ s and let J = D z p s (z, x)D z p t (z, y ′ )µ(dz) µ(dy ′ ). Then To evaluate J 1 do the dz 1 integral first and use (4.8) to see z m+1 D 2 z 1 p t (z, y ′ ) dz 1 = 0, and so J 1 = 0. (Lemma 3.2 handles integrability issues.) Let J ′ 2 and J ′′ 2 denote the contribution to the integral defining J 2 from {z 1 ≤ y ′ 1 } and {z 1 ≥ y ′ 1 }, respectively. Then by (4.8), Do the z 1 integral first in the above and if X = where the first inequality follows by an elementary calculation-consider |X| ≥ 1 and |X| < 1 separately, note that ∞ 0 (v 2 − 1)p 1 (v) dv = 0, and in the last case use Take the absolute value inside the remaining integrals, then integrate over dy ′ 2 ...dy ′ m , and use (4.9) to express the third order derivative. This and (4.12) lead to Do the trivial integral over dz 2 . . . dz m and then consider the dy ′ 1 dz ′ integral of the resulting integrand. If z ′′ = z ′ − x 1 + b 0 1 s and y ′′ Use this in the above bound on J ′ 2 and the symmetry of r s (z m+1 , x m+1 , ·) in (z m+1 , x m+1 ) (recall (3.5)), and conclude

Another application of Lemma 3.3(e,f) now shows
J ′ 2 ≤ ct −1/2 (x m+1 + s) 3/2 (x m+1 + s) −6/4 s −6/4 = ct −1/2 s −3/2 . Symmetry (switching z 1 and y ′ 1 in the integral defining J ′ 2 amounts to switching the sign of b 0 1 ) gives the same bound on J ′′ 2 and hence for J. This implies that the left-hand side of (4.1) is at most where (4.10) is used in the last line. This completes the proof of (4.1) with η = 1/2 forD x = x m+1 D 2 x 1 . The proof for D x 1 is similar and a bit easier. Finally, very similar arguments (the powers change a bit in the last part of the bound on J ′ 2 ) will verify (4.2) for these operators.
Recall the notationp t from Proposition 2.8.
Lemma 4.4. If D n x and D k y are nth and kth order partial differential operators in x and y, respectively, then for all t > 0, k, n ≤ 2, D n y D k x p t (x, y) exists, is bounded, is continuous in each variable separately, and equals D k ypt/2 (y, z)D n x p t/2 (x, z) µ(dz).
For n ≤ 1, D n y D k x p t (x, y) is jointly continuous.
Apply (2.14), with f (z) = D n x p t/2 (x, z) andp t/2 in place of p t , to differentiate with respect to y through the integral and derive the above identity. Uniform boundedness in (x, y), and continuity in each variable separately follows from the boundedness in Lemma 3.11(a), the L 1 -boundedness in Lemma 3.11(b) and dominated convergence. If n = 1 the uniform boundedness of the above derivative implies continuity in y uniformly in x and hence joint continuity.
We now turn to the verification of (2.28) and Proposition 2.10 forD x = D x m+1 or x m+1 D 2 x m+1 . The argument here seems to be much harder (at least in the second order case) and so we will reduce its proof to three technical bounds whose proofs are deferred to Section 7. Not surprisingly these proofs will rely on the representations in Lemmas 3.4 and 3.11 as well as the other explicit expressions obtained in Section 3 such as Lemmas 3.6 and 3.14.

Lemma 4.5. There is a $c_{4.5}$ such that the bounds (4.13)-(4.19) hold, and for all $j\le m$, (4.20) holds.

Lemma 4.6. There is a $c_{4.6}$ such that for all $0<t\le s$, (4.21) holds
and Lemma 4.7. There is a c 4.7 such that if 1 ≤ p ≤ 2, then for all t > 0, w > 0 and y (m) ∈ R m , (1 (y m+1 ≤w≤z m+1 ) + 1 (z m+1 ≤w≤y m+1 ) )z p m+1 |D 2 z m+1 p t (z, y)|µ(dz)µ m+1 (dy m+1 ) ≤ c 4.7 [1 (w≤γt) t p−2+b/γ + 1 (w>γt) t −1/2 w p−3/2+b/γ ]. (4.23) Assuming these results we now verify (2.28) and Proposition 2.10 forD x = D x m+1 or x m+1 D 2 x m+1 . The analogue of Proposition 2.10 is immediate. Proof. Define η as above and fix 0 < t ≤ s ≤ 2. We will verify (4.1) even with the absolute values taken inside all the integrals. By (4.14) with p = 0, In the last inequality we used η ≤ 1/2 (recall M 0 ≥ 1). Therefore the left side of (4.1) (even with absolute values inside all the integrals) is at most where we have used (4.13) with p = −η in the last line. Now apply (4.14) to the above integral in x and then (4.13) to the integral in z ′ , both with p = 0, and conclude that (4.26) is at most as required. The derivation of (4.2) (with absolute values inside in the integral) is almost the same. One starts with (4.13) with p = 0 to bound the integral in y ′ as above, and then uses (4.14) with p = −η to bound the resulting integral in z.
It remains to verify (2.28) forD x = x m+1 D 2 x m+1 . This is the hard part of the proof and we will not be able to take the absolute values inside the integrals in (2.28).
Lemma 4.10. For j = 1, 2, |D j z m+1 p t (z, y)|dz (m) < ∞ for all z m+1 > 0 and y ∈ S m , and for all z m+1 > 0 and y ∈ S m . (4.27) Proof. We give a proof as this differentiation is a bit delicate, and the result is used on a number of occasions. Set j = 2 as j = 1 is slightly easier. Fix y ∈ S m and t > 0. By Lemma 4.5(b,d), Note that F (z m+1 ) < ∞ for almost all z m+1 > 0 by (4.28). The Fundamental Theorem of Calculus and Lemma 3.11(a) imply that if z ′ m+1 > z m+1 > 0, then where dominated convergence and (4.28) are used to show the convergence to 0. This allows us to first conclude that (4.30) and in particular, F (z m+1 ) is finite for all z m+1 > 0, and also that F is continuous. The differentiation through the integral now proceeds as in the proof of Proposition 2.8(d) given in Section 3 (using the Fundamental Theorem of Calculus). The last equality follows from (3.7) and the definition of r t .
Lemma 4.11. There is a c 4.11 so that forD z ≡D z m+1 = D z m+1 or z m+1 D 2 z m+1 , all t > 0 and all y ′ ∈ S m , D z p t (z, y ′ )µ(dz) ≤ c 4.11 (t + y ′ m+1 ) −1 (4.31) Proof. Use (4.27) to see that Changing variables, we must show D z q t (z, y)z b/γ−1 dz ≤ c 4.10 (t + y) −1 . (4.32) The arguments are the same for either choice ofD z so let us takeD z = D z for which the algebra is slightly easier. Let w = y/γt and x = z/γt. By differentiating the power series (3.3) the left side of (4.32) is then where N (w) is a Poisson random variable with mean w, c 1 satisfies Convention 3.1, and we have used (n+b/γ −1) −1 ≤ cn −1 for all n ∈ N. An elementary calculation (e.g. Lemma 3.3 of [BP]) bounds the above by (4.32) follows.
Proof. Consider general 0 < η < 1 for now. (2.28) will follow from (4.33) and sup x D x p s (x, z)D y ′ p t (y ′ , z)µ(dz) µ(dy ′ ) ≤ c 4.11 s −1−η t −1+η for 0 < t ≤ s ≤ 2. (4.34) To see (4.1), multiply both sides of (4.33) by After taking a supremum over y, the resulting left-hand side is an upper bound for the lefthand side of (4.1). For the resulting right-hand side, use (4.24) to first bound the integral in x, uniformly in z, by c 4.8 s −1 and (4.25) to then bound the integral in z, uniformly in y by c 4.8 t −1 . This gives (4.1) and similar reasoning derives (4.2) from (4.34). Next use Lemma 4.11 to see that (4.35) In the next to last inequality we have used Lemma 4.5(b) with q = p = 0. Therefore the triangle inequality shows that (4.33) (with perhaps a different constant) will follow from (4.36) The analogous reduction for (4.34) is easier as (2.14) with f ≡ 1 implies Use this in place of (4.35) and again apply the triangle inequality to see that (4.34) will follow from (4.37) Having reduced our problem to (4.36) and (4.37), we consider (4.36) first and take η = 1/2 for the rest of the proof. The left-hand side of (4.36) is bounded by (4.38) Use the Fundamental Theorem of Calculus (recall Proposition 2.8 for the required regularity) to see that Now recall from (3.7) that p t (z, y) = p 0 t (z (m) − y (m) , z m+1 , y m+1 ). First do the µ(dy ′ ) µ m+1 (dz m+1 ) integrals and change variables to y ′′ = z (m) − y ′(m) in this integral to see that In the last line we have used Lemma 4.7 with p = 2. Now use Lemma 4.5(d) to bound the first term by ct −1/2 s −3/2 , and use (4.22) to bound the second term by cs −2 ≤ ct −1/2 s −3/2 . We have proved T a,1 ≤ c 1 t −1/2 s −3/2 . (4.40) Note that :=T a,3 + T a,4 . (4.41) By Lemma 4.10, and so using Lemma 3.3(g) we have where we have used Lemma 4.5(b) in the last line with p = q = 0. For T a,3 use the Fundamental Theorem of Calculus to write Now take the absolute values inside the integrals and summation, do the integral in r last, and for each r carry out the linear change of variables for the other (2m-dimensional) Lebesgue integrals: For the last inequality we have used Lemma 4.5(b) with q = 1 and p = 2. Now use Lemma 4.5(f) with p = 0 (for the t 3/2 term) and then with p = 3/2 (for the (y ′ m+1 ) 3/2 term) to conclude that T a,3 ≤ c[ts −3 + t −1/2 s −3/2 ] ≤ c 3 t −1/2 s −3/2 (4.44) Combining (4.40), (4.42) and (4.44) now gives (4.36) (with η = 1/2). The left side of (4.37) is at most The Fundamental Theorem of Calculus gives (Lemma 4.4 gives the required regularity) Re-express p t in terms of p 0 t and set y ′′ = y ′(m) − z (m) to conclude In the last line we used Lemma 4.7 with p = 1. Now use (4.18) with p = 1/2 to bound the first term by c 4.5 c 4.7 t −1/2 s −3/2 . By Lemma 4.4, the second term in (4.46) is at most where we have used (4.21) then (4.5). (We are applying (4.21) top s/2 .) We have shown For T b,2 , an argument similar to that leading to (4.43) bounds T b,2 above by In the last line we have used the identity p 0 t (u, y ′ m+1 , z m+1 ) = p t (0, y ′ m+1 , −u, z m+1 ) and then Lemma 4.5(c) with q = 1. Finally use (4.19) with p = 0 and p = 1/2 to bound the above by c(s −2 + t −1/2 s −3/2 ) ≤ ct −1/2 s −3/2 . Use this and (4.47) in (4.45) to complete the proof of (4.37).
Having obtained (2.28) and Proposition 2.10 for the special case N 2 null and Z ∩C = {d}, we now turn to the general case. In the rest of this section we work in the general setting of Propositions 2.2 and 2.14.
Proof of Proposition 2.14. We need to establish (2.28) (thanks to Lemma 3.16), and first do this for the special case when our transition density is q t = q b,γ t , that is Z ∩ C empty and N 2 a singleton. Let p t be the transition density considered above with m = 1. Recall from (3.7) that (4.48) LetD y 2 = D y 2 or y 2 D y 2 y 2 . We claim that we can differentiate through the above integrals and so D x 2 p t (x, z)dz 1 =D x 2 q t (x 2 , z 2 ) for almost all z 2 > 0 and all x, (4.49) and D x 2 p t (x, z)dx 1 =D x 2 q t (x 2 , z 2 ) for all x 2 > 0 and z. (4.50) Lemma 4.10 implies (4.50). The proof of (4.49) uses |D x 2 p t (x, z)|dz 1 < ∞ for a.a. z 2 > 0 by Lemma 3.11(b), and then proceeds using the Fundamental Theorem of Calculus as in the proof of Proposition 2.8(d) in Section 3. (The stronger version of (4.49) also holds but this result will suffice.) Consider first (4.2) for q t . Let 0 < t ≤ s ≤ 2. By (4.49) and (4.50), we have for all x 2 , y 2 > 0, D x 2 q s (x 2 , z 2 )D y 2 q t (y 2 , z 2 )µ 2 (dz 2 ) Similarly, for all x 2 > 0, (4.52) Integrability issues are handled by Lemma 4.10 and Proposition 4.8. The infimum in the second line can be omitted as the expression following does not depend on x 1 . Multiply (4.52) and (4.51) and integrate with respect to µ 2 (dx 2 ) to see that for any y 2 > 0, the last by Proposition 4.12. This gives (4.2) for q t . A similar argument works for (4.1). Next we consider (2.28) in the general case. Write x = ((x (i) ) i∈Z∩C , (x j ) j∈N 2 ) so that (from (2.5)) µĵ = i =j µ i , and let pĵ t (xĵ, yĵ) denote the above product of transition densities but with the jth factor (which may be a p j t or a q j t ) omitted. Consider (4.2) and letD x ≡D x (j) be one of the differential operators in Notation 2.9 acting on the variable j ′ ∈ {j} ∪ R j for some j ∈ Z ∩ C. (The case j ′ = j ∈ N 2 is considered below.) In this case (4.53) shows that the left-hand side of (4.2) equals Take the absolute values inside the two µĵ(dzĵ) integrals (giving an upper bound) and pull the pĵ terms out of the µ j (dz (j) ) integrals. Now we can integrate the pĵ integrals using (3.8) by first integrating over y ′ j , then the first zĵ integral, then the xĵ integral, and finally the second zĵ integral. This shows that the left-hand side of (4.2) is at most In the last line we used (4.2) for p j (i.e., Proposition 4.12). For j ′ = j ∈ N 2 we would use (4.2) for q t , which was established above. This completes the proof of (4.2) and the proof for (4.1) is similar.
Proof of Proposition 2.10. By Proposition 4.8 the required result holds for each p j t factor and then using (4.49) and (4.50), one easily verifies it for q t (as was done implicitly in the previous proof). The general case now follows easily from the product structure (4.53) and a short calculation which is much simpler than that given above.

5. Proof of Proposition 2.3.
Assume $P$ is a solution of $MP(\tilde A,\nu)$ where $d\nu=\rho\,d\mu$ is as in Proposition 2.3, and assume (2.7) throughout this section. Without loss of generality we may work on a probability space carrying a $d$-dimensional Brownian motion $B$ and realize $P$ as the law of a solution $X$ of the associated SDE (5.1). The approximating process $X^n$ is defined by freezing the coefficients along the grid $\{k/n\}$: for $k\ge 0$, on $[k/n,(k+1)/n]$ and conditional on $\mathcal{F}_{k/n}$, $X^n$ has generator $A^0$ but with $\gamma^0$, $b^0$ replaced by the (random) $\gamma^k\equiv\tilde\gamma(X_{(k-1)/n})$, $b^k\equiv\tilde b(X_{(k-1)/n})$. (5.2)
We see in particular that pathwise uniqueness of X n follows from the classical Yamada-Watanabe theorem. An easy stochastic calculus argument, using Burkholder's inequalities and the boundedness ofγ,b and X 0 , shows that E ((X n,i t ) p ) ≤ c p (1 + t p ) for all t ≥ 0 and p ∈ N. Here c p may depend on the aforementioned bounds and is independent of n (although we will not need the latter). By making only minor modifications in the proof of Lemma 5.1 in [ABBP] we have: Lemma 5.1. For any T > 0, sup t≤T X n t − X t → 0 in probability as n → ∞. For k ∈ Z + , let and let p k t (x, y) denote the (random) transition density with respect to µ k of the diffusion described in (5.2) operating on the interval [k/n, (k + 1)/n]. Proposition 2.8(b) with n = 0 implies that (1 + y 1/2 j ), for x, y ∈ S 0 , 0 < t ≤ 1, (5.4) where as usual c 5.1 may depend on M 0 , d but not on k. We are also using (2.7) here to bound b k i /γ k i . Let S n λ f = E ∞ 0 e −λt f (X n t ) dt and define S n λ = sup{|S n λ f | : f 2 ≤ 1}, where as usual the L 2 norm refers to the fixed measure µ.
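In this connection, the Yamada-Watanabe theorem being invoked is presumably the classical one-dimensional criterion: pathwise uniqueness holds for $dY_t=\beta_t\,dt+\sigma(Y_t)\,dW_t$ (with, say, a bounded adapted drift $\beta$ not depending on $Y$) whenever $|\sigma(y)-\sigma(y')|\le h(|y-y'|)$ for an increasing $h$ with $\int_{0+}h(u)^{-2}\,du=\infty$. The square-root diffusion coefficients appearing here satisfy this with $h(u)=\sqrt{2\gamma u}$, since $|\sqrt{2\gamma y}-\sqrt{2\gamma y'}|\le\sqrt{2\gamma|y-y'|}$ and $\int_{0+}(2\gamma u)^{-1}\,du=\infty$.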
Proof. It suffices to consider |S n λ f | for non-negative f ∈ L 2 (µ). Let A bit of algebra using (2.7) shows that Let E k x denote expectation starting at x with respect to the law of the diffusion with (random) transition density p k , and let R k λ and r k λ (x, y) denote the corresponding resolvent and resolvent density with respect to µ k . Then (5.2) shows that where I n k = r k λ (x, y)p k−1 1/n (X n (k−1)/n , x)µ k−1 (dx)f (y) and we have used Corollary 2.12(a) to see R 0 λ f 2 ≤ λ −1 f 2 . As usual we suppress dependence on d and M 0 in our constants (which may change from line to line) but will record dependence on n. Use (5.4) and (5.5) to see that LetÊ k x denote expectation with respect to the diffusion with transition kernelp k t (x, y) = p k t (y, x) (as in Proposition 2.8(a)) andr k λ (x, y) = r k λ (y, x) be the associated resolvent density with respect to µ k . Then (5.8) is bounded by In the next to last line we have used the conditional independence of {X i : i / ∈ N 1 } (recall (2.5)). In the last line we have used (5.6) and (2.7) to see that 1/2 + 2δ ≤ 1, and also used Lemma 3.3(d). We can apply this last result because for all i / ∈ N 1 and all k ∈ Z + , thanks to (5.6) and (2.7). For i / ∈ N 1 we have E k y (X i s ) ≤ y i + M 0 s, and by (2.7) and (5.6) we have 2dδ ≤ 1/12. Therefore if f (s) = s −2δ + (M 0 s) 1/2+2δ (clearly f ≥ 1) we may now bound (5.8) by In the definition of I n k use Hölder's inequality and then this bound on (5.8) on one of the resulting squared factors to see that In the next to last line we have again used the independence of X i , i / ∈ N 1 under E k x , and the bound 1/2 + 5δ ≤ 1 which follows from (5.6) and (2.7). In the last line we have again used Lemma 3.3(d) whose applicability can again be checked as in (5.9). An elementary calculation on the above bound, again using (5.6) and (2.7) to see that 3δd ≤ 1/8, now shows that and therefore by (5.10), Take expectations in the above and use some elementary inequalities to conclude that the last by (5.3). Put this bound into (5.7) to complete the proof.
Remark 5.3. The above argument is considerably longer than its counterpart (Lemma 5.3) in [ABBP]. This is in part due to the non-compact state space in the above leading to the unboundedness of the densities on this state space (recall the bound (5.4)). More significantly, it is also because the argument in [ABBP] is incomplete. In (5.4) of [ABBP] the norm on f actually depends on k and is not the norm on the canonical L 2 space. The argument above, however, will also give a correct proof of Lemma 5.3 of [ABBP]-in fact the compact state space there leads to considerable simplification.
Proof of Proposition 2.3. This now proceeds by making only minor changes in the proof of Proposition 2.3 in Section 5 of [ABBP]. One uses the above Lemmas 5.1 and 5.2 and Proposition 2.2. We only point out the (trivial) changes required. For $f\in C^2_b(S_0)$ one uses Itô's Lemma and (5.1) to obtain the semimartingale decomposition of $f(X^n_t)$. The local martingale part is a martingale, as in the proof of Theorem 2.1 in Section 2 (use (5.3)). Corollary 2.12(a) is used, instead of the eigenfunction expansion in [ABBP], to conclude that the constant coefficient resolvent $R_\lambda$ has norm at most $\lambda^{-1}$ as an operator on $L^2$. The rest of the proof proceeds as in [ABBP], where the bound $\varepsilon_0\le(2K(M_0))^{-1}$ is used to get the final bound, first on $S^n_\lambda(|f|)$ and then on $S_\lambda(|f|)$ by Fatou's lemma.
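For the reader's convenience, the $\lambda^{-1}$ bound drawn from Corollary 2.12(a) is of the following standard form (a hedged sketch, assuming only that the constant coefficient semigroup $P_t$ is a contraction on $L^2(\mu)$):
$$\|R_\lambda f\|_2=\Big\|\int_0^\infty e^{-\lambda t}P_tf\,dt\Big\|_2\le\int_0^\infty e^{-\lambda t}\|P_tf\|_2\,dt\le\Big(\int_0^\infty e^{-\lambda t}\,dt\Big)\|f\|_2=\lambda^{-1}\|f\|_2,$$
the first inequality by Minkowski's integral inequality.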

Proof of Proposition 2.4.
Let $(P_x,X_t)$, $x\in S_0$, be as in the statement of Proposition 2.4. Throughout this section, for any Borel set $A$ we let $T_A=T_A(X)=\inf\{t:X_t\in A\}$ and $\tau_A=\tau_A(X)=\inf\{t:X_t\notin A\}$ be the first entrance and exit times, respectively, and we let $|A|$ denote the Lebesgue measure of $A$. We say a function $h$ is harmonic in $D=B(x,r)\cap S$ if $h$ is bounded on $D$ and $h(X_{t\wedge\tau_D})$ is a right-continuous martingale with respect to $P_x$ for each $x$.
The key step in the proof of Proposition 2.4 is the following.
Proposition 6.1. Let $z\in S$. There exist positive constants $r$, $c_{6.1}$ and $\alpha$, depending on $z$, such that if $h$ is harmonic in $B(z,r)\cap S$, then

Proof. By relabeling the axes we may assume that

If $z$ is in the interior of $S_0$, the result is easy, because the generator is locally uniformly elliptic, and follows by the first paragraph of the proof of Theorem 6.4 of [ABBP]. So suppose $z\in\partial S_0$. Then $J_0<d$ and we may assume, again by reordering the axes, that there is a

Since our result only depends on the values of $h$ in $B(z,r)\cap S$, we may change the diffusion and drift coefficients of the generator of $X$ outside $B(z,r)\cap S$ as we please. By changing the coefficients in this way and again relabeling the axes if necessary, we may suppose that our generator is

where $J\le K$, $a(i)\in\{K+1,\dots,d\}$ for $i=J+1,\dots,K$, each $\sigma_i$ is continuous and bounded above and below by positive constants, each $b_i$ is continuous and bounded, and each $b_i$ for $i>K$ is bounded below by a positive constant. We have extended our coefficients to the possibly larger space $S_1=\{x\in\mathbb{R}^d:x_i\ge 0\text{ for all }i\ge K\}$, as this is the natural state space for $\tilde A$. As $B(z,r)\cap S_0=B(z,r)\cap S_1$ (by (6.2)) this will not affect the harmonic functions we are dealing with. For $0\le\delta<1$ let

Take $n\ge 1$ large enough so that $Q_n(0)\subset B(z,r/2)\cap S_1$. We will first show there exist $c_2,\delta>0$ independent of $n$ such that

Since the $b_i$ and $\sigma_i$ are bounded, there exists $t_0$ small such that for all $x\in Q_{n+1}(0)$

and

sup

The first and last bounds are trivial, and the second inequality is easily proved by first noting

and then using the Dubins–Schwarz Theorem and Markov's inequality. Hence

(6.5)

By Lemma 6.2 of [ABBP] there exists $\delta$ such that if $U$ is uniformly distributed on $[t_0/2,t_0]$, then

sup

Scaling shows that

Therefore by (6.5), for any $x\in Q_{n+1}(0)$, with $P_x$-probability at least $1/2$, $X_{U2^{-n}}\in R_{n+1}(\delta)$.
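Since the displays in the preceding proof are not reproduced, we note that the Dubins–Schwarz step referred to is of the following standard type (a hedged sketch; the constant $C$ standing for a bound on the relevant quadratic variation, coming from the bounds on the $\sigma_i$, is an assumption about the omitted displays). If $M$ is a continuous local martingale with $M_0=0$ and $\langle M\rangle_t\le Ct$, then $M_t=\beta_{\langle M\rangle_t}$ for a Brownian motion $\beta$ (possibly on an enlarged probability space), so for $\lambda>0$,
$$P\Big(\sup_{s\le t_0}|M_s|\ge\lambda\Big)\le P\Big(\sup_{u\le Ct_0}|\beta_u|\ge\lambda\Big)\le\lambda^{-2}E\Big(\sup_{u\le Ct_0}\beta_u^2\Big)\le\frac{4Ct_0}{\lambda^2},$$
by Markov's inequality and Doob's $L^2$ inequality; taking $t_0$ small makes the right-hand side as small as desired.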
Proof of Proposition 2.4. We can now proceed as in the proof of Theorem 6.4 of [ABBP]. To obtain the analogue of (6.14) in [ABBP], we note from (2.2) that if $x\in\partial S_0$, at least one coordinate can be bounded below by a squared Bessel process with positive drift starting at zero.
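The comparison process alluded to is presumably the following (we have not reproduced (2.2), so the exact normalization below is an assumption): a solution of
$$dZ_t=b\,dt+\sqrt{2\gamma Z_t}\,d\beta_t,\qquad Z_0=0,\quad b,\gamma>0,$$
is such that $2Z/\gamma$ is a squared Bessel process of dimension $2b/\gamma>0$; in particular $Z_t>0$ a.s. for each fixed $t>0$, which is the kind of boundary behaviour used for the analogue of (6.14) in [ABBP].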
Remark 6.2. Essentially the same argument shows that if, for each $x\in S$, $P_x$ is a solution of MP$(A,\delta_x)$ as in Theorem 1.4 (it will be Borel strong Markov by Theorem 1.4), then the resolvent $S_\lambda$ maps bounded Borel measurable functions to continuous functions. After localizing the problem, one is left with a generator of the same form as (6.3), and so the proof proceeds as above.
We work in the setting and with the notation of Sections 3 and 4. Recall, in particular, the Poisson random variables $N_\rho(t)$ from Lemma 3.4.
The required third order difference of $G^\delta_{t,z^{(m)},y}$ on $\{\nu^1_t>0,\nu^2_t>0,\nu^3_t=0\}$ is now a second order difference of $H^\delta_{t,z^{(m)},y}$. Minor modifications of the derivation of (7.21) lead to

The above bounds on $E^\delta_i$, $i=1,\dots,4$, may be used in (7.5), and after the terms involving $\sqrt{\delta/t}$ are neglected (for $q=0$ these terms are bounded by their neighbours, and for $q>0$, if they do not approach $0$, the right side below must be infinite) we find

The required bound follows from the above by a bit of algebra, but as the reader may be fatigued at this point we point out the way. Trivial considerations show it suffices to prove two inequalities, (7.24) and (7.25), for $n_0,n_{1/2}\in\mathbb{Z}_+$ and $z'\ge 0$.

(7.25)

(7.24) is easy. (7.25) reduces fairly directly to showing that for $n_{1/2}\ge 2$,

If $z'\le 1$ this is trivial, and for $z'>1$ consider $n_{1/2}\le z'/4$ and $n_{1/2}>z'/4$ separately. This completes the proof of (a).

(b) The proof of this second order version of (a) is very similar to, but simpler than, that of (a). One now only has a second order difference and three $E^\delta_i$ terms to consider. In fact we will not actually need $q>0$ in (a), but we included it so that the reader will not complain about the missing details in the proof of (b) (where $q>0$ has been used in Proposition 4.12). We do comment on the lack of $N_{1/2}$ in this bound.
An argument similar to that leading to (7.23) shows that $\int|z_j-y_j|^q\,|D^2_{z_{m+1}}p_t(z,y)|\,dz^{(m)}$ is bounded by

It is easy to check that

and, using $N_{1/2}\le N_0$ from (3.23), that

Hence to prove (b), it remains to verify

Trivial considerations reduce this to showing that $(1+z')^{-1+q/2}\le c\,\phi(z',X^{(m+1)}_t/t,N_0)$ for $N_0\ge 2$. This is easily verified by considering $N_0<z'/2$ and $N_0\ge z'/2$ separately.
(c) Note that (7.3) is the first order version of (a) and (b) with $q=0$. The proof is substantially simpler but, as it plays the pivotal role in the proof for the important two-dimensional case, we give it in Section 8. (7.4) then follows immediately since the spatial homogeneity in the first $m$ variables, (3.7), implies
$$p_t(z,y)=p_t\big((-y^{(m)},z_{m+1}),(-z^{(m)},y_{m+1})\big). \qquad(7.26)$$

Proof of Lemma 4.5(b). Let $J$ denote the integral to be bounded in the statement of (b), and let $p_n(w)=e^{-w}w^n/n!$ be the Poisson probabilities. Let $\Gamma_n$ be a Gamma random variable with density
$$g_n(x)=x^{n+b/\gamma-1}e^{-x}\,\Gamma(n+b/\gamma)^{-1}, \qquad(7.27)$$
and recall $z'=z_{m+1}/\gamma t$. By integrating the bound from Lemma 7.1(b) in $z_{m+1}$ (using Fatou's Lemma) we see that

Our formula for the joint distribution of $(X^{(m+1)}_t,N_0)$ (Lemma 3.6(a)) allows us to evaluate the above, and after changing variables and the order of integration we see that if $y'=y/\gamma t$, then
$$n\big((y')^{q/2}+\Gamma_n^{q/2}+n^{q/2}\big)\big)\,y^{b/\gamma-1}\,dy. \qquad(7.28)$$

There is a constant $c_0$ (as in Convention 3.1) so that
$$E(\Gamma_n^r)\le c_0(n\vee 1)^r \quad\text{for all }|r|\le 4\text{ and }n\in\mathbb{Z}_+\text{ satisfying } r+n\ge\frac{-3}{4M_0^2}. \qquad(7.29)$$
Indeed the above expectation is $\Gamma(n+b/\gamma+r)/\Gamma(n+b/\gamma)$, where $\Gamma$ denotes the Gamma function. The result now follows by an elementary, and easily proved, property of the Gamma function. Assume now the slightly stronger condition
$$|r|\le 3,\quad n\in\mathbb{Z}_+\quad\text{and}\quad r+n\ge\frac{-1}{2M_0^2}. \qquad(7.30)$$
Then $\Gamma_n=\Gamma_0+S_n$, where $S_n$ is a sum of $n$ i.i.d. mean one exponential random variables. If $s$ and $s'$ are Hölder dual exponents, where $s$ is taken close enough to $1$ so that the conditions of (7.29) remain valid with $rs$ in place of $r$, then

where we have used an elementary martingale estimate for $|S_n-n|$ and (7.29). Here $c$ again is as in Convention 3.1.

We now use (7.31) and (7.29) to bound the Gamma expectations in (7.28). It is easy to check that our bounds on $p$ and $q$ imply that the powers we will be bounding satisfy (7.30). This leads to
$$J\le ct^{q-2+p}\liminf_{\delta\to 0}\int q_\delta(y_{m+1},y)\sum_{n=0}^{\infty}p_n(y')\Big[1_{(n\le 1)}\big(1+(y')^{q/2}\big)+1_{(n\ge 2)}\,n^{-1+p}\big((y')^{q/2}+n^{q/2}\big)\Big]\,y^{b/\gamma-1}\,dy$$
$$\le ct^{q-2+p}\liminf_{\delta\to 0}\int q_\delta(y_{m+1},y)\Big[e^{-y'}(1+y')\big(1+(y')^{q/2}\big)+E\big(N(y')^{q/2-1+p}1_{(N(y')\ge 2)}\big)+(y')^{q/2}E\big(N(y')^{p-1}1_{(N(y')\ge 2)}\big)\Big]\,y^{b/\gamma-1}\,dy.$$
In the last line $N(y')$ is a Poisson random variable with mean $y'$. Well-known properties of the Poisson distribution show that for a universal constant $c_2$,
$$E\big(N(y')^r1_{(N(y')\ge 2)}\big)\le h_r(y')\equiv c_2(1+y')^r \quad\text{for all } y'\ge 0,\ |r|\le 2. \qquad(7.33)$$
For negative values of $r$ see Lemma 4.3(a) of [BP], where the constant depends on $r$, but the argument there easily shows that for bounded $r$ one gets a uniform constant. If
$$h(y')=e^{-y'}(1+y')\big(1+(y')^{q/2}\big)+h_{q/2-1+p}(y')+(y')^{q/2}h_{p-1}(y'),$$
then clearly $h(y')\le c_3(1+y')^{q/2-1+p}$.
This shows we can again use the upper bound in Lemma 7.1(b) to bound the integral over $y^{(m)}$ in the above. One then must integrate the resulting bound in $y_{m+1}$ instead of $z_{m+1}$. This actually greatly simplifies the calculation just given, as one can integrate $y_{m+1}$ at the beginning and hence the $q_\delta$ term conveniently disappears (see the proof of (4.14) below). For example, if we neglect the insignificant $n\le 1$ contribution to $\phi$ in Lemma 7.1, the resulting integral is bounded by
$$ct^{q-1}(z')^{-1}E\Big(1_{(N_0\ge 2)}\big(N_0+(N_0-z')^2\big)\big((z')^{q/2}+N_0^{q/2}+(X^{(m+1)}_t/t)^{q/2}\big)\Big).$$
This can be bounded using elementary estimates for the Poisson distribution and Hölder's inequality, the latter being much simpler than invoking Lemma 3.6. We omit the details.
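The elementary Poisson estimates being used here, and in (7.33) above, are of the following type (a hedged example covering only the nonnegative range of exponents; the negative range is the content of Lemma 4.3(a) of [BP]). If $N(y')$ is Poisson with mean $y'$ and $0\le r\le 2$, then by Jensen's inequality,
$$E\big(N(y')^r\big)\le\big(E\,N(y')^2\big)^{r/2}=\big(y'+(y')^2\big)^{r/2}\le(1+y')^r,$$
since $y'+(y')^2=y'(1+y')\le(1+y')^2$.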
Proof of Lemma 4.5(a). (4.13) is the first order version of Lemma 4.5(b) and we omit its proof, which is much simpler. (4.14) is a bit different from (c). Integrate (7.4) over $y_{m+1}$ to see that
$$\int y_{m+1}^p\,|D_{z_{m+1}}p_t(z,y)|\,\mu(dy)\le c_{7.1}t^{-1}\liminf_{\delta\to 0}\int y_{m+1}^p\,E_{z_{m+1}}\Big(q_\delta\big(y_{m+1},X^{(m+1)}_t\big)\cdots$$

Now use the moment bounds in Lemma 3.3(d,e) to bound the above by

The first term is trivially bounded by the required expression, using Lemma 3.3 again. Using the joint density formula (Lemma 3.6), the Gamma power bounds (7.29), and arguing as in the proof of (b) above, the term in (7.34) involving $N_0$ is at most

(7.35)

where $N=N(z')$ is a Poisson random variable with mean $z'$. We have
$$E\big(|N-z'|\,N^p\,1_{(N>0)}\big)\le c_0\big(z'\wedge(z')^{1/2+p}\big)\quad\text{for all }z'>0\text{ and }-1\le p\le 1/2. \qquad(7.36)$$
For $p\le 0$, Lemma 3.3 of [BP] shows this (the uniformity for bounded $p$ is again clear). For $1/2\ge p>0$ use Cauchy–Schwarz to prove (7.36). Separating out the contribution from $N=0$, we see from (7.36) that (7.35) is at most

The result follows.
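For completeness, here is one way the Cauchy–Schwarz case $0<p\le 1/2$ of (7.36) can be carried out (a hedged sketch, not taken from the original argument). With $N=N(z')$, Cauchy–Schwarz and Jensen's inequality (the latter using $2p\le 1$) give
$$E\big(|N-z'|\,N^p\big)\le E\big((N-z')^2\big)^{1/2}E\big(N^{2p}\big)^{1/2}\le(z')^{1/2}\,(z')^{p}=(z')^{1/2+p},$$
while for $z'\le 1$, using $N^p\le N$ on $\{N\ge 1\}$,
$$E\big(|N-z'|\,N^p\,1_{(N>0)}\big)\le E\big((N+z')N\big)=z'+2(z')^2\le 3z'.$$
Since $(z')^{1/2+p}\le z'$ for $z'\ge 1$ and $p\le 1/2$, the two bounds combine to give $c_0\big(z'\wedge(z')^{1/2+p}\big)$.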
Some simple Gamma distribution calculations like those in the proof of (b), which the reader can easily provide (recall Convention 3.1), show that the above is bounded by a constant depending only on $M_0$. As before, using this bound in (7.41) and integrating out $n$ and $y$, we arrive at
$$J_2(\delta)\le c_2 s^{3/2}. \qquad(7.42)$$
Insert the above bounds on $J_i(\delta)$ into (7.37) to complete the proof.
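The Gamma calculations referred to, like the elementary Gamma-function property behind (7.29), come down to bounds of the following kind (a hedged illustration for integer exponents; the general exponents and the precise constants of Convention 3.1 require only minor extra work). For an integer $r\ge 0$, the functional equation $\Gamma(x+1)=x\Gamma(x)$ gives
$$E(\Gamma_n^r)=\frac{\Gamma(n+b/\gamma+r)}{\Gamma(n+b/\gamma)}=\prod_{j=0}^{r-1}\big(n+b/\gamma+j\big)\le\big(n+b/\gamma+r\big)^r\le c\,(n\vee 1)^r,$$
with $c$ depending only on $r$ and an upper bound for $b/\gamma$; for negative $r$ the ratio is the reciprocal of such a product, and the restriction on $r+n$ in (7.29) keeps the factors bounded away from $0$.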
Put the bounds on $K_i(\delta)$ into (7.43) to complete the proof of (4.22).
We omit the proof of (4.21), which is the first order analogue of (4.22) and is considerably easier.
We need a probability estimate for Lemma 4.7. As usual, $X^{(m+1)}$ is the Feller branching diffusion with generator (3.1).

A Remark on the Two-dimensional Case.
As has already been noted, the proof of Proposition 2.2 (by far the most challenging step) simplifies substantially if $d=2$. As this is the case required in [DGHSS], we now describe this simplification in a bit more detail.
Recall the three cases (i)–(iii) for $d=2$ listed following Theorem 1.4. As noted there, the case $E=\emptyset$ is covered by Theorem A of [BP] (with $d=2$) without removing $(0,0)$ from the state space, so we will focus mainly on the other two cases here (but see the last paragraph below). In these cases the localization in Theorem 2.1 reduces the problem to the study of the martingale problem for a perturbation of the constant coefficient operator with resolvent $R_\lambda$ and semigroup $P_t$. Our job is to establish Proposition 2.2 for this resolvent.
For $f\in D_0$, we have

The required result follows.
If we wanted to include the case $E=\emptyset$ to make the above "short proof" self-contained, then we need to consider Proposition 2.2, and hence (4.1) and (4.2), for the case

The associated semigroup $P_t=\prod_{j=1}^{2}Q^j_t$ is a product of one-dimensional Feller branching (with immigration) semigroups with transition densities given by (3.3). As in the last part of the proof of Proposition 2.14 at the end of Section 4, (4.1) and (4.2) reduce easily to checking (4.1) and (4.2) for each one-dimensional $Q^i_t$. In the first part of the proof of Proposition 2.14 (in Section 4) we saw that these followed easily for each differential operator by projecting down the corresponding result for $A_0$ (as in (8.1)) to the second coordinate. This was checked in the "short" proof above for the first order operators. It therefore only remains to check (4.1) and (4.2) for the second order operator $xD^2_x$, with $q_t$ in place of $p_t$. As in the proof of Proposition 4.12, we must verify (4.33), (4.34), (4.24), and (4.25) for this operator and one-dimensional density. These, however, can be done by direct calculation using the series expansion (3.3); the arguments are much simpler and involve direct summation by parts with Poisson probabilities and elementary Poisson bounds.
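For orientation we recall what the series expansion (3.3) presumably looks like; the display below is the standard Poisson–Gamma mixture for the one-dimensional Feller branching immigration diffusion $dX_t=b\,dt+\sqrt{2\gamma X_t}\,dB_t$, written so as to be consistent with the Poisson weights $p_n$ and the Gamma densities in (7.27) above, and is stated here as an assumption rather than a quotation of (3.3):
$$q_t(x,y)=\sum_{n=0}^{\infty}p_n\Big(\frac{x}{\gamma t}\Big)\frac{1}{\gamma t}\,g_n\Big(\frac{y}{\gamma t}\Big)=\sum_{n=0}^{\infty}e^{-x/\gamma t}\frac{(x/\gamma t)^n}{n!}\;\frac{(y/\gamma t)^{n+b/\gamma-1}e^{-y/\gamma t}}{\gamma t\,\Gamma(n+b/\gamma)},\qquad x,y>0.$$
Differentiating such a series in $x$ term by term produces differences of consecutive Poisson weights, which is where the summation by parts with Poisson probabilities mentioned above enters.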