POISSON POINT PROCESS LIMITS IN SIZE-BIASED GALTON-WATSON TREES

Let $\hat T_t$ be a critical binary continuous-time Galton-Watson tree size-biased according to the number of particles at time $t$. Decompose the population at $t$ according to the particles' degree of relationship with a distinguished particle picked purely at random from those alive at $t$. Keeping track of the times when the different families grow out of the distinguished line of descent and the related family sizes at $t$, we represent this relationship structure as a point process in a time-size plane. We study limits of these point processes in the single- and some multitype case.


Introduction
In this paper we introduce a point process representation of the genealogy of a random tree which we use to study asymptotics for a certain classical model of random trees. To explain the point process representation consider a branching population with one particle being distinguished. Say that at a fixed time t particles belong to the same family if they have the same most recent common ancestor with the distinguished particle. This relationship structure can be represented by a point process on (0, t] × {1, 2, ...} where a point has as its coordinates the minimal (backwards) time that family members have a common ancestor with the distinguished particle and the family size at t. Here we study limits of such point processes representing a critical single- or multitype Galton-Watson tree size-biased according to the number of particles at time t, where the distinguished particle is picked purely at random from the population at t.
In the single-type case we show that the Poisson point process Ψ t representing the family structure of the size-biased Galton-Watson tree has a weak limit Ψ, say. The intensity of the Poisson point process Ψ reflects the asymptotic behavior of critical Galton-Watson branching processes and explains the Lévy-Khinchin representation of the limiting gamma distribution for the rescaled population size at t.
We then discuss the case where the particles may have different types and mutate at a fixed positive rate µ to a new type not present so far. Let Ψ µ t be the point process representing the subpopulation with the type of the distinguished particle at time t. Our main result states that the rescaled point process Ψ µ t converges in distribution as first t → ∞ and then µ → 0, and identifies the limiting point process as a random portion R U Ψ of the weak limit Ψ from the single-type case. Here the relative size U is uniform on (0, 1), independent of Ψ, and R U denotes the restriction to the time period (0, U]. This result and a scaling property of Ψ are applied to obtain an exponential limit law for the size of the distinguished type in equilibrium. In Section 2 we first review the decomposition of the size-biased Galton-Watson tree along the distinguished line of descent of a particle chosen purely at random and give a precise description of its point process representation [formula (2.7)]. We then derive the limit laws for the single-type case. The multitype case is contained in Section 3.
Size-biased Galton-Watson trees arise as so-called Palm trees in branching particle systems [4,13] and as limits of the pedigree of a randomly selected particle in a large population [11,12]. They are closely related to Galton-Watson trees conditioned on non-extinction (see the remark at the end of Section 2) or on large total population size. The genealogy of critical Galton-Watson trees conditioned on nonextinction has been studied in [6,8] and later in the super process context, see e.g. [15]. Galton-Watson trees conditioned on large total population size have been shown to converge to some limiting continuum tree [1]. For spinal decompositions and related constructions of random family trees, see e.g. [2,9,16].

The size-biased Galton-Watson tree
Consider a critical binary continuous-time Galton-Watson branching process $(Z_t)_{t \ge 0}$ starting with a single founding ancestor. In such a process particles have independent exponential lifetimes with rate one, say, and produce either two or no offspring with probability $p_0 = p_2 = 1/2$, the lifetimes and offspring numbers being independent (for background on Galton-Watson processes, see e.g. [3]). Let $T$ be the random family tree obtained by having one edge for each particle produced, with the length of an edge being the particle's lifetime. We think of $T$ as a rooted planar tree with the distinguishable offspring of each particle ordered left and right. $T$ is identified by its shape $[T]$ and the collection of the particles' lifetimes $L_i$, $1 \le i \le n(T)$. Here $n(T)$ is the total progeny, i.e. the number of particles ever produced (including the root), and edges are labeled in some fixed deterministic manner, e.g. according to a depth-first search procedure (for a formal description of the space of trees we refer to [2,17]). By the shape $[T]$ of a tree $T$ we mean the tree obtained from $T$ by setting all edges to constant length. See Figure 1 for an example of a realization of $T$.

For a rooted binary tree $\mathbf t$ with edge lengths $\ell_i$, $1 \le i \le n(\mathbf t)$, let $\{T \in d\mathbf t\}$ be the event that $T$ has shape $[\mathbf t]$ and edge lengths in $[\ell_i, \ell_i + d\ell_i)$. Since the offspring numbers and exponential lifetimes are assumed to be i.i.d., we have
$$P(T \in d\mathbf t) = \prod_{i=1}^{n(\mathbf t)} \tfrac12\, e^{-\ell_i}\, d\ell_i. \tag{2.1}$$
Note that the distribution of $T$ depends on $[T]$ only through $n(T)$, that is, any shape with fixed total progeny is equally likely. By the assumed criticality, $T$ is almost surely finite, which in particular means that the weights in (2.1) sum to 1.
The law of the size-biased critical binary continuous-time Galton-Watson tree $\hat T_t$ is defined as
$$P(\hat T_t \in d\mathbf t) = \frac{z_t(\mathbf t)\, P(T \in d\mathbf t)}{E z_t(T)}, \tag{2.2}$$
where $z_t(\mathbf t)$ is the number of edges at height $t$ of the tree $\mathbf t$ (note that, by the assumed criticality, the normalizing constant $E z_t(T)$ on the right-hand side of (2.2) equals 1, hence $\hat T_t$ is really distributed according to a probability law). Relation (2.2) says that a tree with population size $k$ at time $t$ is chosen $k$ times as likely as if sampling were according to Galton-Watson measure. In particular, if $\hat Z_t = z_t(\hat T_t)$ then $\hat Z_t$ has the size-biased distribution of $Z_t = z_t(T)$, that is,
$$P(\hat Z_t = k) = k\, P(Z_t = k), \quad k \ge 1. \tag{2.3}$$
The size-biased Galton-Watson tree $\hat T_t$ is a finite random tree of minimum height $t$. The tree is no longer symmetric about rehanging subtrees, which means that particles no longer evolve independently of each other nor homogeneously in time. However, as can be seen from the following explicit construction, the size-biased tree $\hat T_t$ has a transparent structure if decomposed along the line of descent of a particle $V$, say, chosen purely at random from the population at time $t$.

We construct a random binary continuous-time tree $T^*_t$ and a random edge $V^*$ of $T^*_t$ at height $t$ such that the pair $(T^*_t, V^*)$ has distribution $\mathcal L(\hat T_t, V)$. We start with the construction of the line of descent of the distinguished particle $V^*$ alive at $t$. Take the points $T_1 < T_2 < \ldots$ of a homogeneous rate one Poisson process on $(0, t]$ to be the birth times of $V^*$ and its ascendants (other than the founding ancestor) and let each particle in $V^*$'s line of descent (including $V^*$, excluding the founding ancestor) independently have probability $1/2$ of being the sibling to the left. Otherwise the particle is positioned to the right. Having constructed $V^*$'s line of descent we have to determine the distribution of the descendant tree of $V^*$ and of the descendant trees of the siblings of the particles in the distinguished line of descent: let the siblings of the particles in the distinguished line of descent found ordinary critical binary Galton-Watson trees and attach another independent such tree at the top (i.e. at height $t$) of the distinguished line of descent. Call the resulting tree $T^*_t$. See Figure 2 for an illustration of $(T^*_t, V^*)$. Note that particles not in the distinguished line of descent evolve as in an ordinary critical binary Galton-Watson branching process and so does the distinguished particle $V^*$ after time $t$.

We remark that $V^*$ cannot be recovered from $T^*_t$ unless $z_t(T^*_t) = 1$. In fact, the distinguished edge $V^*$ is distributed uniformly on the edges of $T^*_t$ at height $t$,
$$P(V^* = e \mid T^*_t = \mathbf t) = \frac{1}{z_t(\mathbf t)}, \tag{2.4}$$
where $e$ is any edge at height $t$ of the finite tree $\mathbf t$ with minimum height $t$. The marginal distribution of the pair $(T^*_t, V^*)$ on the space of trees is the size-biased distribution (2.2),
$$P(T^*_t \in d\mathbf t) = z_t(\mathbf t)\, P(T \in d\mathbf t). \tag{2.5}$$
To prove (2.4) and (2.5), fix $\mathbf t$ and $e$, and assume for convenience that $e$'s ascendants are labeled as $e_1$ = founding ancestor, $e_2, \ldots, e_j = e$, that $\ell_1, \ldots, \ell_j$ are the corresponding lifetimes, and that the remaining particles carry the labels $j+1, \ldots, n(\mathbf t)$. Recall that the infinitesimal probability that a rate one Poisson process on $(0, t]$ has exactly $j - 1$ points, located at $\ell_1,\ \ell_1 + \ell_2,\ \ldots,\ \ell_1 + \cdots + \ell_{j-1}$, is $e^{-t}\, d\ell_1 \cdots d\ell_{j-1}$. Also, note that each of the $2^{j-1}$ possible left/right arrangements of the particles in the distinguished line of descent is equally likely. Since the subtrees founded by particles off the distinguished line of descent have distribution (2.1) and the remaining lifetime of $e$ after time $t$ is $\sum_{i=1}^{j} \ell_i - t$, we see from the construction of $(T^*_t, V^*)$ that
$$P(T^*_t \in d\mathbf t,\ V^* = e) = e^{-t} \prod_{i=1}^{j-1} d\ell_i \cdot 2^{-(j-1)} \cdot \tfrac12\, e^{-(\sum_{i=1}^{j} \ell_i - t)}\, d\ell_j \cdot \prod_{i=j+1}^{n(\mathbf t)} \tfrac12\, e^{-\ell_i}\, d\ell_i = P(T \in d\mathbf t). \tag{2.6}$$
Summing over all $z_t(\mathbf t)$ edges of $\mathbf t$ at height $t$ proves (2.5). Relation (2.4) follows from the fact that the right-hand side of (2.6) does not depend on $e$.

Remark. The construction above is a special case of the construction of the conditioned family tree of a branching particle system described in [4] and is easily extended to general offspring distributions with finite mean. It is intimately related to the (infinite) size-biased discrete-time Galton-Watson tree constructed in [16].
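The construction along the distinguished line of descent can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the left/right positions are ignored because they do not affect the population size, and each subtree size is drawn from the standard geometric form of the critical linear birth and death law, $P(Z_s > 0) = (1 + s/2)^{-1}$ with a geometric size given survival, rather than by simulating the tree itself.

```python
import random

def sample_z(s, rng):
    """Sample Z_s for the critical binary process started from one particle.

    Assumes P(Z_s > 0) = q and, given survival, a geometric law on {1, 2, ...}
    with success probability q, where q = 1 / (1 + s / 2).
    """
    q = 1.0 / (1.0 + s / 2.0)
    if rng.random() >= q:
        return 0                      # the subtree is extinct by time s
    k = 1
    while rng.random() >= q:
        k += 1
    return k

def sample_size_biased_population(t, rng):
    """Population size at time t in the spine construction (T*_t, V*)."""
    size = 1                          # the distinguished particle V* itself
    birth = rng.expovariate(1.0)      # first point of the rate one Poisson process
    while birth <= t:
        # The sibling born at this time founds an ordinary critical tree,
        # observed after the remaining time t - birth.
        size += sample_z(t - birth, rng)
        birth += rng.expovariate(1.0)
    return size

def mean_size(t, n, seed=0):
    """Monte Carlo estimate of the expected population size at time t."""
    rng = random.Random(seed)
    return sum(sample_size_biased_population(t, rng) for _ in range(n)) / n
```

Since $E Z_s = 1$ for all $s$, the expected size under this construction is $1 + t$, so `mean_size(3.0, 40000)` should come out close to 4.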
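The following statement is evident from the construction of $(T^*_t, V^*)$ and the equality in law of $(\hat T_t, V)$ and $(T^*_t, V^*)$.

Lemma 2.1. For $0 < s \le t$ let $Z_t(s)$ be the number of particles at time $t$, other than $V$, which have distance $s$ from $V$, i.e. whose most recent common ancestor with $V$ died at time $t - s$. Then
$$\Psi_t = \{(s, Z_t(s)) : 0 < s \le t,\ Z_t(s) > 0\} \tag{2.7}$$
is a simple Poisson point process on $(0, t] \times \mathbb N$ with intensity measure $\nu_t(ds \times \{k\}) = ds\, P(Z_s = k)$.

Let $R_s$, $s > 0$, denote the restriction operator,
$$R_s A = A \cap \big((0, s] \times \mathbb R_+\big). \tag{2.9}$$
Since the $\Psi_t$ satisfy the consistency condition
$$R_s \Psi_t \overset{d}{=} \Psi_s, \quad 0 < s \le t,$$
all $\Psi_t$, $t > 0$, can be constructed from one single Poisson point process $\Psi_\infty$ on $\mathbb R_+ \times \mathbb N$ with intensity measure $\nu_\infty(ds \times \{k\}) = ds\, P(Z_s = k)$.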
A suitable rescaling of $\Psi_t$ is to speed up time by $t$ and assign mass $t^{-1}$ to each particle. The asymptotic behavior of the rescaled point process is described by the following proposition. Here, $\xrightarrow{d}$ denotes convergence in distribution, which is just weak convergence of the joint distributions of $(t^{-1}\Psi_t(B_i),\ 1 \le i \le n)$ for every finite family of relatively compact Borel subsets $B_i$ of $(0, 1] \times \mathbb R_+$.

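Proposition 2.2. As $t \to \infty$,
$$t^{-1}\Psi_t \xrightarrow{d} \Psi,$$
where $\Psi$ is a Poisson point process on $(0, 1] \times \mathbb R_+$ with intensity measure $\nu(ds \times dz) = 4 s^{-2} e^{-2z/s}\, ds\, dz$.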
Proof. Regard $t^{-1}\Psi_t$ as a point process on $(0, 1] \times \mathbb R_+$ rather than on $(0, 1] \times t^{-1}\mathbb N$. By the mapping theorem for Poisson point processes (see e.g. [14], p. 18), $t^{-1}\Psi_t$ is a simple Poisson point process with intensity measure $\tilde\nu_t = \nu_t(t\,\cdot\,)$. Hence, it is sufficient to show $\tilde\nu_t(B) \to \nu(B)$ for every relatively compact Borel subset $B$ of $(0, 1] \times \mathbb R_+$. The critical binary Galton-Watson process $(Z_s)_{s \ge 0}$ is a linear growth birth and death process whose distribution is explicitly known (see e.g. [7], p. 480). For any $s \ge 0$,
$$P(Z_s = k) = \frac{(s/2)^{k-1}}{(1 + s/2)^{k+1}}, \quad k \ge 1. \tag{2.12}$$
Consequently, for any rectangle $C = [u_1, u_2] \times [z_1, z_2] \subset (0, 1] \times \mathbb R_+$,
$$\tilde\nu_t(C) = \int_{u_1}^{u_2} t\, P(t z_1 \le Z_{ts} \le t z_2)\, ds \longrightarrow \int_{u_1}^{u_2} \int_{z_1}^{z_2} 4 s^{-2} e^{-2z/s}\, dz\, ds = \nu(C) \quad \text{as } t \to \infty. \tag{2.13}$$
Since the rectangles $C$ generate the σ-algebra of Borel subsets of $(0, 1] \times \mathbb R_+$, we in particular have $\tilde\nu_t(B) \to \nu(B)$ for every relatively compact Borel subset $B$ of $(0, 1] \times \mathbb R_+$. This completes our proof of Proposition 2.2.

Remark. Note that to obtain (2.13) we have only used the asymptotic decay of the non-extinction probability $P(Z_t > 0)$ and the exponential limit law of $Z_t$ conditioned on $Z_t > 0$ following from (2.12). This asymptotic behavior holds for any critical Galton-Watson branching process with finite variance and is commonly referred to as Kolmogorov's asymptotic and Yaglom's exponential limit law (see e.g. Section I.9 in [3]).
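The convergence of the rescaled intensities on rectangles can be checked numerically. The sketch below rests on two assumptions that should be compared against the formulas of this section: the geometric form $P(Z_s = k) = q^2 (1-q)^{k-1}$ with $q = (1 + s/2)^{-1}$ for the critical linear birth and death process, and the limiting density $4 s^{-2} e^{-2z/s}$. The function names and the midpoint rule are illustrative choices.

```python
import math

def prob_range(r, a, b):
    """P(a <= Z_r <= b) for integers 1 <= a <= b, summing q^2 (1-q)^(k-1)."""
    if a > b:
        return 0.0
    q = 1.0 / (1.0 + r / 2.0)
    return q * ((1.0 - q) ** (a - 1) - (1.0 - q) ** b)

def rescaled_intensity(t, u1, u2, z1, z2, steps=2000):
    """Midpoint-rule value of the rescaled intensity of [u1,u2] x [z1,z2]."""
    a = max(1, math.ceil(t * z1))     # smallest family size with k / t >= z1
    b = math.floor(t * z2)            # largest family size with k / t <= z2
    h = (u2 - u1) / steps
    total = 0.0
    for i in range(steps):
        s = u1 + (i + 0.5) * h
        total += t * prob_range(t * s, a, b) * h
    return total

def limit_intensity(u1, u2, z1, z2, steps=2000):
    """Midpoint-rule value of the assumed limit: density 4 s^-2 exp(-2z/s)."""
    h = (u2 - u1) / steps
    total = 0.0
    for i in range(steps):
        s = u1 + (i + 0.5) * h
        # the inner integral over z in [z1, z2] is available in closed form
        total += (2.0 / s) * (math.exp(-2.0 * z1 / s) - math.exp(-2.0 * z2 / s)) * h
    return total
```

Comparing `rescaled_intensity(10000.0, 0.5, 1.0, 0.2, 1.0)` with `limit_intensity(0.5, 1.0, 0.2, 1.0)` gives a quick sanity check for one rectangle; it is of course no substitute for the proof.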
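The limiting point process $\Psi$ from Proposition 2.2 satisfies the following scaling property, which we state for further reference.

Lemma 2.3. For every $0 < u \le 1$,
$$R_u \Psi \overset{d}{=} u\Psi, \quad \text{where } u\Psi = \{(us, uz) : (s, z) \in \Psi\}.$$

The following result describes the asymptotic behavior of $\hat Z_t$ as $t \to \infty$.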

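Proposition 2.4. The number of edges at height $t$ in the size-biased Galton-Watson tree $\hat T_t$ has a gamma limit law with shape parameter 2,
$$\lim_{t \to \infty} P(t^{-1} \hat Z_t \le x) = \int_0^x 4 z e^{-2z}\, dz, \quad x \ge 0. \tag{2.15}$$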
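Proof. Let $g(A) = \sum_{(s,z) \in A} z$ for a countable set $A \subset \mathbb R_+ \times \mathbb R_+$. Note that $g(\Psi_t) = \hat Z_t - 1$ is the total number of particles at time $t$ other than the distinguished particle itself. We first show that $g(\Psi)$ has distribution as in (2.15). Write $p(A)$ for the projection of $A$ onto its mass coordinate,
$$p(A) = \{z : (s, z) \in A\}.$$
By the mapping theorem $p(\Psi)$ is a simple Poisson point process on $\mathbb R_+$ with intensity measure $\nu_p = \nu((0, 1] \times \cdot\,)$. Using a change of variables $u = s^{-1}$ we obtain
$$\nu_p(dz) = \int_0^1 4 s^{-2} e^{-2z/s}\, ds\, dz = \int_1^\infty 4 e^{-2zu}\, du\, dz = 2 z^{-1} e^{-2z}\, dz.$$
It is well-known and easily verified that $\nu_\alpha(dz) = z^{-1} e^{-\alpha z}\, dz$ is the Lévy measure of the gamma process with scale parameter $\alpha$ (see e.g. [18], p. 80). Consequently, $g(\Psi) = \sum_{z \in p(\Psi)} z$ has the gamma distribution as in (2.15) with scale and shape parameter 2, respectively.

To establish the weak convergence
$$t^{-1} g(\Psi_t) \xrightarrow{d} g(\Psi) \quad \text{as } t \to \infty, \tag{2.17}$$
note that $g$ is not continuous, so the total mass carried by families of small rescaled size has to be controlled. For $\varepsilon > 0$, the explicit distribution of $Z_s$ yields
$$E \sum_{(s,z) \in t^{-1}\Psi_t,\ z \le \varepsilon} z = \int_0^1 E\big[Z_{ts};\ Z_{ts} \le \varepsilon t\big]\, ds \longrightarrow \int_0^\varepsilon z\, \nu_p(dz) = 1 - e^{-2\varepsilon} \quad \text{as } t \to \infty.$$
Hence,
$$\lim_{\varepsilon \to 0} \lim_{t \to \infty} E \sum_{(s,z) \in t^{-1}\Psi_t,\ z \le \varepsilon} z = 0.$$
Assertion (2.17) now follows from Proposition 2.2 and the fact that $t^{-1} g(\Psi_t) = g(t^{-1}\Psi_t)$. This completes our proof of Proposition 2.4.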
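The identification of the gamma law through its Lévy measure can be verified numerically via Laplace transforms. The sketch below assumes the projected intensity $2 z^{-1} e^{-2z}\, dz$ (twice the measure $\nu_\alpha$ with $\alpha = 2$) and compares $\exp\{-\int (1 - e^{-\theta z})\, 2 z^{-1} e^{-2z}\, dz\}$ with the Laplace transform $(1 + \theta/2)^{-2}$ of the gamma distribution with shape 2 and rate 2. The function names, the truncation point and the step count are illustrative choices.

```python
import math

def levy_exponent(theta, upper=40.0, steps=100_000):
    """Midpoint-rule value of int_0^upper (1 - e^{-theta z}) 2 z^{-1} e^{-2z} dz."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        # -expm1(-x) equals 1 - e^{-x} and stays accurate for small x
        total += -math.expm1(-theta * z) * (2.0 / z) * math.exp(-2.0 * z) * h
    return total

def laplace_from_levy(theta):
    """Laplace transform of the sum of the Poisson points, via Campbell's formula."""
    return math.exp(-levy_exponent(theta))

def laplace_gamma(theta):
    """Laplace transform of the gamma distribution with shape 2 and rate 2."""
    return (1.0 + theta / 2.0) ** -2
```

For instance, `laplace_from_levy(1.0)` and `laplace_gamma(1.0)` should agree to several decimal places.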

Pruning the size-biased Galton-Watson tree
We now suppose that the particles may have different types. While alive, particles independently mutate at rate µ > 0, changing to a new type which was not present so far. (We may take the interval (0, 1) for the set of possible types.) At birth particles inherit the type of their parent. Mutation and branching mechanisms are assumed independent. To indicate mutation events we put marks on the edges of the tree. If a particle mutates at time s, then a mark is put at height s on the edge corresponding to that particle. (The term "marked tree" is sometimes used for what we call a continuous-time Galton-Watson tree or, more generally, for a tree-indexed Markov process. Here a mark is just meant to be some symbol.) An edge of length $\ell$ has a Poisson number of mutation marks with mean $\mu\ell$. The positions of the marks are independent and uniformly distributed on the edge. If we cut up the edges of the tree at mutation marks, then the tree falls into components of identical type (see Figure 3). For an account of pruned discrete-time Galton-Watson trees, see [2].
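Our aim is to explore the genealogy of the subpopulation with the type of the distinguished particle $V$ at $t$, which is described by the component of the tree containing $V$ (respectively, the piece of the edge $V$ at height $t$, if the edge is cut up). As in the single-type case we suppress some of the information on the relationship between particles and represent the pruned size-biased Galton-Watson tree with a distinguished particle by the random set
$$\Psi^\mu_t = \{(s, Z^\mu_t(s)) : 0 < s \le t,\ Z^\mu_t(s) > 0\}, \tag{3.1}$$
where $Z^\mu_t(s)$ is the number of particles at time $t$ which at $t$ have the same type as the distinguished particle $V$ and have distance $s$ from $V$. If we pass to the limit $t \to \infty$ in (3.1) we obtain a weak limiting point process $\Psi^\mu_\infty$, say. (Indeed, recall from Section 2 that the law of a subtree growing out of the distinguished line of descent at height $t - s$ does not depend on $t$. Hence, neither does the law of $Z^\mu_t(s)$, $0 < s \le t$.) We will refer to the point process $\Psi^\mu_\infty$ as the equilibrium genealogy. Its law is described by the following lemma.

Lemma 3.1. Let $(Z^\mu_s)_{s \ge 0}$ be a binary continuous-time Galton-Watson process with exponential lifetimes of rate $1 + \mu$ and offspring probabilities $p_2 = 1 - p_0 = \frac{1}{2(1+\mu)}$, let $\Phi^\mu$ be a Poisson point process on $\mathbb R_+ \times \mathbb N$ with intensity measure $ds\, P(Z^\mu_s = k)$, and let $T_\mu$ be an exponential random variable with rate $\mu$, independent of $\Phi^\mu$. Then
$$\Psi^\mu_\infty \overset{d}{=} R_{T_\mu} \Phi^\mu.$$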
Proof. By Lemma 2.1 independent critical binary Galton-Watson trees grow out of the distinguished line of descent at rate one. We claim that if the edges of a critical Galton-Watson tree are cut up at rate µ, then the component containing the root is a subcritical binary Galton-Watson tree with the parameters stated in the lemma. Indeed, note that the length of an edge in the pruned tree is the minimum of two independent exponential random variables X 1 and X 2 with rates 1 (original lifetime) and µ (mutation), respectively. The minimum of X 1 and X 2 has exponential distribution with rate 1 + µ.
For the offspring numbers note that a particle in the pruned tree has two children iff it had so before the pruning procedure and the particle did not mutate. The first event has probability $p_2 = 1/2$, the second event independently has probability $P(X_1 < X_2) = (1 + \mu)^{-1}$. Finally, note that the particles in some pruned subtree have the same type as the distinguished particle $V$ iff their distance from $V$ is less than the time since the most recent mutation in the distinguished line of descent. This mutation event has occurred an independent exponential time $T_\mu$ ago.
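The parameters of the pruned tree follow from two elementary facts about competing exponential clocks, which the following sketch makes explicit. The function names are hypothetical; the Monte Carlo estimate is included only to illustrate the identity $P(X_1 < X_2) = (1 + \mu)^{-1}$.

```python
import random

def pruned_parameters(mu):
    """Lifetime rate, offspring law and growth exponent of the pruned tree."""
    rate = 1.0 + mu                   # minimum of Exp(1) and Exp(mu) clocks
    p2 = 0.5 / (1.0 + mu)             # two children and no mutation before death
    p0 = 1.0 - p2
    # Mean growth exponent: rate * (mean offspring - 1), so E Z_s = exp(growth * s)
    growth = rate * (2.0 * p2 - 1.0)
    return rate, p0, p2, growth

def estimate_no_mutation_probability(mu, n, seed=0):
    """Monte Carlo estimate of P(X1 < X2) with X1 ~ Exp(1), X2 ~ Exp(mu)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if rng.expovariate(1.0) < rng.expovariate(mu))
    return hits / n
```

The growth exponent returned by `pruned_parameters(mu)` equals $-\mu$, so the pruned process is subcritical with mean $e^{-\mu s}$ at time $s$.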
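The following theorem describes the asymptotic behavior of the rescaled equilibrium genealogy $f_\mu(\Psi^\mu_\infty)$ as the mutation rate µ goes to zero. Here $f_\mu(A) = \{(h_\mu(s), \mu z) : (s, z) \in A\}$, where $h_\mu(s) = 1 - e^{-\mu s}$ is the distribution function of $T_\mu$.

Theorem 3.2. As $\mu \to 0$,
$$f_\mu(\Psi^\mu_\infty) \xrightarrow{d} R_U \Psi,$$
where $R_u$, $0 < u \le 1$, is the restriction operator introduced in (2.9), $\Psi$ is the limiting Poisson point process from Proposition 2.2, and $U$ has uniform distribution on (0, 1), independent of $\Psi$.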
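Proof. In much the same way as in the proof of Proposition 2.2 we first show that
$$f_\mu(\Phi^\mu) \xrightarrow{d} \Psi \quad \text{as } \mu \to 0. \tag{3.3}$$
By the mapping theorem $f_\mu(\Phi^\mu)$ is a simple Poisson point process on $(0, 1] \times \mathbb R_+$ with intensity measure
$$\nu^\mu(B) = \int_0^\infty \sum_{k \ge 1} 1_B\big(h_\mu(s), \mu k\big)\, P(Z^\mu_s = k)\, ds, \tag{3.4}$$
so that it is sufficient to show $\nu^\mu(B) \to \nu(B)$ for every relatively compact Borel subset $B$ of $(0, 1] \times \mathbb R_+$. For any $s \ge 0$ the distribution of $Z^\mu_s$ is given by (see e.g. [7])
$$P(Z^\mu_s = k) = \frac{4\mu^2\, (1 - h_\mu(s))\, h_\mu(s)^{k-1}}{(h_\mu(s) + 2\mu)^{k+1}}, \quad k \ge 1, \tag{3.5}$$
where $h_\mu(s) = 1 - e^{-\mu s}$. Using (3.4), (3.5) and the fact that $h_\mu'(s) = \mu\,(1 - h_\mu(s))$, we obtain for any rectangle $C = [u_1, u_2] \times [z_1, z_2] \subset (0, 1] \times \mathbb R_+$
$$\nu^\mu(C) = \int_{u_1}^{u_2} \frac{P\big(z_1 \le \mu Z^\mu_{h_\mu^{-1}(u)} \le z_2\big)}{\mu\,(1 - u)}\, du \longrightarrow \int_{u_1}^{u_2} \int_{z_1}^{z_2} 4 u^{-2} e^{-2z/u}\, dz\, du = \nu(C) \quad \text{as } \mu \to 0,$$
which establishes assertion (3.3). Now observe that
$$f_\mu(R_s A) = R_{h_\mu(s)} f_\mu(A), \quad s > 0,$$
and that $h_\mu(T_\mu) \overset{d}{=} U$ since $h_\mu$ is the distribution function of $T_\mu$. Hence, Lemma 3.1 implies
$$f_\mu(\Psi^\mu_\infty) \overset{d}{=} f_\mu(R_{T_\mu} \Phi^\mu) = R_{h_\mu(T_\mu)} f_\mu(\Phi^\mu) \overset{d}{=} R_U f_\mu(\Phi^\mu),$$
where $U$ is independent of $\Phi^\mu$. In view of (3.3) the claim of Theorem 3.2 now follows by passing to the limit $\mu \to 0$.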
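Let $Z^\mu_\infty = 1 + g(\Psi^\mu_\infty)$ denote the number of particles with the distinguished type in equilibrium. Our final result describes the asymptotic behavior of $Z^\mu_\infty$ as the mutation rate µ goes to zero.

Theorem 3.3. As $\mu \to 0$, the rescaled size $\mu Z^\mu_\infty$ converges in distribution to an exponential random variable with rate two.

Proof. By the scaling property of $\Psi$ stated in Section 2 we have $g(R_U \Psi) \overset{d}{=} g(U\Psi) = U g(\Psi)$, that is, $g(R_U \Psi)$ is an independent uniform contraction of a gamma distributed random variable with shape and scale parameter 2 (Proposition 2.4). Consequently, $g(R_U \Psi)$ has exponential distribution with rate two (see this e.g. by considering the interval covering the origin in a homogeneous Poisson process on $\mathbb R$). Since $\mu Z^\mu_\infty = \mu + g(f_\mu(\Psi^\mu_\infty))$ and $g$ is not continuous, Theorem 3.2 yields the asserted convergence once we know that the first moments converge as well,
$$\lim_{\mu \to 0} \mu\, E Z^\mu_\infty = \tfrac12 = E\, g(R_U \Psi). \tag{3.6}$$
To verify (3.6) recall that $E Z^\mu_s = e^{-\mu s}$ (see e.g. [7], p. 457). Hence, Lemma 3.1 implies
$$E Z^\mu_\infty = 1 + \int_0^\infty P(T_\mu \ge s)\, E Z^\mu_s\, ds = 1 + \int_0^\infty e^{-2\mu s}\, ds = 1 + \frac{1}{2\mu},$$
which completes our proof of Theorem 3.3.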
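The claim that an independent uniform contraction of a gamma variable with shape 2 and rate 2 is exponential with rate two is easy to check by simulation. The sketch below is a Monte Carlo illustration with hypothetical function names; note that Python's `random.gammavariate(alpha, beta)` takes a scale parameter `beta`, so rate 2 corresponds to `beta = 0.5`.

```python
import math
import random

def sample_uniform_contraction(n, seed=0):
    """Draw n samples of U * G with U uniform on (0,1) and G gamma(shape 2, rate 2)."""
    rng = random.Random(seed)
    # gammavariate(alpha, beta): alpha is the shape, beta the scale (= 1 / rate)
    return [rng.random() * rng.gammavariate(2.0, 0.5) for _ in range(n)]

def empirical_summary(samples, x):
    """Return the sample mean and the empirical tail probability P(X > x)."""
    n = len(samples)
    mean = sum(samples) / n
    tail = sum(1 for s in samples if s > x) / n
    return mean, tail

def exponential_tail(x, rate=2.0):
    """Tail probability of the exponential distribution with the given rate."""
    return math.exp(-rate * x)
```

With a large sample, `empirical_summary(sample_uniform_contraction(200000), 0.5)` should return values close to the exponential mean 1/2 and to `exponential_tail(0.5)`.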

Remark.
It is shown in [5] that the genealogy of the cluster at the origin in the multitype voter model is asymptotically described by the size-biased Galton-Watson tree. Combining this result with Theorem 3.3 explains a classical result by Sawyer [19] that the size of the type at the origin in the multitype voter model with mutation approaches an exponential limit distribution.