GENERAL BRANCHING PROCESSES CONDITIONED ON EXTINCTION ARE STILL BRANCHING PROCESSES

It is well known that a simple, supercritical Bienaymé-Galton-Watson process turns into a subcritical such process if conditioned to die out. We prove that the corresponding result holds for general, multi-type branching, where child-bearing may occur at different ages, life span may depend upon reproduction, and the whole course of events is thus affected by conditioning upon extinction.


Introduction
The theory of branching processes was born out of Galton's famous family extinction problem. Later, interest turned to populations not dying out and their growth and stabilisation. In more recent years, extinction has retaken a place in the foreground, for reasons from both conservation and evolutionary biology. The time and path to extinction of subcritical general populations were studied in [4]. Here, time structure is crucial, and life spans and varying bearing ages cannot be condensed into simple, generation-counting Bienaymé-Galton-Watson processes. Thus, the question arises whether (non-critical) general branching populations (also known as Crump-Mode-Jagers, or CMJ, processes) bound for extinction must behave like subcritical populations. We answer this in the affirmative: a general, multi-type branching process conditioned to die out remains a branching process, but one almost surely dying out. If the original process was supercritical but with a strictly positive risk of extinction, the new process is subcritical.

1 Supported by the Swedish Research Council. 2 Supported by the Science Faculty of the University of Gothenburg through the Platform for Theoretical Biology.
Formulated in such a loose manner, this fact belongs to the folklore of branching, but actually it has been proved only for Bienaymé-Galton-Watson processes, [1], p. 52. A moment's afterthought tells us that it remains true for age-dependent branching processes of the Bellman-Harris type, where individuals have i.i.d. life spans, and split into i.i.d. random numbers of children, independently of life span, the time structure thus not being affected by the conditioning. But what if the flow of time is no longer independent of reproduction? Even the simplest case, that of a splitting reproduction at death, but not independently of age at death/splitting, would seem to offer difficulties, and the same certainly applies to the more realistic general processes where reproduction occurs as a point process during life, thus mimicking the yearly births of wildlife, or the even more erratic reproduction pattern of humans. The conceptual framework is intuitive. Starting from an Eve, individuals live and give birth independently of one another. At birth each individual inherits a type from her mother. The type, in its turn determines the probability measure over all possible life careers, including a life span and a marked point process which reports the successive ages at bearing, and the types of children at the various bearings. Note that multiple births are not excluded. The branching property can be summarised into the fact that given her type and birth time, the daughter process of any individual born is independent of all individuals not in her progeny (into which she herself is included). We set out to prove that this branching property also holds for processes conditioned to die out. Initially, we shall not mention supercriticality, and only ask that the probability of extinction is nonzero for any starting type. (If that probability is one, the conditioning does not change anything!) 
Largely, the proof is a matter of conceptual clarity or discipline, which unfortunately forces us into the somewhat burdensome notation of probabilities on tree spaces, obscuring the essential simplicity of the matter. The main idea behind the proof is, however, easily outlined. Indeed, consider an individual, and condition upon her progeny ultimately dying out. Her own life career is then affected precisely through her only being able to have daughters whose progeny in their turn must ultimately face extinction. In all other respects her life is independent of all others, once her type is given. This reestablishes the branching character, but with a suitably amended probability measure over her life career, which clearly is non-supercritical in the sense that the probability of ultimate extinction is one, from any starting type that can be realised.

Notation
Throughout this paper we use the notation of [3], which may also serve as a reference for the reader interested in further properties of general multi-type branching processes.

The Ulam-Harris family space
We choose to work within the classical Ulam-Harris framework, identifying individuals with sequences of natural numbers so that $x = (x_1, x_2, \dots, x_n)$ denotes the $x_n$th child of the ... of the $x_2$th child of the $x_1$th child of the ancestor. The ancestor is denoted by an "empty" sequence $e$ (mnemonic for "empty" or "Eve"), and the set of all possible individuals is
$$\mathbb{T} = \{e\} \cup \bigcup_{n \in \mathbb{N}} \mathbb{N}^n.$$
The concatenation of $x, y \in \mathbb{T}$ is $xy$, and thus $ex = xe = x$ for all $x \in \mathbb{T}$.
For any $e \neq x = (x_1, x_2, \dots, x_n)$, $x$'s mother is $mx = (x_1, \dots, x_{n-1})$, her rank in the sibship is $rx = x_n$, and $x$'s generation is $g(x) = n$. We agree that $me = re = e$ and $g(e) = 0$. Hence, $mx\,rx = x$ for $x \in \mathbb{T}$, and $m$ can be iterated so that $m^n x$ is $x$'s $n$th grandmother, provided $g(x) > n$. Clearly $x$ stems from $y$, usually written $x \succeq y$, if $m^n x = y$ for some $n \in \mathbb{N} \cup \{0\}$, or equivalently if there exists a $z \in \mathbb{T}$ such that $x = yz$. In this terminology, $x$ stems from herself, $x \succeq x$. In other words, $(\mathbb{T}, \succeq)$ is a partially ordered set (a semilattice). We define $x \sim y$ to mean that $x$ and $y$ are in direct line of descent, i.e. $x \succeq y$ or $y \succeq x$. We call a set $L \subset \mathbb{T}$ a stopping line, or line for short, if no two members of $L$ are in direct line of descent:
$$x, y \in L,\ x \sim y \implies x = y.$$
We say that a line $L$ is a covering line if for all $x \in \mathbb{T}$ there exists a $y \in L$ such that $x \sim y$.
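The family-space operations above translate directly into operations on tuples. The following is a minimal sketch, assuming individuals are encoded as Python tuples of positive integers with the empty tuple standing for Eve; the function names are our own and are not taken from [3].

```python
from itertools import combinations

E = ()  # the ancestor "Eve", encoded as the empty sequence


def mother(x):
    """Return mx; by convention me = e, which slicing gives for free."""
    return x[:-1]


def rank(x):
    """Return rx, the rank in the sibship; by convention re = e."""
    return x[-1] if x else E


def generation(x):
    """Return g(x), the length of the sequence."""
    return len(x)


def stems_from(x, y):
    """True if x stems from y, i.e. x = yz for some sequence z."""
    return x[:len(y)] == y


def in_direct_line(x, y):
    """True if x and y are in direct line of descent."""
    return stems_from(x, y) or stems_from(y, x)


def is_line(members):
    """True if no two distinct members are in direct line of descent."""
    return not any(in_direct_line(x, y)
                   for x, y in combinations(set(members), 2))
```

For instance, `is_line([(1,), (2,), (3, 1)])` holds, whereas `[(1,), (1, 2)]` is not a line because `(1, 2)` stems from `(1,)`. Checking whether a line is covering cannot be done by enumeration, since the family space is infinite.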

Life space and population space
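Let $(\Omega_\ell, \mathcal{A}_\ell)$ be a life space, so that $\omega \in \Omega_\ell$ is a possible life career of an individual. Any individual property, such as mass at a certain age or life span, is viewed as a measurable function (with respect to the σ-algebra $\mathcal{A}_\ell$) on the life space. This should be rich enough to support, at least, the functions $\tau(k)$, $\sigma(k)$ for $k \in \mathbb{N}$. Here $\tau(k) : \Omega_\ell \to \mathbb{R}_+ \cup \{\infty\}$ is the mother's age at the $k$th child's birth, $0 \le \tau(1) \le \tau(2) \le \dots \le \infty$. If $\tau(k) = \infty$, then the $k$th child is never born. $\sigma(k) : \Omega_\ell \to \mathcal{S}$ is the child's type, obtained at birth. The type space $\mathcal{S}$ has a (countably generated) σ-algebra $\mathscr{S}$. The whole reproduction process is then the marked point process $\xi$ with
$$\xi(A \times B) = \operatorname{card}\{k \in \mathbb{N} : \sigma(k) \in A,\ \tau(k) \in B\}, \qquad A \in \mathscr{S},\ B \text{ a Borel subset of } \mathbb{R}_+.$$
The population space is $(\Omega, \mathcal{A}) = (\mathcal{S} \times \Omega_\ell^{\mathbb{T}}, \mathscr{S} \times \mathcal{A}_\ell^{\mathbb{T}})$, so that an outcome consists of Eve's type $\sigma_e$ together with the lives of all possible individuals. For $M \subseteq \mathbb{T}$, $U_M : \Omega \to \Omega_\ell^M$ is the projection onto the lives of the individuals in $M$, and $\operatorname{Pr} M = \{x \in \mathbb{T} : x \succeq y \text{ for some } y \in M\}$ is the progeny of $M$. For simplicity $U_x = U_{\{x\}}$ and similarly $\operatorname{Pr} x = \operatorname{Pr}\{x\}$. The following σ-algebras are important:
$$\mathcal{F}_L = \sigma\big(\sigma_e,\ U_x : x \notin \operatorname{Pr} L\big), \qquad L \subseteq \mathbb{T},$$
the prehistory of $L$. In the usual manner, the definition of the σ-algebras $\mathcal{F}_L$ can be extended to σ-algebras of events preceding random lines $\mathcal{L}$ which are optional in the sense that events $\{\mathcal{L} \preceq L\} \in \mathcal{F}_L$ [3].

The functions $\xi$, $\tau(k)$ and $\sigma(k)$ were defined on the life space, but we want to be able to speak about these quantities pertaining to a given $x \in \mathbb{T}$. We write
$$\xi_x = \xi \circ U_x, \qquad \tau_x = \tau(rx) \circ U_{mx}, \qquad \sigma_x = \sigma(rx) \circ U_{mx},$$
the latter two for $x \neq e$. Note the difference between $\tau(k)$ and $\tau_k$, and between $\sigma(k)$ and $\sigma_k$, for $k \in \mathbb{N} \subset \mathbb{T}$.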
Finally, the process is anchored in real time by taking Eve to be born at time 0, and letting later birth times $t_x$, $x \in \mathbb{T}$, be recursively determined by $t_e = 0$ and $t_x = t_{mx} + \tau_x$ for $e \neq x \in \mathbb{T}$. The meaning of $t_x = \infty$ is that $x$ is never born, so that
$$\mathcal{R} = \{x \in \mathbb{T} : t_x < \infty\}$$
is the set of realised individuals. This set is optional, $L \cap \mathcal{R}$ is well defined [3], and so is the σ-algebra $\mathcal{F}_{L \cap \mathcal{R}}$ of events pertaining only to realised individuals. The probability space restricted to such events is that where a branching process really lives, cf. [5], [2].
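The recursion for birth times can be illustrated by a small simulation. This is only a sketch under assumptions of our own: a single type, and a toy life law in which an individual lives an exponential time and bears children at the points of a Poisson process during life (the rates 2 and 1 are illustrative, not taken from the paper). Unrealised individuals, whose birth time is infinite, are simply left out of the dictionary.

```python
import random


def toy_life(rng, birth_rate=2.0, death_rate=1.0):
    """Hypothetical life law: return the finite bearing ages tau(1) <= tau(2) <= ..."""
    span = rng.expovariate(death_rate)   # life span
    ages = []
    age = rng.expovariate(birth_rate)    # age at first bearing
    while age < span:
        ages.append(age)
        age += rng.expovariate(birth_rate)
    return ages


def birth_times(life, max_generation, seed=0):
    """Birth times of realised individuals up to a given generation.

    Implements t_e = 0 and t_x = t_mx + tau_x, with individuals
    encoded as tuples (Ulam-Harris labels).
    """
    rng = random.Random(seed)
    t = {(): 0.0}                        # Eve is born at time 0
    frontier = [()]
    for _ in range(max_generation):
        next_frontier = []
        for x in frontier:
            # each individual draws her own life independently
            for k, tau in enumerate(life(rng), start=1):
                child = x + (k,)
                t[child] = t[x] + tau
                next_frontier.append(child)
        frontier = next_frontier
    return t
```

Calling `birth_times(toy_life, 5)` returns a dictionary whose keys form a finite piece of the set of realised individuals, truncated at generation 5.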

The probability measure and branching property
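The setup is that for each $s \in \mathcal{S}$ there is a probability measure $P(s, \cdot)$ on the life space $(\Omega_\ell, \mathcal{A}_\ell)$, such that the function $s \mapsto P(s, A)$ is measurable for each $A \in \mathcal{A}_\ell$. For any $s \in \mathcal{S}$ this kernel (the life kernel) defines a population probability measure $\mathbb{P}_s$ on $(\Omega, \mathcal{A})$ with an ancestor of type $\sigma_e = s$ and such that, given $\sigma_x$, $x$'s life will follow the law $P(\sigma_x, \cdot)$ independently of the rest of the process, see [3]. Indeed, the basic branching property of the whole process can be characterised by a generalisation of this in terms of the mappings $T_x : \Omega \to \Omega$, $x \in \mathbb{T}$, which map the population onto the daughter process of $x$, i.e. $\sigma_e \circ T_x = \sigma_x$ and $U_y \circ T_x = U_{xy}$ for $y \in \mathbb{T}$: for events $A_x \in \mathcal{A}$,
$$\mathbb{P}_s\Big(\bigcap_{x \in L} T_x^{-1} A_x \,\Big|\, \mathcal{F}_L\Big) = \prod_{x \in L} \mathbb{P}_{\sigma_x}(A_x)$$
for lines $L$. This remains true for optional lines and in particular
$$\mathbb{P}_s\Big(\bigcap_{x \in L \cap \mathcal{R}} T_x^{-1} A_x \,\Big|\, \mathcal{F}_{L \cap \mathcal{R}}\Big) = \prod_{x \in L \cap \mathcal{R}} \mathbb{P}_{\sigma_x}(A_x), \tag{1}$$
where the intersection over the empty set is taken to be $\Omega$ and the empty product is ascribed the value one. The interpretation is that the daughter processes of all realised individuals $x$ in a line are independent given the prehistory of the line, with the population probability measure $\mathbb{P}_{\sigma_x}$, the only dependence upon the past thus being channelled through the type $\sigma_x$ and the placing in time $t_x$. This is the branching property. We shall see that it remains true for processes which are bound to die out.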

Conditioning on extinction
Denote by $E$ the event that the branching process starting from Eve dies out, i.e. that $\mathcal{R}$ has only a finite number of elements. Let $q_s = \mathbb{P}_s(E)$ and let $E_x$ be the event that the branching process starting from $x$ dies out, $E_x = T_x^{-1} E$. Write $\tilde{\mathbb{P}}_s(\cdot) = \mathbb{P}_s(\cdot \mid E)$, which clearly only makes sense for $s \in \mathcal{S}$ such that $q_s > 0$, and let $\tilde{\mathbb{E}}_s$ denote expectation with respect to $\tilde{\mathbb{P}}_s$.

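Main result

Theorem 1. Any branching process conditioned on extinction remains a branching process, but with extinction probability one. Its life kernel is
$$\tilde{P}(s, A) = \tilde{\mathbb{P}}_s(U_e \in A), \qquad A \in \mathcal{A}_\ell.$$
Thus, for any covering line $L$ and events $\{A_x : x \in L\} \subset \mathcal{A}$,
$$\tilde{\mathbb{P}}_s\Big(\bigcap_{x \in L \cap \mathcal{R}} T_x^{-1} A_x \,\Big|\, \mathcal{F}_{L \cap \mathcal{R}}\Big) = \prod_{x \in L \cap \mathcal{R}} \tilde{\mathbb{P}}_{\sigma_x}(A_x). \tag{2}$$
Furthermore, the Radon-Nikodym derivative $d\tilde{\mathbb{P}}_s / d\mathbb{P}_s$ with respect to the σ-algebra $\mathcal{F}_{L \cap \mathcal{R}}$ is given by
$$\frac{d\tilde{\mathbb{P}}_s}{d\mathbb{P}_s}\bigg|_{\mathcal{F}_{L \cap \mathcal{R}}} = \frac{1}{q_s} \prod_{x \in L \cap \mathcal{R}} q_{\sigma_x}. \tag{3}$$

Proof. First, note that
$$E = \bigcap_{x \in L \cap \mathcal{R}} E_x \tag{4}$$
for covering lines $L$. Indeed, since $\{L \cap \mathcal{R} = \emptyset\} \subseteq E$ and intersection over an empty index set yields the full space, (4) holds on $\{L \cap \mathcal{R} = \emptyset\}$, while on the complement the population dies out exactly when the daughter processes of all realised members of the covering line do. The branching property (1), applied with $A_x = E$, implies that
$$\mathbb{P}_s(E \mid \mathcal{F}_{L \cap \mathcal{R}}) = \prod_{x \in L \cap \mathcal{R}} q_{\sigma_x}. \tag{5}$$
Hence, for $A \in \mathcal{F}_{L \cap \mathcal{R}}$,
$$\tilde{\mathbb{P}}_s(A) = \frac{\mathbb{P}_s(A \cap E)}{q_s} = \frac{1}{q_s}\, \mathbb{E}_s\Big[\mathbf{1}_A \prod_{x \in L \cap \mathcal{R}} q_{\sigma_x}\Big],$$
which is (3). Finally, (4), (1) and (5) yield
$$\tilde{\mathbb{P}}_s\Big(\bigcap_{x \in L \cap \mathcal{R}} T_x^{-1} A_x \,\Big|\, \mathcal{F}_{L \cap \mathcal{R}}\Big) = \frac{\mathbb{P}_s\big(\bigcap_{x \in L \cap \mathcal{R}} T_x^{-1}(A_x \cap E) \,\big|\, \mathcal{F}_{L \cap \mathcal{R}}\big)}{\mathbb{P}_s(E \mid \mathcal{F}_{L \cap \mathcal{R}})} = \prod_{x \in L \cap \mathcal{R}} \frac{\mathbb{P}_{\sigma_x}(A_x \cap E)}{q_{\sigma_x}} = \prod_{x \in L \cap \mathcal{R}} \tilde{\mathbb{P}}_{\sigma_x}(A_x),$$
and (2) follows.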
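Remark 1. With $X := \xi_e(\mathcal{S} \times \mathbb{R}_+)$ being Eve's total offspring and $L = \mathbb{N}$, the first generation,
$$\frac{d\tilde{\mathbb{P}}_s}{d\mathbb{P}_s}\bigg|_{\mathcal{F}_{\mathbb{N} \cap \mathcal{R}}} = \frac{1}{q_s} \prod_{k=1}^{X} q_{\sigma_k}.$$
We thus obtain the conditioned life kernel
$$\tilde{P}(s, A) = \frac{1}{q_s}\, \mathbb{E}_s\Big[\mathbf{1}_{\{U_e \in A\}} \prod_{k=1}^{X} q_{\sigma_k}\Big], \tag{6}$$
showing both that the number of children under the conditioning tends to be lower, and that the children tend to have types with higher extinction probabilities, which is quite reasonable. For single-type processes with extinction probability $q$, equation (6) simplifies to
$$\tilde{P}(A) = \mathbb{E}\big[\mathbf{1}_{\{U_e \in A\}}\, q^{X-1}\big], \tag{7}$$
and in the Bienaymé-Galton-Watson case the conditioned offspring distribution is thus
$$\tilde{p}_k = p_k q^{k-1}, \qquad k = 0, 1, 2, \dots,$$
in perfect agreement with [1, Theorem I.12.3].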
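The Bienaymé-Galton-Watson case is easy to check numerically. The sketch below assumes a finite offspring distribution given as a list `p` with `p[0] > 0`; it finds the extinction probability as the smallest fixed point of the generating function by iteration from zero, and then forms the conditioned offspring probabilities $p_k q^{k-1}$ in the standard form of [1, Theorem I.12.3]. The function names and the example distribution are our own.

```python
def extinction_probability(p, tol=1e-14, max_iter=100_000):
    """Smallest root of f(s) = s in [0, 1], where f(s) = sum_k p[k] * s**k.

    Iterating q <- f(q) from q = 0 increases to the extinction probability.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = sum(pk * q**k for k, pk in enumerate(p))
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


def conditioned_offspring(p):
    """Offspring law conditioned on extinction: p[k] * q**(k - 1).

    Assumes p[0] > 0, so that q > 0 and the conditioning makes sense.
    """
    q = extinction_probability(p)
    return [pk * q**(k - 1) for k, pk in enumerate(p)], q


# Illustrative supercritical example with mean 0.3 + 2 * 0.5 = 1.3
p = [0.2, 0.3, 0.5]
p_tilde, q = conditioned_offspring(p)
mean_tilde = sum(k * pk for k, pk in enumerate(p_tilde))
```

Here the fixed-point equation $0.5 s^2 - 0.7 s + 0.2 = 0$ has roots $0.4$ and $1$, so $q = 0.4$, the conditioned law is $(0.5, 0.3, 0.2)$, and its mean $0.7$ equals $f'(q)$ and is below one.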
Remark 2. As was pointed out to us by a referee, general multi-type branching processes are introduced in [3] as Markov fields on the set $\mathbb{T}$, and in this context $\tilde{\mathbb{P}}_s$ may be seen as an $h$-transform of $\mathbb{P}_s$, since the product in the right hand side of (3) is a harmonic function defined on lines $L \subseteq \mathbb{T}$.

This process can also be reformulated due to the memorylessness of the exponential distribution. We interpret the birth of a child as the splitting of the mother into two new individuals (or particles), one being the child and one being the mother. Particles of type 1 have exponential life spans with expected value $\frac{1}{3}$ and at death either split into two new particles of type 1 with probability $p_{11} = \frac{2}{3}$ or leave no offspring with probability $p_{10} = \frac{1}{3}$. Particles of type 2 have exponential life spans with expected value $\frac{1}{5}$ and split into two particles of type 2 with probability $p_{22} = \frac{2}{5}$, into two particles, one of either type, with probability $p_{21} = \frac{2}{5}$, or leave no offspring with probability $p_{20} = \frac{1}{5}$. The extinction probabilities solve (taking expected values in (5) with $L = \mathbb{N}$)
$$q_1 = p_{11} q_1^2 + p_{10}, \qquad q_2 = p_{22} q_2^2 + p_{21} q_2 q_1 + p_{20},$$
so
$$q_1 = \tfrac{1}{2}, \qquad q_2 = 1 - \tfrac{1}{\sqrt{2}} \approx 0.293.$$
By Theorem 1, the conditioned splitting probabilities are $\tilde{p}_{11} = p_{11} q_1$, $\tilde{p}_{10} = p_{10}/q_1$, $\tilde{p}_{22} = p_{22} q_2$, $\tilde{p}_{21} = p_{21} q_1$ and $\tilde{p}_{20} = p_{20}/q_2$. In terms of the original process, the new, conditioned process will have individuals of type 1 whose life spans are exponential with expectation $\frac{1}{3} \cdot \frac{1}{\tilde{p}_{10}} = q_1 = \frac{1}{2}$, and it will have individuals of type 2, with life spans also exponential, but with expected value $\frac{1}{5} \cdot \frac{1}{\tilde{p}_{20}} = q_2 \approx 0.293$. Children are still born according to Poisson processes with intensities 3 and 5, respectively. However, the proportion of types born by individuals of type 2 is changed: the probability of a child being of type 1 is now $\tilde{p}_{21}/(\tilde{p}_{21} + \tilde{p}_{22}) \approx 0.631$.
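The numbers in the two-type particle example can be reproduced in a few lines. The sketch below is our own reading of the example: it solves the two quadratic equations for the extinction probabilities, taking the smaller root in each case, and then weights every splitting outcome by the extinction probabilities of the resulting particles, normalised by the extinction probability of the parent type. All variable names are hypothetical.

```python
import math

# Splitting probabilities of the particle formulation (from the example)
p11, p10 = 2 / 3, 1 / 3
p22, p21, p20 = 2 / 5, 2 / 5, 1 / 5


def minimal_root(a, b, c):
    """Smaller root of a*q**2 + b*q + c = 0, assuming a > 0 and real roots."""
    return (-b - math.sqrt(b * b - 4 * a * c)) / (2 * a)


# q1 = p11*q1^2 + p10  and  q2 = p22*q2^2 + p21*q1*q2 + p20
q1 = minimal_root(p11, -1.0, p10)
q2 = minimal_root(p22, p21 * q1 - 1.0, p20)

# Conditioned splitting probabilities: outcome weight times the extinction
# probabilities of the particles produced, divided by that of the parent
pt11, pt10 = p11 * q1, p10 / q1
pt22, pt21, pt20 = p22 * q2, p21 * q1, p20 / q2

# Expected life spans of individuals: particle mean life divided by the
# probability that a particle's death is a true death
mean_life_1 = (1 / 3) / pt10
mean_life_2 = (1 / 5) / pt20

# Share of type 1 among the children of a type 2 individual
share_type1 = pt21 / (pt21 + pt22)
```

This gives `q1 = 0.5`, `q2` close to 0.293, expected life spans equal to `q1` and `q2`, and a type 1 share close to 0.631, matching the figures quoted in the example.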

Super- and subcriticality
Finally, we address the question of whether a supercritical process conditioned on extinction is subcritical. In this section we leave the full generality of Theorem 1. This is partly because the notions of super- and subcriticality for general multi-type branching processes are quite involved: several conditions on the life kernel are required in order to ascertain that the process is Malthusian, meaning roughly that the population grows or declines in an asymptotically exponential manner. The asymptotic growth rate is then given by the Malthusian parameter, and the process is called supercritical if this rate is positive, and subcritical if it is negative. We refer to [3] for details. Let us consider a supercritical process such that also the conditioned process is Malthusian. Such processes clearly exist, e.g. single-type Bienaymé-Galton-Watson processes, see [1], but here we do not delve into what conditions are necessary for this to hold for general processes. It is clear that the conditioned branching process has extinction probability one for any starting type, but this would also be the case if the process were nontrivially critical, i.e. with zero growth rate.

In the single-type case it follows from (7) that the expected total number of children per individual in the conditioned process satisfies
$$\tilde{\mathbb{E}}[X] = \mathbb{E}\big[X q^{X-1}\big] = f'(q)$$
in terms of the offspring generating function $f$ of the embedded Bienaymé-Galton-Watson process. It is well known that $f'(q) < 1$ if the original process was supercritical, see [1], and we conclude that the conditioned process is subcritical.

For the multi-type case, besides requiring the unconditioned and conditioned processes to be Malthusian, we also need $q := \sup_{s \in \mathcal{S}} q_s < 1$. As in the single-type case we will consider the embedded generation counting process. Let $X_n(A) = \operatorname{card}\{x \in \mathbb{N}^n \cap \mathcal{R} : \sigma_x \in A\}$ denote the number of individuals in the $n$th generation with type in $A$. Then
$$\tilde{\mathbb{E}}_s[X_n(\mathcal{S})] = \mathbb{E}_s\Big[X_n(\mathcal{S})\, e^{\int_{\mathcal{S}} \log q_r\, X_n(dr)}\Big] \Big/ q_s \le \mathbb{E}_s\Big[X_n(\mathcal{S})\, q^{X_n(\mathcal{S})}\Big] \Big/ q_s \to 0,$$
as $n \to \infty$ (by dominated convergence, since $X_n(\mathcal{S})$ must either tend to zero or to infinity). But the expected size of the embedded process tending to zero means exactly that the process is subcritical.