ELECTRONIC COMMUNICATIONS in PROBABILITY Consistent Markov branching trees

We study consistent collections of random fragmentation trees with random integer-valued edge lengths. We prove several equivalent necessary and sufficient conditions under which Geometrically distributed edge lengths can be consistently assigned to a Markov branching tree. Among these conditions is a characterization by a unique probability measure, which plays a role similar to the dislocation measure for homogeneous fragmentation processes. We discuss this and other connections to previous work on Markov branching trees and homogeneous fragmentation processes.


Introduction
Random tree models arise in population genetics when inferring unknown phylogenetic relationships among extant species. Phylogenetic trees are often used to represent these relationships, with leaves labeled by species and branch points corresponding to speciation events. The root of the tree corresponds to the most recent common ancestor of the species under consideration. In [1], Aldous provides some modeling axioms for phylogenetic trees; among these axioms are exchangeability and consistency (under subsampling). Typically, the species labeling the leaves are represented by distinct elements of $[n] := \{1, \ldots, n\}$, and the exchangeability axiom reflects the assumption that the model should be invariant under arbitrary reassignment of elements to species. In a statistical setting, consistency reflects the assumption that the observed phylogenetic tree is a finite subtree sampled from the (possibly infinite) phylogenetic tree for all species. An admissible statistical model, therefore, corresponds to a family of probability measures on the space of infinite phylogenetic trees, that is, trees with leaves labeled in the natural numbers $\mathbb{N}$.
Along with these axioms, Aldous introduced the beta-splitting family of Markov branching trees. In general, a Markov branching tree is a random tree for which nonoverlapping subtrees are conditionally independent. Within the phylogenetic framework, it is natural to consider random trees with edge lengths or weights (weighted Markov branching trees), where edge lengths are interpreted as the time between speciation events. Previous authors [4,6] have considered the task of assigning continuous (Exponentially distributed) edge lengths to Markov branching trees in a consistent way as the size of the initial mass varies. In this paper, we undertake the related question of assigning discrete (Geometrically distributed) edge lengths to Markov branching trees. In a phylogenetic context, discrete edge lengths correspond to evolution occurring in discrete time and, therefore, reflect the assumption that generations are nonoverlapping, an assumption shared by some classical population genetics models; see [7] for an extensive treatment of probability models in population genetics.
Aside from applications to phylogenetics, random tree models are of mathematical interest in their own right. In particular, part of the treatment in [4] relates weighted Markov branching trees to homogeneous fragmentation processes [2], a class of continuous-time Feller processes on partitions of $\mathbb{N}$. In our main theorem, we give precise conditions under which discrete edge lengths can be consistently attached to a Markov branching tree, and we characterize these trees by a unique probability measure on the space of ranked-mass partitions.
We point out at least one novelty that distinguishes this paper from previous work. In contrast to [4], we do not appeal to Bertoin's theory of homogeneous fragmentations; rather, our proofs rely on a construction of discrete-weighted Markov branching trees as the projective limit of a sequence of finite weighted trees. At least some of the conclusions in [4] could be derived using our methods; however, as we explicitly consider trees with integer-valued edge lengths, we cannot appeal to the theory of homogeneous fragmentations, which evolve in continuous time. Nevertheless, our characterization of discrete-weighted Markov branching models also ties into previous work on homogeneous fragmentations, which we discuss in Sections 3.1 and 3.4.
Probabilistically, discrete-weighted Markov branching models are complementary to continuous-weighted Markov branching models. Taken together, these weighted tree models illustrate a fundamental aspect of the memoryless property: the Exponential and Geometric distributions are, respectively, the unique memoryless distributions on the positive real numbers and positive integers. An interesting twist, however, is that, unlike the continuous weight case, it is not always possible to attach Geometric random edge weights consistently for all $n \in \mathbb{N}$. Our main theorem states precisely when this embedding is possible.
An overview of the paper is as follows: in Section 2, we state our main theorem as well as give some preliminary definitions and notation; in Section 3, we discuss the components of the main theorem in detail, putting our observations in the context of previous literature on the topic; in Section 4, we formally define some concepts introduced in previous sections; in Section 5, we prove the main theorem.

Preliminaries and statement of main theorem
Throughout the paper, a fragmentation formalizes the notion of a phylogenetic tree.
where $t_i$ is a fragmentation of $A_i$ for each $i = 1, \ldots, k$.
We call the elements of $\pi_b$, for $b \in t_A$, the children of $b$ and write $\Pi_{t_A} = \pi_A$ to denote the root partition of $t_A$. We identify the set $A \in t_A$ as the root of $t_A$ and we write $T_A$ to denote the collection of all fragmentations with root $A$. Alternatively, we may refer to a fragmentation as a fragmentation tree or, simply, a tree.
The illustration in (2.2) makes clear the connection between Definition 2.1 and the visual interpretation of a phylogenetic tree.
Remark 2.2. Definition 2.1 is initialized by taking $t_{\{i\}} := \{\{i\}, \emptyset\}$ for each singleton $\{i\} \subset \mathbb{N}$. The empty set is included in the definition of $t_A$ for notational convenience when taking restrictions of weighted trees in the sequel.
To any subset $A' \subset A$, there is a natural restriction map $R_{A',A}$ taking any $t \in T_A$ to $T_{A'}$ (2.1). The projective limit of $\{T_{[n]}\}_{n \in \mathbb{N}}$ under the restriction maps $\{R_{m,n}\}_{m \le n}$ is denoted $T_{\mathbb{N}}$ and corresponds to the space of fragmentation trees with root $\mathbb{N}$. For $n \in \mathbb{N}$, we write $R_n : T_{\mathbb{N}} \to T_{[n]}$ to denote the restriction to $T_{[n]}$ of an infinite tree, as defined in (2.1) with $A' = [n]$ and $A = \mathbb{N}$. We equip $T_{\mathbb{N}}$ with the σ-field $\sigma\langle R_n \rangle_{n \in \mathbb{N}}$ generated by these maps, so that they are measurable.
(2.2) [illustration of a fragmentation and its tree representation]
We are specifically interested in probability models for fragmentation trees with integer-valued edge lengths. From any $t \in T_A$, we obtain a discrete-weighted tree $t^\bullet$ by assigning a positive integer weight $w_b > 0$ to every $b \in t$. The pair $t^\bullet := (t, w)$, with $w := \{w_b\}_{b \in t}$, then determines a tree with edge lengths. We write $T^\bullet_A$ to denote the space of discrete-weighted trees with root $A$, for which there is also a natural restriction map $R^\bullet_{A',A}$, for every $A' \subseteq A$, defined by removing elements and elongating edges as needed. These restrictions make the collection $\{T^\bullet_{[n]}\}_{n \in \mathbb{N}}$ of finite discrete-weighted trees projective, with limit denoted $T^\bullet_{\mathbb{N}}$. Weighted fragmentations are formally introduced in Section 4.2; a pictorial representation of a discrete-weighted tree is given in (4.1).
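The phrase "removing elements and elongating edges as needed" can be made concrete with a small sketch. The code below is only one natural reading, not the formal definition of Section 4.2: a weighted tree is stored as a dictionary from blocks to weights (a hypothetical representation chosen for illustration), each block is intersected with the retained labels, and blocks that collapse to the same set have their weights added.

```python
from collections import defaultdict


def restrict_weighted(tree, keep):
    """Restrict a discrete-weighted tree to the labels in `keep`.

    `tree` maps each block (a frozenset of labels) to its integer weight.
    Assumption for illustration: blocks whose intersections with `keep`
    coincide lie on a single path of the tree, so their edges merge and
    their weights add ("elongating edges").
    """
    keep = frozenset(keep)
    restricted = defaultdict(int)
    for block, weight in tree.items():
        reduced = block & keep
        if reduced:  # blocks that lose all their labels disappear
            restricted[reduced] += weight
    return dict(restricted)


# A weighted tree with root {1, 2, 3, 4} and root partition {{1, 2}, {3, 4}}.
tree = {
    frozenset({1, 2, 3, 4}): 2,
    frozenset({1, 2}): 1,
    frozenset({3, 4}): 3,
    frozenset({1}): 1,
    frozenset({2}): 1,
    frozenset({3}): 1,
    frozenset({4}): 1,
}

# Removing label 4 merges the edges of {3, 4} and {3} into one edge of length 4.
restricted = restrict_weighted(tree, {1, 2, 3})
```

Under this reading, restrictions compose: restricting first to {1, 2, 3} and then to {1, 2} gives the same weighted tree as restricting to {1, 2} directly, which is the kind of compatibility a projective system requires.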
The probability models we consider are extensions of Markov branching models on $T_{\mathbb{N}}$. By the projective structure of $T_{\mathbb{N}}$, any probability measure $Q$ on $T_{\mathbb{N}}$ is determined by its finite-dimensional restrictions $Q^{[n]}$, for every $n \in \mathbb{N}$. Specifically, we consider the task of assigning random Geometrically distributed edge lengths to exchangeable Markov branching trees.
In general, the collection $Q := (Q^{[n]})_{n \in \mathbb{N}}$ determines an exchangeable Markov branching model if, for every $n \in \mathbb{N}$, $T \sim Q^{[n]}$ is
• exchangeable: the law of $T$ is invariant under the obvious action of relabeling its leaves by an arbitrary permutation $\sigma : [n] \to [n]$;
• consistent: $R_{m,n} T \sim Q^{[m]}$ for every $m \le n$; and,
• Markovian: given any collection $\{A_1, \ldots, A_k\}$ of non-overlapping subsets in $T$, the collection $\{T_{|A_1}, \ldots, T_{|A_k}\}$ of reduced subtrees is conditionally independent and distributed according to $Q^{[n_1]}, \ldots, Q^{[n_k]}$, respectively, where $n_j := \#A_j$, $j = 1, \ldots, k$.
Any exchangeable Markov branching model $Q$ is determined by a family of exchangeable splitting rules $p := (p_n)_{n \ge 2}$, where each $p_n$ is a probability measure on the space $P_{[n]} \setminus \{1_{[n]}\}$ of non-trivial partitions of $[n]$. It has been shown, e.g. in [1,4,6], that $p := (p_n)_{n \ge 2}$ determines an exchangeable Markov branching model if and only if each $p_n$ is exchangeable and $p$ satisfies the consistency condition (2.4), where $e^{(n+1)}$ […]. We write $Q_p := (Q_p^{[n]})_{n \in \mathbb{N}}$ to denote the Markov branching model determined by the consistent splitting rule $p$. Note that (2.4) is merely the requirement that the marginal distribution of the root partition of $T \sim Q$ […]
Given a Markov branching tree $T_n \sim Q_p^{[n]}$, we randomly assign edge lengths to $T_n$ as follows. First, we specify $\tau := (\tau_n)_{n \ge 0}$, with $\tau_0 = \tau_1 = 0$ and $\tau_n \in (0, 1]$ for all $n \ge 2$. Given $T_n = t$, we take independent random variables $W_n := \{W_n(b)\}_{b \in t}$, where $W_n(b) \sim \mathrm{Geo}(\tau_{\#b})$ has the Geometric distribution with parameter $\tau_{\#b}$. (We define $\mathrm{Geo}(0)$ to be the point mass at $\infty$.) We write $Q^{[n]}_{p,\tau}$ to denote the distribution of $T^\bullet_n := (T_n, W_n)$ obtained in this way. Our main theorem considers the question of when the collection $(Q^{[n]}_{p,\tau})_{n \in \mathbb{N}}$ of finite-dimensional distributions determines a unique probability measure $Q^\bullet_{p,\tau}$ on the limit space $T^\bullet_{\mathbb{N}}$. We now state our main theorem.
Theorem 2.3. Let $p := (p_n)_{n \ge 2}$ be a family of exchangeable splitting rules satisfying (2.4). The following are equivalent.
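The edge-length assignment described before Theorem 2.3 can be sketched in a few lines. This is an illustrative sketch only: the tree is represented as a plain collection of blocks, the function names are hypothetical, and the choice $\tau_n = 1/2$ for $n \ge 2$ is an arbitrary example rather than a sequence shown to satisfy the consistency conditions of the theorem.

```python
import math
import random


def sample_geometric(success_prob, rng):
    """Sample from Geo(success_prob) on {1, 2, ...}; Geo(0) is the point mass at infinity."""
    if success_prob == 0:
        return math.inf
    trials = 1
    # Count independent Bernoulli trials up to and including the first success.
    while rng.random() >= success_prob:
        trials += 1
    return trials


def assign_edge_lengths(blocks, tau, rng):
    """Attach an independent Geo(tau(#b)) weight to every block b of the tree."""
    return {block: sample_geometric(tau(len(block)), rng) for block in blocks}


def tau_example(size):
    """Hypothetical success probabilities: tau_0 = tau_1 = 0, tau_n = 1/2 for n >= 2."""
    return 0.0 if size <= 1 else 0.5


blocks = [
    frozenset({1, 2, 3}),
    frozenset({1, 2}),
    frozenset({1}),
    frozenset({2}),
    frozenset({3}),
]
weights = assign_edge_lengths(blocks, tau_example, random.Random(0))
```

Because $\tau_0 = \tau_1 = 0$, singleton blocks always receive infinite weight in this sketch, while every block of size at least 2 receives a finite positive integer.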

The paintbox measure
The paintbox measure plays a key role in our discussion in the next section as well as in our proof of uniqueness of $\nu^*$ in Theorem 2.3(iii). For $s \in \Delta^\downarrow$, we write $s_0 := 1 - \sum_{i=1}^{\infty} s_i$ to denote the amount of dust in $s$ and we define the paintbox measure $\varrho_s$ directed by $s$ as the distribution of a random partition $\Pi$ generated as follows. First, we take independent random variables $X_1, X_2, \ldots$ with distribution
$$P_s(X_i = j) := \begin{cases} s_j, & j \ge 1, \\ s_0, & j = -i. \end{cases}$$
Given $(X_1, X_2, \ldots)$, we define $\Pi$ by
$$i \text{ and } j \text{ are in the same block of } \Pi \iff X_i = X_j.$$
We write $\Pi \sim \varrho_s$ to denote that $\Pi$ is distributed as a paintbox directed by $s$. Given a measure $\nu$ on $\Delta^\downarrow$, the paintbox measure directed by $\nu$ is the mixture of paintboxes:
$$\varrho_\nu(\cdot) := \int_{\Delta^\downarrow} \varrho_s(\cdot)\, \nu(ds).$$
According to Kingman's correspondence [5], to any exchangeable random partition $\Pi$ of $\mathbb{N}$ there corresponds a unique probability measure $\nu^*$ on $\Delta^\downarrow$ such that $\Pi \sim \varrho_{\nu^*}$.
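The paintbox construction above translates directly into a sampler for the restriction of $\Pi$ to $[n]$. The sketch below assumes a mass partition with finitely many non-zero masses, passed as a list; the function name and the list representation are illustrative choices, not notation from the paper.

```python
import random


def sample_paintbox(masses, n, rng):
    """Sample the restriction to {1, ..., n} of a paintbox partition.

    `masses` lists the non-zero ranked masses s_1 >= s_2 >= ... (sum <= 1);
    the remaining mass s_0 = 1 - sum(masses) is dust.
    """
    blocks = {}
    for i in range(1, n + 1):
        u = rng.random()
        color = -i  # default: element i falls in the dust and gets its own color
        cumulative = 0.0
        for j, mass in enumerate(masses, start=1):
            cumulative += mass
            if u < cumulative:
                color = j  # X_i = j with probability s_j
                break
        blocks.setdefault(color, set()).add(i)
    # Elements sharing a color share a block; order blocks by least element.
    return sorted((frozenset(b) for b in blocks.values()), key=min)


rng = random.Random(0)
partition = sample_paintbox([0.5, 0.3], 10, rng)     # dust s_0 = 0.2
one_block = sample_paintbox([1.0], 5, rng)           # s = (1, 0, ...): trivial partition
all_singletons = sample_paintbox([], 5, rng)         # all dust: every block is a singleton
```

The two extreme cases illustrate the roles of $s = (1, 0, \ldots)$, which always yields the one-block partition, and of pure dust, which always yields singletons.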

Discussion of Theorem 2.3
We now discuss the components of Theorem 2.3 in some detail, paying attention to the interplay among (i)-(vi) as well as connections to previous literature. Roughly speaking, the six parts of the theorem can be decomposed into three motifs: (i)-(ii) is a condition in the vein of Markov branching trees with Exponentially distributed edge lengths; (iii)-(iv) gives a structure result reminiscent of the characterization of homogeneous fragmentations; (v)-(vi) describes the existence of $Q^\bullet_{p,\tau}$ without explicit reference to $\tau$; in particular, both (v) and (vi) depend only on $p$. The connection between (v)-(vi) and existence of $Q^\bullet_{p,\tau}$ is tied to the existence of a well-defined root partition of the limiting fragmentation tree. This also relates to the existence of a Markov branching tree with Exponentially distributed edge lengths; see Sections 3.4-3.6.
Since trivial partitions are assigned zero probability by any splitting rule, the measures ν * and ν K determine the same splitting rule through the generalization to (3.1): Indeed, from (3.4), we have, for π ∈ P , which coincides with (3.1).

The role of τ ∞
The quantity $\tau_\infty := \lim_{n \to \infty} \tau_n$ plays an important role in the description of the limiting tree $T^\bullet \sim Q^\bullet_{p,\tau}$ in that it parameterizes its edge lengths. That is, the limiting object $T^\bullet$ is an infinite Markov branching tree with independent Geometrically distributed edge lengths, all with success probability $\tau_\infty$. Moreover, the special case $\tau_\infty = 1$ corresponds to Geometric edge lengths all with success probability 1. Hence, almost surely, the edge lengths of the limiting tree $T^\bullet$ are all identically 1. In this case, the randomness of the edge lengths disappears in the limiting object. Viewed another way, from (3.5), we notice that $1 - \tau_\infty = \nu^*(\{(1, 0, \ldots)\})$ corresponds to the probability that a random partition of $\mathbb{N}$ is trivial. Since only non-trivial partitions correspond to dislocations in a fragmentation tree, $\tau_\infty = 1 - \nu^*(\{(1, 0, \ldots)\})$ naturally corresponds to a success probability in our Geometric weighting scheme.
Note that (3.6) is identical to condition (2.5) of Theorem 2.3(ii); however, in the discrete case we encounter the additional constraint $0 \le \tau_n \le 1$ for all $n \ge 0$. Moreover, while continuous embedding is always possible for an infinitely exchangeable family of splitting rules, discrete embedding is not. Conditions (2.5) and (3.6) seem intimately tied to the memoryless property of the Exponential and Geometric distributions. Both (2.5) and (3.6) can be proven using the same strategy as in Theorem 5.1, with the modification that to prove (3.6) we use characteristic functions rather than probability generating functions.

Relation to homogeneous fragmentations
The definition of $\nu_K$ in (3.4) connects the characteristic measure $\nu^*$ to a collection of dislocation measures of homogeneous fragmentation processes. From Theorem 1 of [4], any exchangeable splitting rule $p = (p_n)_{n \ge 2}$ satisfying (2.4) is associated to a pair $(c, \nu)$ (see equations (2) and (3) of [4]), where $c \ge 0$ is the erosion coefficient and $\nu$ is the dislocation measure of a homogeneous fragmentation process $T^\bullet$. To ensure that each finite restriction of $T^\bullet$ determines a fragmentation of a finite set with strictly positive edge lengths, the dislocation measure $\nu$ is subject to the constraints $\nu(\{(1, 0, \ldots)\}) = 0$ and
$$\int_{\Delta^\downarrow} (1 - s_1)\, \nu(ds) < \infty; \qquad (3.7)$$
see also Bertoin [3] (Theorem 3.1). The measure $\nu_K$ constructed in (3.4) trivially satisfies (3.7) and, therefore, is the dislocation measure of some homogeneous fragmentation. As shown in Section 3.1, for $K, K' \in (0, \infty)$, any two pairs $(\nu_K, \tau_\infty)$ and $(\nu_{K'}, \tau_\infty)$ defined from the same characteristic measure $\nu^*$ determine the same splitting rule and, hence, the same discrete-weighted Markov branching model. Similarly, by Theorem 1 of [4], $(c, \nu)$ determines the same splitting rule as $(Kc, K\nu)$ for all $K \in (0, \infty)$.

Root partitions
The erosion coefficient $c \ge 0$ also relates to (v) and (vi) of our theorem. In particular, the erosion coefficient is the rate at which "erosion" of a single element occurs, that is, the event that the initial split of the entire mass $\mathbb{N}$ is into $\{\mathbb{N} \setminus \{n\}, \{n\}\}$. Assuming the dislocation measure $\nu$ is finite, the total rate at which a $(c, \nu)$-fragmentation process with initial mass $[n]$ fragments is
$$\lambda_n := cn + \int_{\Delta^\downarrow} \Bigl(1 - \sum_{i \ge 1} s_i^n\Bigr)\, \nu(ds).$$
As a result, we see that $\lambda_n \to \infty$ whenever $c > 0$ and $\lambda_n \to \nu(\Delta^\downarrow) < \infty$ when $c = 0$. Therefore, (iv) and (vi) together imply that discrete-weighted fragmentations correspond to homogeneous fragmentations with zero erosion coefficient and finite dislocation measure. Furthermore, Theorem 2.3(v) asserts that the existence of a collection $\tau$ for which $Q^\bullet_{p,\tau}$ exists depends on whether $T \sim Q_p$ possesses a well-defined root partition. Intuitively, there will be such a root partition only if $\lambda_\infty$ is finite because, if $\lambda_\infty = \infty$, then the root edges of the finite trees must be getting shorter as $n$ increases. Thus, Theorem 2.3(v) separates Markov branching trees into two classes, those with a root partition and those without. By (v), Markov branching trees with a root partition can be assigned Geometrically distributed edge lengths, while those without a root partition cannot. To be explicit, given $\lambda_\infty < \infty$, we can choose any $\lambda^* \in [\lambda_\infty, \infty)$ and put $\tau_n = \lambda_n / \lambda^*$ for each $n \ge 2$. By (2.7), $(\tau_n)_{n \ge 2}$ chosen this way satisfies (2.5). Moreover, relating to Section 3.2, we have $\tau_\infty = \lambda_\infty / \lambda^* \in (0, 1]$.
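The recipe $\tau_n = \lambda_n / \lambda^*$ can be checked numerically in a toy case. The sketch below rests on assumptions that are not taken from the paper: zero erosion, a dislocation measure with finitely many atoms, and the rate $\lambda_n = \int (1 - \sum_i s_i^n)\, \nu(ds)$, which is the $\nu$-mass of mass partitions whose paintbox splits $[n]$ non-trivially. The function names and the example atoms are hypothetical.

```python
def split_rate(atoms, n):
    """lambda_n for a finite atomic measure (assumed form, zero erosion).

    `atoms` is a list of (weight, masses) pairs; an atom with masses s
    contributes weight * (1 - sum_i s_i**n), the rate at which it splits [n].
    """
    return sum(weight * (1.0 - sum(s**n for s in masses)) for weight, masses in atoms)


def success_probabilities(atoms, lam_star, n_max):
    """Return {n: tau_n} with tau_0 = tau_1 = 0 and tau_n = lambda_n / lam_star."""
    # lambda_infinity: total weight of atoms other than s = (1, 0, ...).
    lam_inf = sum(weight for weight, masses in atoms if max(masses, default=0.0) < 1.0)
    if lam_star < lam_inf:
        raise ValueError("lam_star must be at least lambda_infinity")
    tau = {0: 0.0, 1: 0.0}
    for n in range(2, n_max + 1):
        tau[n] = split_rate(atoms, n) / lam_star
    return tau


# Hypothetical measure: weight 2 on (1/2, 1/2) and weight 1 on (1/4, 1/4, 1/4, 1/4).
atoms = [(2.0, [0.5, 0.5]), (1.0, [0.25, 0.25, 0.25, 0.25])]
tau = success_probabilities(atoms, lam_star=3.0, n_max=10)
```

With $\lambda^* = \lambda_\infty = 3$ in this example, the values $\tau_n$ increase towards $\tau_\infty = 1$; choosing a larger $\lambda^*$ scales every $\tau_n$ down by the same factor, so that $\tau_\infty = \lambda_\infty / \lambda^* < 1$.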

Some formalities
In preparation for the proof of Theorem 2.3, we now formally introduce some concepts from previous sections.

Root partitions
A partition of a set $A \subseteq \mathbb{N}$ is a collection $\pi := \{A_1, \ldots, A_k\}$ of non-empty, disjoint subsets for which $\bigcup_{i=1}^{k} A_i = A$. We write $P_A$ to denote the collection of all partitions of $A$. The collection $\{P_{[n]}\}_{n \in \mathbb{N}}$ of spaces of finite set partitions is projective under the deletion maps (2.3). We write $P_{\mathbb{N}}$ to denote the projective limit of partitions of $\mathbb{N}$, which we furnish with the discrete σ-algebra $\sigma\bigl(\bigcup_{n \in \mathbb{N}} P_{[n]}\bigr)$. For each $n \in \mathbb{N}$, we write $D_n := D_{n,\infty}$ to denote the deletion operation $P_{\mathbb{N}} \to P_{[n]}$, where $[\infty] := \mathbb{N}$ in (2.3). Partitions appear in the study of Markov branching trees through the splitting rule, which is a distribution on $P_{[n]} \setminus \{1_{[n]}\}$ that determines the law of the branching below a child of size $n$ in a random fragmentation. Also, in Theorem 2.3(v), partitions of $\mathbb{N}$ arise in the notion of a limiting root partition. For any finite $A \subset \mathbb{N}$ with $\#A \ge 2$, every $t \in T_A$ has a well-defined root partition denoted by $\Pi_t$ […]
[…] $R^\bullet_{m,n}$ to $T^\bullet_{[n]}$, and we define $T^\bullet_{\mathbb{N}}$ as the projective limit of $\{T^\bullet_{[n]}\}_{n \in \mathbb{N}}$ under these restriction maps. […] to denote the product of discrete σ-fields on subsets of $\{0, 1, \ldots, \infty\}$, we equip $T^\bullet_{[n]}$ with the σ-field $T_{[n]} \otimes D_n$ and $T^\bullet_{\mathbb{N}}$ with the σ-field $\sigma\langle R^\bullet_n \rangle_{n \in \mathbb{N}}$ so that the restriction maps are measurable.

Proof of Theorem 2.3
Theorem 2.3 summarizes the conclusions of a series of theorems and propositions that we prove in this section. Throughout this section, assume $p := (p_n)_{n \ge 2}$ is a collection of splitting rules satisfying (2.4) and $\tau := (\tau_n)_{n \ge 0}$ is a collection of success probabilities. The pair $(p, \tau)$ determines a family $(Q^{[n]}_{p,\tau})_{n \in \mathbb{N}}$ of finite-dimensional probability distributions through (4.4). By Kolmogorov's extension theorem, $(Q^{[n]}_{p,\tau})_{n \in \mathbb{N}}$ determines a unique probability measure on $T^\bullet_{\mathbb{N}}$ if and only if it is consistent under the restriction maps $R^\bullet_{m,n}$ for all $m \le n$, for every $n \in \mathbb{N}$. […] for every $n \ge 2$. (5.2)
Proof. Clearly, $\tau_0 = \tau_1 = 0$ is both necessary and sufficient for $Q$ […] to satisfy (i) […] in the definition of a weighted fragmentation tree, for every $n \in \mathbb{N}$.
For $Q^{[n]}_{p,\tau}$ defined as in (4.4), let $T^\bullet_{n+1} = (T_{n+1}, W_{n+1}) \sim Q$ […] of element $n + 1$, is the same as the distribution of the root partition under $Q^{[n]}_p$, for every $n \ge 2$.
) n∈N determines a unique probability measure Q • p,τ on T • N if and only if
to denote the image of $\nu^*$ by the obvious restriction map $P_{\mathbb{N}} \to P_{[n]}$. Condition (2.6) ensures that (3.1) is a well-defined probability distribution on $P_{[n]} \setminus \{1_{[n]}\}$ and that the success probabilities $\tau_n$ are strictly positive for every $n \ge 2$.