A time-invariant random graph with splitting events

We introduce a process where a connected rooted multigraph evolves by splitting events on its vertices, occurring randomly in continuous time. When a vertex splits, its incoming edges are randomly assigned between its offspring and a Poisson random number of edges are added between them. The process is parametrised by a positive real $\lambda$ which governs the limiting average degree. We show that for each value of $\lambda$ there is a unique random connected rooted multigraph $M(\lambda)$ invariant under this evolution. As a consequence, starting from any finite graph $G$ the process will almost surely converge in distribution to $M(\lambda)$, which does not depend on $G$. We show that this limit has finite expected size. The same process naturally extends to one in which connectedness is not necessarily preserved, and we give a sharp threshold for connectedness of this version. This is an asynchronous version, which is more realistic from the real-world network point of view, of a process we studied in arXiv:1506.02697, arXiv:1703.09011.


Introduction
We consider a random network model with reproduction which evolves in continuous time. Each vertex independently, at rate 1, splits into two. When a vertex splits, each of its existing edges is randomly rerouted to one of the two vertices produced, and these two vertices are connected by a random number of edges with distribution Po(λ/2), where λ > 0 is a fixed parameter. If the resulting graph is disconnected, only the component of the root is retained (the precise definition is given in the next section). We show that there is a unique random multigraph M (λ) which is time-invariant under this evolution and has finite average degree (Theorem 1.4), and analyse some of its properties. As a consequence, if we run our process starting from any finite graph G, it will almost surely converge in distribution to M (λ).
This model arose naturally in our recent work [9]: there, we considered the variant of the above evolution in which all vertices split simultaneously at regular time intervals. We observed that there is a unique finite-degree random multigraph $G(\lambda)$ which is time-invariant under this evolution too. We will refer to $G(\lambda)$ as the synchronous version of $M(\lambda)$. Moreover, we showed that $G(\lambda)$ has the same distribution as the cluster of the origin in an instance of long-range percolation on the infinitely generated group $\bigoplus_{i \in \mathbb{N}} \mathbb{Z}_2$. Perhaps surprisingly, given its alternative definition as a cluster of a percolation model on a group, and given that most percolation models on finitely generated groups undergo a phase transition [6], $G(\lambda)$ is almost surely finite for any value of the intensity $\lambda$, and its expected size is finite. In this paper we show the analogous result for $M(\lambda)$ (Theorem 1.5).
Our splits can be thought of as reproduction of vertices, in the sense that a vertex produces a child and then passes on some of its connections to its child. In this sense, our first definition of G(λ) is reminiscent of the models for random reproducing graphs studied by Jordan [12], building on earlier deterministic models for social networks [15,5], with the key distinction being that in Jordan's model all connections of the parent are retained, whether or not they are inherited by the child.
However, simultaneous, discrete-time reproduction by the whole population is not a realistic model for real-life networks. It is therefore natural to consider a variant in which reproduction events are independent and may occur at any time, which is part of the motivation of the current paper. Mechanisms for growing networks based on repeated vertex duplications have previously been proposed as plausible for the development of the web graph [13] and for evolution of biochemical networks [3,17]. Mathematical analysis of such a model was carried out non-rigorously by Pastor-Satorras, Smith and Sole [14], suggesting a limiting degree distribution which is power-law with an exponential cutoff, although subsequent rigorous work by Bebek, Berenbrink, Cooper, Friedetzky, Nadeau, and Sahinalp [2] showed that this is not the case. Another related model, motivated by duplication of genetic material, has been studied by Thörnblad [16] and by Backhausz and Móri [1]; however, the graph structure of this model is particularly simple, being a collection of disjoint cliques. A similar model for a fixed population size, which has richer behaviour owing to the random loss of individual edges, was introduced by Bienvenu, Débarre, and Lambert [4].
Although the continuous-time model M (λ) studied here is more natural in certain respects, its analysis is significantly more challenging than that of the synchronous version G(λ) for the following reason. A basic tool in the analysis of both models is the underlying genealogical tree T , containing all vertices in our evolution, and joining each vertex to its children by an edge. Starting with T , we can alternatively define our random graphs by joining pairs of leaves of T with random independent edges with appropriately chosen probabilities. In the synchronous case, this T is very simple: it is a binary tree of depth n when we run the process for n steps starting from a single vertex, and it is the so-called canopy tree when we start with G(λ). When we start with M (λ) however, T is a random tree with a non-trivial distribution: it can be thought of as the local limit of the ball B(t) of radius t in first passage percolation on the full binary tree after re-rooting B(t) at a leaf (see Section 2 for more details). Thus our main results Theorem 1.4 and Theorem 1.5 below were much harder to prove than their analogues in [9].

Model and results
It will be convenient for some proofs and statements of results to define both the main process described above and a "full" version of the process in which other components are not discarded. In fact it is simpler to define the latter first. A multigraph is a graph in which two vertices may be joined by several parallel edges. The multigraphs of this paper do not have loops, i.e. edges that start and end at the same vertex.
Definition 1.1. For a rooted connected multigraph $(G, o)$, the full process $(G_t, o_t)_{t \ge 0}$ with parameter $\lambda > 0$ is defined as follows. Set $(G_0, o_0) = (G, o)$. Give each vertex $v$ a splitting time $\tau_v$, where splitting times are i.i.d. $\mathrm{Exp}(1)$ variables. When $t = \tau_v$, replace $v$ with two new vertices $v_1, v_2$, and give each a splitting time of $t + \mathrm{Exp}(1)$. Add $\mathrm{Po}(\lambda/2)$ edges between $v_1$ and $v_2$. Moreover, replace each edge of the form $uv$ with one of the edges $uv_1$, $uv_2$ chosen uniformly at random. If $v$ was the root, update the root to be $v_1$ or $v_2$, each with probability 1/2. All these random choices are made independently of each other. Set $(G_t, o_t)$ to be the resultant graph.
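The full process of Definition 1.1 can be simulated directly. The following is a minimal sketch, not the authors' code: the function names `poisson` and `full_process`, the integer vertex labels, and the edge representation (a `Counter` keyed by unordered pairs) are all illustrative assumptions. It uses the standard equivalent description of independent Exp(1) clocks: with $n$ vertices present, the next split occurs after an Exp($n$) waiting time at a uniformly chosen vertex.

```python
import random
from collections import Counter


def poisson(rng, mean):
    """Sample Po(mean) by counting rate-1 exponential arrivals in [0, mean]."""
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def full_process(lam, t_max, seed=0):
    """Run the full process from a single vertex up to time t_max.

    Returns (vertices, edges, root); edges maps frozenset({u, v}) to the
    number of parallel edges between u and v (hypothetical representation).
    """
    rng = random.Random(seed)
    vertices, edges, root = {0}, Counter(), 0
    next_id, t = 1, 0.0
    while True:
        # n independent Exp(1) clocks: the first ring comes after Exp(n) time,
        # at a uniformly random vertex (memorylessness).
        t += rng.expovariate(len(vertices))
        if t > t_max:
            return vertices, edges, root
        v = rng.choice(sorted(vertices))
        v1, v2 = next_id, next_id + 1
        next_id += 2
        vertices.remove(v)
        vertices.update((v1, v2))
        # Reroute every edge uv to uv1 or uv2, uniformly and independently.
        for pair in [p for p in edges if v in p]:
            (u,) = pair - {v}
            for _ in range(edges.pop(pair)):
                edges[frozenset((u, rng.choice((v1, v2))))] += 1
        # Join the two offspring by Po(lam / 2) parallel edges.
        k = poisson(rng, lam / 2)
        if k:
            edges[frozenset((v1, v2))] = k
        # If the root split, pick the new root uniformly among its offspring.
        if v == root:
            root = rng.choice((v1, v2))
```

For example, `full_process(2.0, 3.0, seed=1)` returns one sample of the vertex set, edge multiplicities and root at time 3 for $\lambda = 2$.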
We will frequently consider a single-vertex starting graph; we write $G^\bullet_t$ in this case.
Remark 1.2. The number of vertices of $G^\bullet_t$ over time, which is independent of all edge-related events, is a Yule process with rate $r = 1$, that is, a pure birth process where the birth rate is $r$ times the population. Its value at time $t$ has a geometric distribution with mean $e^{rt}$; see [7, Section XVII.3].
Definition 1.3. The cluster process $(G_t, o_t)$ with parameter $\lambda$ is the rooted connected multigraph formed by the component of the root in the full process $G_t$ of Definition 1.1.
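As an illustration of Remark 1.2 and Definition 1.3, the sketch below evaluates the geometric law of the Yule process and extracts the component of the root from a multigraph. The names `yule_size_pmf` and `root_component`, and the dictionary-of-multiplicities edge format, are assumptions made for this example only.

```python
import math
from collections import defaultdict, deque


def yule_size_pmf(n, t, r=1.0):
    """P(size = n) at time t for a rate-r Yule process started from one vertex."""
    p = math.exp(-r * t)
    return p * (1 - p) ** (n - 1)


def root_component(edges, root):
    """Vertex set of the component of root.

    edges maps an unordered pair of distinct vertices (a 2-tuple or a
    2-element frozenset) to the number of parallel edges between them.
    """
    adjacency = defaultdict(set)
    for (u, v), multiplicity in edges.items():
        if multiplicity > 0:
            adjacency[u].add(v)
            adjacency[v].add(u)
    seen, queue = {root}, deque([root])
    while queue:
        x = queue.popleft()
        for y in adjacency[x]:
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return seen
```

Summing `n * yule_size_pmf(n, t)` over `n` recovers the mean $e^{rt}$ numerically, and applying `root_component` to a sample of the full process gives the corresponding state of the cluster process.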
It is natural to think of the cluster process as a reproduction process where individuals die when they leave the component of the root. In this sense it resembles a general branching process, or Crump-Mode-Jagers process, (see e.g. [11,Chapter 6]); however, these processes assume independence of the lifespans of different individuals, whereas in our model death events are highly interdependent.
We prove three main results about these processes, listed below.
Theorem 1.4. For each $\lambda > 0$ there is a unique random rooted connected multigraph with finite expected root degree, $(M(\lambda), o)$, which is invariant under the cluster process in the sense that $(M(\lambda)_t, o_t)$ has the same distribution for any $t \ge 0$.
It is not immediately obvious that $M(\lambda)$ is almost surely finite. However, we prove a much stronger result.
Theorem 1.5. For each $\lambda > 0$, the expected size $E(|M(\lambda)|)$ is finite.
When considering the full process, a natural question is when it becomes disconnected, or equivalently when the full and cluster processes first differ.
Theorem 1.6. The time $t = \lambda$ is a sharp threshold for both connectedness of $G^\bullet_t$ and the existence of isolated vertices, that is, for any $\varepsilon > 0$, with high probability as $\lambda \to \infty$ the graph $G^\bullet_{(1-\varepsilon)\lambda}$ is connected but $G^\bullet_{(1+\varepsilon)\lambda}$ is disconnected with isolated vertices.

Questions
In [9] we conjectured that $E(|G(\lambda)|) \sim \lambda^{c\lambda}$, in agreement with computer simulation data. Simulations of $E(|M(\lambda)|)$ showed a similar behaviour to $E(|G(\lambda)|)$, and the same conjecture can be made. We know that $E(|G(\lambda)|)$ is an analytic function of $\lambda$ because of results in percolation theory [10]. For $E(|M(\lambda)|)$ we do not even have a proof of continuity. Apart from obtaining more detailed results about the behaviour of $M(\lambda)$, it would also be interesting to modify our splitting rule in order to obtain other random graph models with temporal invariance.

Convergence to a limit
In this section we prove Theorem 1.4; throughout the section we assume the parameter $\lambda > 0$ is fixed.
Lemma 2.1. Let $(G, o)$ be a random rooted graph such that $E(d(o))$ is finite. Let $(G^\bullet, o)$ be the single-vertex loopless graph with the same root $o$. Run the cluster process $(G_t, o_t)$ given in Definition 1.3, and let $H_t$ be the subgraph of $G_t$ induced by descendants of $o$. Then almost surely $G_t = H_t$ for all sufficiently large $t$.
Note that $o_t \in H_t$ and $(H_t, o_t)$ evolves according to the law of the cluster process $(G^\bullet_t, o_t)$, so has the same distribution.
Proof. We refer to edges of $G_t$ which were added after time 0 as new edges, and those which correspond (after replacements when vertices split) to edges of $G$ as old edges. Let $e \in E(G)$ be an edge from the root, and let the corresponding edge at time $t$ meet $o'_t$, where $o'_t$ is a descendant of the root. We say that $e$ has been killed by time $t$ if, for some $s \le t$, we have $o'_s \ne o_s$ and no new edges meet $o'_s$. If $e$ has been killed by time $t$, then at time $s$ all paths from $o_s$ to $o'_s$ must use at least one old edge, and this property is preserved by splitting events, so the same is true at time $t$. Thus, if a path from the root in $G_t$ uses any old edge, the first old edge in that path must not have been killed by time $t$, meaning that the old edges which have not been killed by time $t$ form a cut separating $H_t$ from the rest of $G_t$. It therefore suffices to show that with probability 1 eventually every old edge has been killed.
For a specified edge $e$, consider the first time that the root splits and $o'_t \ne o_t$; call this $t_1$. Let $t_2, t_3, \dots$ be the subsequent times that $o'_t$ splits, and let $X_k$ be the number of new edges meeting $o'_{t_k}$. Then $X_{k+1} \sim \mathrm{Bin}(X_k, 1/2) + \mathrm{Po}(\lambda/2)$. This gives an irreducible Markov chain on $\mathbb{N}$ with stationary distribution $\mathrm{Po}(\lambda)$. As a result, the chain is positive recurrent and in particular hits 0 in finite time, killing $e$, with probability 1. Since there were finitely many old edges, all of them are killed in finite time with probability 1.
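The claim that $\mathrm{Po}(\lambda)$ is stationary for the chain $X_{k+1} \sim \mathrm{Bin}(X_k, 1/2) + \mathrm{Po}(\lambda/2)$ can be checked numerically. The sketch below is only an illustration: the helper names `poisson_pmf` and `step` are hypothetical, and distributions are truncated at a finite level, which introduces a negligible error for moderate $\lambda$.

```python
import math


def poisson_pmf(mean, n_max):
    """Probabilities P(Po(mean) = k) for k = 0, ..., n_max."""
    pmf = [math.exp(-mean)]
    for k in range(1, n_max + 1):
        pmf.append(pmf[-1] * mean / k)
    return pmf


def step(dist, lam):
    """Push a distribution on {0, ..., n_max} through one step of the chain
    X -> Bin(X, 1/2) + Po(lam / 2), discarding mass above n_max."""
    n_max = len(dist) - 1
    # Binomial thinning: each existing edge is kept with probability 1/2.
    thinned = [0.0] * (n_max + 1)
    for x, px in enumerate(dist):
        for j in range(x + 1):
            thinned[j] += px * math.comb(x, j) * 0.5 ** x
    # Convolution with the Po(lam / 2) fresh edges.
    fresh = poisson_pmf(lam / 2, n_max)
    out = [0.0] * (n_max + 1)
    for j, pj in enumerate(thinned):
        for k in range(n_max + 1 - j):
            out[j + k] += pj * fresh[k]
    return out
```

Applying `step` to `poisson_pmf(lam, 60)` returns the same vector up to rounding, and iterating `step` from a point mass converges to it, in line with positive recurrence.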
Before proceeding to the proof of Theorem 1.4, we first recall the Poisson edge model of [9]. This is a long-range percolation model on the leaves of the canopy tree. We may label the complete binary trees of height $0, 1, \dots$ in such a way that each tree is a subtree of the next, with each leaf also being a leaf of the next tree. The (binary) canopy tree is then the union of this sequence of trees, and has an infinite sequence of leaves. The Poisson edge model is a random multigraph whose vertices are the leaves of the canopy tree, and whose edges are given by independently placing $\mathrm{Po}(2^{1-d(x,y)}\lambda)$ edges between each pair of leaves $x, y$, where $d(x, y)$ is the graph distance on the canopy tree. In [9] it is shown that the unique random rooted connected multigraph having finite expected root degree which is invariant under the synchronous version of the cluster process is given by the cluster of the root in the Poisson edge model.
For the cluster process of Definition 1.3, the picture will be more complicated. Note that we may define the $T$-Poisson edge model for any binary tree $T$ in the same way: it is the random multigraph on the leaves of $T$, with $\mathrm{Po}(2^{1-d_T(x,y)}\lambda)$ edges independently between each pair of leaves $x, y$. We shall need a simple observation about the $T$-Poisson edge model.
Let $T$ be any binary tree, and fix an edge $uv$. We say that an edge of the $T$-Poisson edge model crosses $uv$ if its endpoints are in different components of $T - uv$.
Lemma 2.2. The probability that the $T$-Poisson edge model has no edges which cross $uv$ is at least $e^{-\lambda}$.
Proof. Write $L_u$, $L_v$ for the sets of leaves of the components of $T - uv$ containing $u$ and $v$ respectively. The number of edges crossing $uv$ is $\mathrm{Po}(z\lambda)$, where
$$z = \sum_{x \in L_u} \sum_{y \in L_v} 2^{1 - d_T(x,y)} = \Big(\sum_{x \in L_u} 2^{-d_T(x,u)}\Big)\Big(\sum_{y \in L_v} 2^{-d_T(y,v)}\Big),$$
since $d_T(x,y) = d_T(x,u) + 1 + d_T(v,y)$ for $x \in L_u$ and $y \in L_v$. We must therefore check that $z \le 1$. Consider a random walk on the component of $T - uv$ containing $u$, started at $u$ and constrained to increase the distance from $u$ at every step, stopping if it reaches a leaf. Then for $x \in L_u$ the probability that this walk stops at $x$ is $2^{-d_T(x,u)}$, since there are two possible moves at each step. Thus $\sum_{x \in L_u} 2^{-d_T(x,u)} \le 1$, and the same argument applies to $L_v$, giving the result.
Remark 2.3. In fact, provided that $T$ has countably many ends, we have equality in Lemma 2.2, since both walks terminate almost surely.
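The quantity bounded in the proof of Lemma 2.2 is a Kraft-type sum over leaves, which is easy to evaluate exactly on finite trees. The following sketch assumes a hypothetical encoding in which each side of $T - uv$ is a nested tuple rooted at $u$ (respectively $v$): `()` is a leaf and a non-empty tuple lists the subtrees reached by moving away from the root. The function names are illustrative.

```python
import math
from fractions import Fraction


def leaf_weight(tree, depth=0):
    """Sum of 2^(-depth) over the leaves of a nested-tuple tree."""
    if not tree:
        return Fraction(1, 2 ** depth)
    return sum(leaf_weight(child, depth + 1) for child in tree)


def no_crossing_probability(side_u, side_v, lam):
    """P(no edge of the T-Poisson edge model crosses uv) = exp(-z * lam),
    where z is the product of the two leaf weights."""
    z = leaf_weight(side_u) * leaf_weight(side_v)
    return math.exp(-lam * float(z))
```

For a finite tree in which every internal vertex has exactly two subtrees, `leaf_weight` equals 1, so the no-crossing probability is exactly $e^{-\lambda}$; if some vertex offers a single onward move, the weight drops below 1 and the probability exceeds $e^{-\lambda}$.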
Proof of Theorem 1.4. We will construct a random multigraph $(M(\lambda), o)$ with the property that $(M(\lambda)_t, o_t)$ has the same distribution for any $t \ge 0$. To show uniqueness, we will show that $(G^\bullet_t, o_t)$ converges in distribution to $(M(\lambda), o)$, and apply Lemma 2.1. Our construction of $(M(\lambda), o)$ will use the $T$-Poisson edge model, working with a random tree $T$. (This tree can be thought of as the local limit of the Yule tree at time $t$, or equivalently the ball of radius $t$ in first passage percolation on the full binary tree with $\mathrm{Exp}(1)$ edge costs, after re-rooting at the leaf reached by a simple forward random walk from the root.)
To begin with, we construct some finite random trees $T(t)$ that will form the building blocks in the construction of $T$. Given a parameter $t > 0$, we define a random rooted binary tree $T(t)$ as follows. Start from a single-vertex rooted tree, with an exponential clock of rate 1 on the root. Whenever a clock on a vertex $v$ rings, add two children of $v$, each with their own independent exponential clocks of rate 1 (do not replace the clock on $v$; each vertex rings at most once). Continue until time $t$. Note that $T(t)$ is almost surely finite.
Next we construct an infinite random tree $T$. Start from an infinite path $P = v_0 v_1 \cdots$, and label its edges with an infinite sequence $s_1, s_2, \dots$ of i.i.d. $\mathrm{Exp}(1)$ random variables. For each $i > 0$, sample a copy $T_i$ of $T(\sum_{j \le i} s_j)$, denote its root by $w_i$, and join $T_i$ to $P$ with the edge $v_i w_i$. Here each $T_i$ is sampled independently.
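The two-stage construction above (finite trees $T(t)$, then subtrees hung from a labelled path) can be sampled recursively. This is a minimal sketch under assumed conventions: trees are nested tuples with `()` for a leaf, and `sample_tree`, `sample_backbone`, `count_leaves` and `count_internal` are hypothetical names. Only the first `n` subtrees of the infinite tree $T$ are generated, and their sizes grow quickly with `n`.

```python
import random


def sample_tree(t, rng):
    """Sample T(t): a leaf if the rate-1 clock has not rung by time t,
    otherwise two independent subtrees grown for the remaining time."""
    ring = rng.expovariate(1.0)
    if ring >= t:
        return ()
    remaining = t - ring
    return (sample_tree(remaining, rng), sample_tree(remaining, rng))


def sample_backbone(n, rng):
    """Sample labels s_1..s_n of the path P and subtrees T_1..T_n, where
    T_i is a copy of T(s_1 + ... + s_i) attached at v_i."""
    labels, subtrees, elapsed = [], [], 0.0
    for _ in range(n):
        s = rng.expovariate(1.0)
        elapsed += s
        labels.append(s)
        subtrees.append(sample_tree(elapsed, rng))
    return labels, subtrees


def count_leaves(tree):
    return 1 if not tree else sum(count_leaves(c) for c in tree)


def count_internal(tree):
    return 0 if not tree else 1 + sum(count_internal(c) for c in tree)
```

Every sampled tree is a finite full binary tree, so its number of leaves exceeds its number of internal vertices by exactly one.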
Having constructed $T$, consider the $T$-Poisson edge model. We let $M(\lambda)$ be the component of $v_0$ in this random multigraph, and let $v_0$ be the root of $M(\lambda)$. For $n \in \mathbb{N}$, let $L_n$ be the leaves
Proof of Claim. Starting from $k = 0$, iteratively reveal the number of edges of the $T$-Poisson edge model between pairs of vertices until an edge crossing $v_k v_{k+1}$ is found. If this happens, update $k$ to be the smallest value such that no edge yet revealed crosses $v_k v_{k+1}$ and continue revealing. By Lemma 2.2, for each different value of $k$ considered there is a probability of at least $e^{-\lambda}$ that no suitable edge is ever found, no matter what was previously revealed. Thus almost surely one of the edges $v_k v_{k+1}$ is not crossed, meaning that $V(M(\lambda)) \subseteq L_k$. ♦
Thus $M(\lambda)$ almost surely contains vertices from finitely many of the subtrees $T_i$. In particular, since each $T_i$ is almost surely finite, so is $M(\lambda)$.
Proof of Claim. Recall that the construction of $M(\lambda)$ was based on the randomly edge-labelled path $P$. Let us denote by $G(P, \lambda)$ the random graph constructed from any path $P$ with edges bearing positive real labels by following the above procedure. To compare $M(\lambda)$ with $M(\lambda)_t$, we will express the latter as $G(P_t, \lambda)$ for an appropriate randomly labelled path $P_t$: consider a Poisson point process $R = (-t_1, -t_2, \dots, -t_k)$, $k \ge 0$, on the interval $[-t, 0]$ (where we assume that $t_i \ge t_{i+1}$), governed by Lebesgue measure and with intensity 1. We obtain $P_t$ from $P$ as follows. We change the label $s_1$ of the first edge of $P$ into $s_1 + t_k$ if $k \ge 1$, or into $s_1 + t$ if $k = 0$. Moreover, we append $k$ edges at the start of $P$, and label them as follows. The first edge is labelled $t - t_1$, and for $i = 2, \dots, k$, the $i$th edge is labelled $t_{i-1} - t_i$. It is straightforward to check that $G(P_t, \lambda)$ has the same distribution as $(M(\lambda)_t, o_t)$ by identifying the times at which the root is split with the reversal $t_k, \dots, t_2, t_1$ of $R$, using the fact that $t_{i-1} - t_i$ has distribution $\mathrm{Exp}(1)$, and so do $t_k$ and $t - t_1$.
To finish the proof that $(M(\lambda)_t, o_t) = G(P_t, \lambda)$ has the same distribution as $(M(\lambda), o) = G(P, \lambda)$, it suffices to prove that $P_t$ has the same distribution as $P$. To prove this, note that we can sample the labels $s_1, s_2, \dots$ of $P$ as the gaps of a Poisson point process on the half-line $[0, \infty)$ governed by Lebesgue measure and with intensity 1. Similarly, we can sample the labels of $P_t$ as the gaps of a Poisson point process on $[-t, \infty)$. But these two Poisson point processes are identically distributed once we shift by $t$, as required. ♦
Next, we show that $G^\bullet_t$ converges in distribution to $M(\lambda)$. To begin with, we can obtain $G^\bullet_t$ by a construction similar to that of $M(\lambda)$, by keeping track of the genealogical tree $T_t$ of the vertices of $G^\bullet_t$: the vertex set of $T$ comprises all vertices that appeared throughout the process parallel edges independently between any two leaves $x, y$ of $T_t$, and identify $G^\bullet_t$ with the component of $o$ in the resulting multigraph. The times $t_1, \dots, t_k$ when the root of $G^\bullet_t$ splits are, by definition, given by a Poisson point process on $[0, t]$ governed by Lebesgue measure on that interval. Consequently, the "reversed" sequence of times $t - t_k, \dots, t - t_1$ has the same distribution as $t_1, \dots, t_k$. Using this fact, we may equivalently construct $G^\bullet_t$ using $t - t_k, \dots, t - t_1$ as the splitting times of the root, while leaving the rest of the construction unchanged. This realisation of $G^\bullet_t$ coincides, by definition, with the following construction. Start with a random path $P_t$ with $k$ edges $e_1, \dots, e_k$, where as above $k$ is the number of splittings of $o$ in the time interval $[0, t]$, labelling $e_i$ with the time gap $s_i = t_{k+1-i} - t_{k-i}$ if $i = 2, \dots, k$, or $s_i = t - t_k$ if $i = 1$. Attach to the endvertex $v_i$ of $e_i$ an independent copy of $T(\sum_{j \le i} s_j)$ as above, and finally define a random graph on the leaves of the resulting tree by taking the component of the root in its Poisson edge model.
Appropriately coupled, $M(\lambda)$ and $G^\bullet_t$ therefore give the same result so long as $M(\lambda)$ does not reach the end of the finite path $P_t$ in the above construction. Write $E_n$ for the event that $M(\lambda)$ does not extend past $v_n$. Given $\varepsilon > 0$, choose $n$ such that $P(E_n^\complement) < \varepsilon/2$ (which is possible by Claim 2.4) and $t$ such that $P(\mathrm{Po}(t) < n) < \varepsilon/2$. Then, under this coupling, $M(\lambda)$ and $G^\bullet_t$ coincide with probability greater than $1 - \varepsilon$.
Thus $G^\bullet_t$ converges in distribution to $M(\lambda)$ as $t \to \infty$. The uniqueness of $M(\lambda)$ now follows from Lemma 2.1, since if $G$ is a random graph with $G_t$ identically distributed for every $t$, that lemma implies that the distribution of $G$ is the limit of the distribution of $G^\bullet_t$.
The random multigraph $M(\lambda)$ described above differs from the corresponding multigraph $G(\lambda)$ for the synchronous case studied in [9], that is, the component of the root in the original Poisson edge model on the canopy tree. To see this, it is sufficient to consider the probability, conditional on $d(o) = 2$, of a double edge from the root. For $M(\lambda)$ this is $\sum_{x \ne o} 4^{1-d(o,x)}$, where the sum is taken over all other leaves of the random tree $T$. Note that the probability that $w_1$ is a leaf is $P(\tau(w_1) > s_1)$, where $\tau(w_1)$ is the length of $w_1$'s clock. Since $\tau(w_1)$ and $s_1$ are i.i.d., we have $P(w_1 \text{ is a leaf}) = 1/2$; clearly $w_i$ is less likely to be a leaf than $w_1$ if $i > 1$, so each $w_i$ is a leaf with probability at most 1/2. For each $i \ge 1$, the probability of a double edge to a descendant of $w_i$ is $4^{-i}$ if $w_i$ is a leaf, and at most $2 \cdot 4^{-i-1}$ otherwise (being maximised when both its offspring are leaves, each at distance $i + 2$ from $o$). So the probability of a double edge is at most $\sum_{i \ge 1} (4^{-i} + 2 \cdot 4^{-i-1})/2 = 1/4$. For the canopy tree version $G(\lambda)$, the probability of a double edge is $\sum_{h \ge 1} 2^{h-1} 4^{1-2h} = 2/7$, and so $M(\lambda)$ has a strictly smaller double-edge probability.
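The two series in this comparison can be checked with exact rational arithmetic. The sketch below is illustrative only (the function names are hypothetical). For the bound on $M(\lambda)$ it takes the contribution of a non-leaf $w_i$ to be $2 \cdot 4^{-i-1}$, that is, two leaf children at distance $i + 2$ from $o$, each contributing $4^{-i-1}$; this is the reading under which the series sums to $1/4$.

```python
from fractions import Fraction


def canopy_double_edge(terms=60):
    """Partial sum of sum_{h>=1} 2^(h-1) * 4^(1-2h), the double-edge
    probability for the canopy tree; the full series equals 2/7."""
    return sum(
        Fraction(2) ** (h - 1) * Fraction(4) ** (1 - 2 * h)
        for h in range(1, terms + 1)
    )


def m_lambda_bound(terms=60):
    """Partial sum of sum_{i>=1} (4^(-i) + 2 * 4^(-i-1)) / 2, the upper
    bound for M(lambda); the full series equals 1/4."""
    return sum(
        (Fraction(4) ** (-i) + 2 * Fraction(4) ** (-i - 1)) / 2
        for i in range(1, terms + 1)
    )
```

Both partial sums agree with $2/7$ and $1/4$ to machine precision after a few dozen terms, and the second is strictly smaller.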

Finite expected size
In this section, we consider the expected size $E(|M(\lambda)|)$. While the expected size of $G(\lambda)$ is finite for every $\lambda > 0$ [9], it is not immediately clear whether the same is true of $M(\lambda)$. Since $M(\lambda)$ arises from the $T$-Poisson edge model on a random tree $T$, and we know that the expected cluster size is finite for the Poisson edge model on the canopy tree, and that the cluster size of the Poisson edge model on any binary tree is almost surely finite (Claim 2.4), one might hope to prove a universal bound (depending on $\lambda$) on the expected cluster size for any binary tree, whence the desired result would follow by averaging. However, no such bound exists; indeed, there are binary trees on which the expected cluster size of the Poisson edge model is infinite for sufficiently large $\lambda$. One example may be obtained by replacing each edge of the canopy tree by a two-edge path with a pendant leaf attached to the new vertex. If $v$ was a leaf of the canopy tree at distance $2k$ from $o$, then the new tree contains a sequence of $2k + 2$ leaves, starting at $o$ and ending at $v$, such that each consecutive pair is at distance 4. Each of these pairs is adjacent in the Poisson edge model on this tree with probability $1 - e^{-\lambda/8}$, and so every such $v$ is in the component of $o$ with probability at least $(1 - e^{-\lambda/8})^{2k+1}$. Since the canopy tree has $2^{k-1}$ leaves at distance $2k$ from $o$ for each $k \ge 1$, it follows that the expected size of this component is infinite provided $\lambda \ge 8 \log(2 + \sqrt{2})$.
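The threshold $8 \log(2 + \sqrt{2})$ can be recovered from a one-line computation. The canopy tree has $2^{k-1}$ leaves at distance $2k$ from $o$, so the lower bound on the expected component size is $\sum_{k \ge 1} 2^{k-1} p^{2k+1}$ with $p = 1 - e^{-\lambda/8}$, a geometric series with ratio $2p^2$. The sketch below (with the hypothetical name `growth_ratio`) checks that this ratio reaches 1 exactly at the stated value of $\lambda$.

```python
import math


def growth_ratio(lam):
    """Ratio 2 * p^2 of consecutive terms of sum_k 2^(k-1) * p^(2k+1),
    where p = 1 - exp(-lam / 8); the series diverges once the ratio is >= 1."""
    p = 1 - math.exp(-lam / 8)
    return 2 * p * p


# Value of lambda at which the ratio equals 1.
LAMBDA_STAR = 8 * math.log(2 + math.sqrt(2))
```

Below `LAMBDA_STAR` the lower bound is a convergent series, so this particular argument gives no information there; at and above it the bound diverges.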
Of course, the initial sections of such a tree are not typical Yule trees, and so this example does not rule out the possibility of exploiting the large-scale structure of the tree T constructed in the previous section. However, we will find it easier to use a more local approach: rather than showing that an initial section of T is typically well-behaved everywhere, we work directly with Definition 1.3 and explore the component of the root in G t . This means that we only need T , which corresponds to the splitting events, to behave well in such parts as we encounter during this exploration. In the remainder of the section, we prove Theorem 1.5.

Outline of proof
Fix λ > 0. Note that since G • Lemma 3.1. Fix times t ≥ 0 and ε > 0, and let X ε = |G • ε | be the total number of vertices in the full process at time ε. Then we have Proof. Conditioning on the value of X ε , we have Note that, conditioned on X ε = 1, G • t+ε is just the result of letting the single vertex at time ε evolve for an additional time t, and P(X ε = 1) = e −ε < 1 − ε + ε 2 , so Also, P(X ε = 2) < ε, which gives the required second term.
To deal with the third term, recall from Remark 1.2 that $X_\varepsilon \sim \mathrm{Geo}(e^{-\varepsilon})$ and so $P(X_\varepsilon > 2) = (1 - e^{-\varepsilon})^2 < \varepsilon^2$. Now suppose that $X_\varepsilon > 2$. This means that there is some random time $\eta_2 < \varepsilon$ at which the second splitting event occurs. Nothing that happens after $\eta_2$ can affect the event $X_\varepsilon > 2$, and so we may condition on $\eta_2$. At time $\eta_2$ there are three vertices, which may or may not be connected by edges. Certainly the size of the cluster process at time $t + \varepsilon$ is dominated by the size $|G^\bullet_{t+\varepsilon}|$ of the full process, which, conditioned on $\eta_2$, has expectation $3e^{t+\varepsilon-\eta_2} < 3e^{t+\varepsilon}$. Thus the final term is less than $3\varepsilon^2 e^{t+\varepsilon}$, as required.
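The three elementary inequalities used in this proof follow from the geometric law $X_\varepsilon \sim \mathrm{Geo}(e^{-\varepsilon})$ on $\{1, 2, \dots\}$ and can be spot-checked numerically. The following sketch is only a sanity check on a few values of $\varepsilon$, not a proof; the function name is hypothetical.

```python
import math


def lemma_bounds_hold(eps):
    """Check P(X=1) < 1 - eps + eps^2, P(X=2) < eps and P(X>2) < eps^2
    for X ~ Geo(exp(-eps)) supported on {1, 2, ...}."""
    q = math.exp(-eps)
    p_one = q                   # no split by time eps
    p_two = q * (1 - q)         # exactly two vertices at time eps
    p_more = (1 - q) ** 2       # more than two vertices at time eps
    return (
        p_one < 1 - eps + eps ** 2
        and p_two < eps
        and p_more < eps ** 2
    )
```

Very small values of `eps` are best avoided here, since the gap in the first inequality is of order `eps ** 2 / 2` and eventually falls below floating-point resolution.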
Conditioned on X ε = 2, G • t+ε is distributed as two independent copies of the full process run for time t with some edges between them, rooted at the root of the first copy. We will show that the probability of some of these edges touching the component of the root in the first copy is exponentially small. If this does happen, we argue that the expected number of edges between the two copies is not much larger than its unconditional expectation (i.e. λ/2), and that consequently we connect together (on average) not too many components. The main issue with this is that conditioning on this unlikely event might change the expected size of a component significantly, so we must control this. If we can do this, we will have shown that where h(t) is some function that decays exponentially in t. It will follow, from (1) and Lemma 3.1, that for any fixed t ≥ 0 we have, , Write G •• t for the result of running the cluster process for time t starting from two vertices with N ∼ Po(λ/2) number of edges between. We consider the descendants of the two original vertices in the corresponding full process G •• t as two independent copies of G • t , with the "left" copy being descendants of the original root, and say that the N edges between the two copies are old, and others are new. Note that the component of the root in the subgraph induced by the left copy is distributed as G • t . We follow what happens to the left-endpoints of all old edges, and to the root. Recall that an old edge is killed by a splitting event if after that event its left-endpoint is not the root, and meets no new edges. Consider the following four events, for a fixed time t and 0 < α < 1; note that some of these events may depend on what happens after time t.
A: the left-endpoint of some old edge splits fewer than $\alpha t$ times by time $t$.
B: after the left-endpoint of some old edge splits $\alpha t/3$ times, it is either the root or the left-endpoint of more than one old edge.
C: $B$ does not occur, but some new edge meets the left-endpoint of some old edge for the entire period between the $(\alpha t/3)$th and $(2\alpha t/3)$th splits of the latter.
D: $B$ and $C$ do not occur, but some old edge is not killed between its $(2\alpha t/3)$th and $(\alpha t)$th splits.
Writing I A , etc., for the indicator functions of these events, we have

Dealing with event A
Note that the left-endpoint of a given edge splits Po(t) times in time t, and Since lim α→0+ α α = 1, we may choose α > 0 such that We next define a variant of the full process: the singleton-free process S t starts from a single vertex with Po(λ/2) tokens. It proceeds as the full process with tokens distributed randomly between the offspring when a vertex splits, but with the exception that any vertex which is isolated and has no tokens is immediately discarded.
First we will show that E(|S t |) is bounded by the expected size of a Yule process of rate r = r(λ) < 1. The intuition here is that each splitting event has at least a constant probability of producing an isolated vertex, and we can just ignore these events, resulting in a thinning of the rate by a constant factor. However, we need to be slightly careful to check that the lower bound on the probability of creating an isolated vertex still holds even conditioned on the splitting vertex not having been isolated at any point in its history. We will need the following lemma, which will be used again for the other events.
Proof. We handle the case p = 1; the case p < 1 follows by noting that Y ∼ Po(pm) and X − Y are independent. We may sample X | (X ≥ k) by repeatedly sampling X, keeping the first value which is at least k. Since we can take the rth sample of X as the number of points occurring in the interval [r − 1, r] in a Poisson process of intensity m, this is the same as letting the Poisson process run until the first time we have seen k points since the last integer, then continuing until the next integer. This is clearly dominated by letting the process run to the first time we have seen k points since the last integer, then continuing for time 1, which gives the required distribution.
If we only have the weaker condition that X ≤ st Po(m) we cannot get any bound on E(X | Y ≥ 1), but the following bounds are sufficient for our purpose.
Proof. We can couple $X$ with a variable $X' \sim \mathrm{Po}(m)$, and couple $Y$ with $Y' \sim \mathrm{Bin}(X', p)$ in the natural way, so that $(Y \ge 1) \subseteq (Y' \ge 1)$. Then, since $0 \le X \le X'$ we have
We may sample $S_t$ by using a Yule process $Y_t$ of rate 1 for the splitting events, then determining the movement of new edges and removing any vertices which were isolated at any point in their history. Similarly, we can simulate a Yule process $Y^{(r)}_t$ of rate $r < 1$ from the same copy of $Y_t$ by, independently for each splitting event, removing all descendants of one offspring with probability $1 - r$. Conditional on $Y_t$, each vertex which has undergone $k$ splitting events has probability $\left(\frac{r+1}{2}\right)^k$ of surviving in $Y^{(r)}_t$. In $S_t$, conditional on a vertex having survived $j$ splits without being isolated, we argue by induction on $j$ that the number of edge-ends meeting it is dominated by $1 + \mathrm{Po}(\lambda)$. This is true for $j = 0$. Assuming the statement holds for $j$, a vertex which has split $j + 1$ times may inherit the first edge-end from its parent, and receives at most $\mathrm{Po}(\lambda)$ other edge-ends; conditioning on not being isolated does not change the number of additional edge-ends if it did inherit the first edge, and increases it to at most $1 + \mathrm{Po}(\lambda)$ if not. So the result holds for all $j$. Consequently, given that a vertex has survived $j$ splits without being isolated, its offspring after the next split have at most $\mathrm{Ber}(1/2) + \mathrm{Po}(\lambda)$ edge-ends, so are each isolated with probability at least $e^{-\lambda}/2$; it follows that each vertex in $Y_t$ which underwent $k$ splitting events has probability at most $\left(\frac{2 - e^{-\lambda}}{2}\right)^k$ of surviving in $S_t$. For $r = 1 - e^{-\lambda}$, each vertex has at least as high a probability of surviving in $Y^{(r)}_t$ as in $S_t$, and so we have
Lemma 3.2 implies that $N \mid A \le_{st} 1 + \mathrm{Po}(\lambda/2)$. To see this, note that we may first condition on the tree of splitting events.
For each possible tree T , each old edge independently has some probability p T of following a path in the tree which splits fewer than αt times; we may ignore trees for which p T = 0. Thus N | T, A ≤ st 1 + Po(λ/2), and the result follows by averaging over T . Now we consider the singleton-free process conditional on A. Suppose a vertex meeting an old edge or root splits, and one of the new vertices created, v, does not meet a new edge or the root. Conditioning on A does not affect the future evolution of v, and it evolves as the non-root half of a singleton-free process (or is discarded if it has no new edges). Thus the expected number of descendants of v is at most E(|S t |)/2 = e (1−e −λ )t . For each old edge, the expected number of times it splits is t before conditioning on A, and cannot increase after conditioning; the same applies to the root. Thus the expected number of times such a vertex is created is at most (2 + λ)t. Since every vertex at time t is either a descendant of such a vertex, meets an old edge, or is the root, we have and so (recalling (3)) we have

Dealing with event B
Since there are X ∼ Po(λ/2) left-endpoints of old edges and one root, and each pair has probability 2 −αt/3 of coinciding after αt/3 splits, a union bound gives Consider the full tree of possible locations for left-endpoints after αt/3 splits. Order these locations v 1 , . . . , v 2 αt/3 ; without loss of generality we may assume the root is at v 1 after αt/3 splits. Writing X i for the number of old edges at location v i after αt/3 splits, the X i are i.i.d. Po(2 −αt/3 λ/2) random variables. We will control the expected number of old edges conditioned on B. B occurs if and only if either X 1 ≥ 1 or X i ≥ 2 for some i > 1. For each i ≥ 2, let B i be the event that X i ≥ 2, and X j ≤ 1 for each j > i. Let B 1 be the event that X 1 ≥ 1 but X j ≤ 1 for each j > 1. Now the events ( The old edges therefore combine, on average and conditional on B, at most λ/2 + 3 components from the two copies. Since B does not depend on splitting times or new edges, each component has expected size E(|G • t |). Thus, recalling (5), we have

Dealing with event C
Randomly designate one end of each new edge to be the "head", and the other the "tail", so that the number of edges $xy$ with head $x$ and the number with head $y$ are independent. We set $C_h$ (respectively $C_t$) to be the event that the head (respectively the tail) of some new edge coincides with the left end of some old edge for the period in question. Since $C = C_h \cup C_t$ and by symmetry of
Given $B^\complement$, each of the $N \mid B^\complement$ old edges coincides with the head of a new edge for the period between its $(\alpha t/3)$th and $(2\alpha t/3)$th splits independently and with equal probability $p$. Since the number of heads coinciding with a given old edge at the start of the period is distributed $\mathrm{Po}(\kappa)$ for some fixed $\kappa \leq \lambda/2$, a union bound gives $p \leq 2^{-\alpha t/3}\lambda/2$. Thus
Also, Corollary 3.3 gives $\mathbb{P}(C_h \mid B^\complement)\,\mathbb{E}(N \mid C_h) \leq 2^{-\alpha t/3}(\lambda/2)^2(1 + \lambda/2)$.
Next we bound $\mathbb{P}(C_h)\,\mathbb{E}(|G^{\bullet\bullet}_t| \mid C_h)$. Lemma 3.2 implies that the number of heads which coincide with a given old edge, after conditioning on $C_h$, is dominated by $1 + \mathrm{Po}(p\kappa)$, and the number of other new edges is independent of $C_h$. Thus we may couple the new edges conditioned on $C_h$ as a subgraph of the unconditioned new edges together with at most $N \mid C_h$ additional new edges. The expected size of the component of a given vertex in the subgraph of unconditional new edges is $\mathbb{E}(|G^\bullet_t|)$, and since the old edges and additional new edges merge at most $2\,\mathbb{E}(N \mid C_h) + 1$ components on average,

Dealing with event D
Now suppose that $B$ and $C$ do not occur. Again, we have $N \mid B^\complement \cap C^\complement \leq_{\mathrm{st}} \mathrm{Po}(\lambda/2)$. Since $C$ does not occur, all new edges that meet left ends of old edges after $2\alpha t/3$ splits were created after the $(\alpha t/3)$th split, and since $B$ does not occur, each of these meets only one old edge. Thus each old edge independently survives (that is, is not killed) between its $(2\alpha t/3)$th and $(\alpha t)$th splits with some probability $p$. Corollary 3.3 therefore gives $\mathbb{P}(D)\,\mathbb{E}(N \mid D) \leq p(\lambda/2)(\lambda/2 + 1)$.
We next bound $p$. Note that being killed is monotone on adding new edges. Suppose that after a given split an old edge $e$ meets $X_0 \leq_{\mathrm{st}} 1 + \mathrm{Po}(\lambda)$ new edges. Adding extra new edges, if necessary, we may assume $e$ meets $1 + \mathrm{Po}(\lambda)$ new edges. After the next split, conditioned on $e$ meeting at least one new edge, we claim that it meets $X_1 \leq_{\mathrm{st}} 1 + \mathrm{Po}(\lambda)$ new edges. If the first of the $1 + \mathrm{Po}(\lambda)$ new edges still meets $e$, there are $\mathrm{Po}(\lambda/2) + \mathrm{Po}(\lambda/2) \sim \mathrm{Po}(\lambda)$ other new edges meeting $e$, whereas if not we have $\mathrm{Po}(\lambda)$ edges meeting $e$, conditioned to be positive, and by Lemma 3.2 this is dominated by $1 + \mathrm{Po}(\lambda)$. Thus conditioning on not having been killed at the previous step leaves at most $1 + \mathrm{Po}(\lambda)$ new edges meeting $e$, giving a probability of at least $\tfrac{1}{2}e^{-\lambda/2}$ of being killed at the next step; write $c_\lambda = 1 - \tfrac{1}{2}e^{-\lambda/2}$. It follows that $p \leq c_\lambda^{\alpha t/3}$ and so
For each old edge $e$, we associate each new edge $e'$ which meets $e$ at any point between its $(2\alpha t/3)$th and $(\alpha t)$th splits with the interval for which it meets $e$, i.e. the set of indices in $\{\alpha t/3, \ldots, \alpha t\}$ of splits after which $e$ and $e'$ meet. Denote the number of new edges meeting $e$ for an interval $I$ by $X_{e,I}$; note that $X_{e,I} \sim \mathrm{Po}(\kappa_I)$ for some $\kappa_I$ depending only on $I$, and all these are independent. We now condition on the number of old edges and which pairs $e, I$ have $X_{e,I} \geq 1$; this is sufficient information to determine whether $D$ occurs. Lemma 3.2 gives $X_{e,I} \mid D \leq_{\mathrm{st}} 1 + \mathrm{Po}(\kappa_I)$ for each $e, I$.
We can thus couple the new edges conditioned on $D$ as a subgraph of the unconditioned new edges together with at most $(N \mid D)(2\alpha t/3)^2$ additional new edges. As in Section 3.4, it follows that
Combining (3), (4), (5), (6), (7), (8), (9) and (10) using (2), we have for some $\zeta < 1$, as required for (1), which thus completes the proof of finiteness.
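The argument for event $D$ uses twice that a Poisson variable conditioned to be positive is stochastically dominated by one plus a Poisson variable of the same mean (the form in which Lemma 3.2 is invoked above). The following minimal sketch checks this domination numerically by comparing tail probabilities, and evaluates $c_\lambda = 1 - \tfrac{1}{2}e^{-\lambda/2}$. The function names, the truncation point `kmax`, and the floating-point tolerance are assumptions of this illustration; a numerical check of this kind is not a proof.

```python
import math


def poisson_pmf(lam: float, k: int) -> float:
    """P(Po(lam) = k), computed in log space for numerical stability (lam > 0)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def tail_conditioned_positive(lam: float, k: int, kmax: int = 200) -> float:
    """P(Po(lam) >= k | Po(lam) >= 1) for k >= 1, truncating the sum at kmax."""
    tail = sum(poisson_pmf(lam, j) for j in range(k, kmax))
    return tail / -math.expm1(-lam)


def tail_one_plus_poisson(lam: float, k: int, kmax: int = 200) -> float:
    """P(1 + Po(lam) >= k) = P(Po(lam) >= k - 1), truncating the sum at kmax."""
    return sum(poisson_pmf(lam, j) for j in range(k - 1, kmax))


def is_dominated(lam: float, kcheck: int = 60) -> bool:
    """Check the tail inequality for k = 1, ..., kcheck - 1 (small tolerance)."""
    return all(
        tail_conditioned_positive(lam, k) <= tail_one_plus_poisson(lam, k) + 1e-12
        for k in range(1, kcheck)
    )


def c_lambda(lam: float) -> float:
    """c_lambda = 1 - (1/2) exp(-lam / 2), as defined in the text."""
    return 1.0 - 0.5 * math.exp(-lam / 2)


if __name__ == "__main__":
    for lam in (0.3, 2.0, 7.5):
        print(lam, is_dominated(lam), c_lambda(lam))
```

Since $c_\lambda < 1$ for every $\lambda > 0$, the bound $p \leq c_\lambda^{\alpha t/3}$ decays exponentially in $t$.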

A sharp threshold for connectedness
In this section we prove Theorem 1.6, giving a sharp threshold for connectedness of $G^\bullet_t$. We show that, as for the binomial random graph, it coincides with the threshold for isolated vertices to appear. Our methods in this section will follow those of [9] closely. For both directions we will need the following simple concentration bound.
Proof. Set $n_1 = \lceil e^{t-f(t)} \rceil$ and $n_2 = \lfloor e^{t+f(t)} \rfloor$. Since $|G^\bullet_t| \sim \mathrm{Geo}(e^{-t})$, we have
We first show that isolated vertices appear soon after time $\lambda$.
Proof. For technical reasons we prove the same statement for the modified process obtained by adding $\mathrm{Po}(\lambda/2)$ extra edges at the first splitting event. This ensures that each vertex has the same probability $e^{-\lambda}$ of being isolated. Conditioned on the tree $T(t)$ defined in the proof of Theorem 1.4, [9, Lemma 7.1] applies and gives $\mathbb{P}(X \mid T(t)) \leq 2/(2 + |G^\bullet_t|e^{-\lambda})$, where $X$ is the event that no vertex is isolated. Note that $\lambda + f(\lambda)/2 < t - g(t)$, where $g(t)$ is another function satisfying $g(t) \to \infty$. By Lemma 4.1, with high probability $|G^\bullet_t| > e^{t-g(t)}$; conditional on this we have $\mathbb{P}(X) \leq 2/(2 + e^{f(\lambda)/2}) = o(1)$.
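The concentration step above rests on the geometric law of $|G^\bullet_t|$ and the cut-offs $n_1$, $n_2$. The following minimal sketch evaluates the two tail probabilities $\mathbb{P}(|G^\bullet_t| < n_1)$ and $\mathbb{P}(|G^\bullet_t| > n_2)$ for given values of $t$ and $f(t)$. It assumes the convention that $\mathrm{Geo}(p)$ is supported on $\{1, 2, \ldots\}$ with $\mathbb{P}(X = n) = p(1-p)^{n-1}$; the function name and the choice to pass $f(t)$ as a plain number `f` are likewise assumptions of this illustration.

```python
import math


def size_tail_probabilities(t: float, f: float) -> tuple[float, float]:
    """Return (P(X < n1), P(X > n2)) for X ~ Geo(exp(-t)) on {1, 2, ...}.

    n1 = ceil(exp(t - f)) and n2 = floor(exp(t + f)), as in the proof.
    """
    p = math.exp(-t)
    n1 = math.ceil(math.exp(t - f))
    n2 = math.floor(math.exp(t + f))
    log_q = math.log1p(-p)  # log(1 - p)
    # P(X < n1) = P(X <= n1 - 1) = 1 - (1 - p)**(n1 - 1)
    lower = -math.expm1((n1 - 1) * log_q)
    # P(X > n2) = (1 - p)**n2
    upper = math.exp(n2 * log_q)
    return lower, upper


if __name__ == "__main__":
    for f in (1.0, 3.0, 5.0):
        print(f, size_tail_probabilities(10.0, f))
```

Under this convention the lower tail is at most $e^{-f}$ and the upper tail is at most $\exp(-e^{f} + 1)$, so both vanish as $f \to \infty$.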
To complete the proof of Theorem 1.6, we must show that with high probability $G^\bullet_t(\lambda)$ is connected shortly before $t = \lambda$. For this we need another result from [9], but first we define some of the terms used. Fix a finite binary tree $T$ representing descendants of a marked apex vertex, and $k \in \mathbb{N}$. We say that two vertices are siblings if they have the same parent, and two pairs of siblings are $k$-cousins if they have a common ancestor which is at distance at most $k$ in $T$ from all of them. Let $G$ be a graph whose vertices are leaves of $T$. We say two siblings $x, y$ are strongly linked by $G$ if $G$ contains an edge between a descendant of $x$ and a descendant of $y$, and weakly linked by $G$ if there is some vertex $z$ of $T$ which is a sibling of one of the $k$ lowest ancestors of $x, y$, such that $G$ contains edges between a descendant of $x$ and one of $z$, and between a descendant of $y$ and one of $z$. [9, Lemma 7.2] says that the following set of conditions is sufficient for $G$ to be connected: (i) every pair of siblings in $T$ is either strongly linked or weakly linked by $G$; (ii) of every two pairs which are $k$-cousins, at least one is strongly linked by $G$; (iii) any pair of siblings within the top $k$ layers of $T$ are strongly linked by $G$.
Proof. Regarding $G^\bullet_t$ as a random graph on the leaves $L(t)$ of $T(t)$, we will show the conditions above hold with high probability, for some suitable $k$. Choose $\alpha' > 0$ such that $\alpha - \alpha' > 1$; then, by Lemma 4.1, with high probability $|L(t)| < e^{t+\alpha'\log t} < e^{t+\alpha'\log\lambda}$, so $|T(t)| < 2e^{t+\alpha'\log\lambda}$.
Suppose (11) holds, and set $k = \log_2 \lambda$. The probability that a particular pair of siblings fails to be strongly linked is $e^{-\lambda/2}$, and since each pair of siblings has at most $k2^k = \lambda\log_2\lambda$ pairs of $k$-cousins, the total number of ways to choose two pairs of siblings which are $k$-cousins is at most $e^{t+\alpha'\log\lambda}\,\lambda\log_2\lambda = e^{t+(1+\alpha')\log\lambda}\log_2\lambda = o(e^{\lambda})$. For each such choice, the probability that neither pair is strongly linked by $G^\bullet_t(\lambda)$ is $e^{-\lambda}$, and so with high probability (ii) holds. There are at most $\lambda$ pairs of siblings in the top $k$ layers of $T(t)$, and so (iii) also holds with high probability. Finally, for a fixed pair of siblings below this point the probability that they are not strongly or weakly linked by $G^\bullet_t$ is $e^{-\lambda/2}\bigl(1 - (1 - e^{-\lambda/4})^2\bigr)\cdots\bigl(1 - (1 - e^{-\lambda/2^{k}})^2\bigr) < 2^{k-1}e^{-\lambda(1-2^{-k})}$.