Universal height and width bounds for random trees

We prove non-asymptotic stretched exponential tail bounds on the height of a randomly sampled node in a random combinatorial tree, which we use to prove bounds on the heights and widths of random trees from a variety of models. Our results allow us to prove a conjecture and settle an open problem of Janson (https://doi.org/10.1214/11-PS188), and nearly prove another conjecture and settle another open problem from the same work (up to a polylogarithmic factor). The key tool for our work is an equivalence in law between the degrees along the path to a random node in a random tree with given degree statistics, and a random truncation of a size-biased ordering of the degrees of such a tree. We also exploit a Poissonization trick introduced by Camarri and Pitman (https://doi.org/10.1214/EJP.v5-58) in the context of inhomogeneous continuum random trees, which we adapt to the setting of random trees with fixed degrees. Finally, we propose and justify a change to the conventions of branching process nomenclature: the name "Galton-Watson trees" should be permanently retired by the community, and replaced with the name "Bienaymé trees".


Introduction
This paper concerns the height and width of random plane trees, and applications of bounds thereof to the study of random simply generated trees and to the family trees of branching processes. Our results in particular allow us to settle two conjectures from [13], and to nearly settle two others.
By a plane tree, we mean a finite rooted tree t = (v(t), e(t)) in which the set of children of each node is endowed with a left-to-right order. The root of t is denoted r(t). The degree of a node v ∈ v(t), denoted $d_t(v)$, is its number of children in t, so leaves have degree 0 and all other nodes have strictly positive degree.
The degree statistics of t is the sequence $n_t = (n_t(c), c \ge 0)$, where $n_t(c) = |\{v \in v(t) : d_t(v) = c\}|$ is the number of nodes of t with c children. Note that a sequence $n = (n(c), c \ge 0)$ is the degree statistics of some tree if and only if $\sum_{c \ge 0} n(c) = 1 + \sum_{c \ge 0} c\,n(c)$. For such sequences, we write $T_n$ for the set of plane trees with degree statistics n, and write $T^\bullet_n = \{(t, v) : t \in T_n, v \in v(t)\}$. A marked tree is a pair (t, v) where t is a plane tree and v ∈ v(t); so the elements of $T^\bullet_n$ are precisely the marked trees with degree statistics n.
For a node v ∈ v(t), the height of v, denoted |v|, is the graph distance from v to r(t). The height of t, denoted ht(t), is $\max(|v| : v \in v(t))$. The width of t at level k, denoted wid(t, k), is $|\{v \in v(t) : |v| = k\}|$, and the width of t, denoted wid(t), is $\max(\mathrm{wid}(t, k), k \ge 0)$.
Given a sequence $n = (n(c), c \ge 0)$ of non-negative real numbers, for p > 0, we write $|n|_p = (\sum_{c \ge 0} c^p\, n(c))^{1/p}$. Note that for a plane tree t, we have $|n_t|_1 + 1 = |v(t)|$, the number of nodes of t.
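These definitions are mechanical enough to check in code. The following sketch (our own illustration, not from the paper) represents a plane tree as a dict mapping each node to its ordered list of children, and computes its degree statistics, the norm $|n|_p$, and its height and width:

```python
from collections import Counter, deque

def degree_statistics(children):
    """Degree statistics n = (n(c), c >= 0) of a plane tree given as a dict
    mapping each node to its ordered list of children; n(c) counts the
    nodes with exactly c children."""
    return Counter(len(ch) for ch in children.values())

def norm(n_stats, p):
    """|n|_p = (sum_c c^p n(c))^(1/p)."""
    return sum((c ** p) * cnt for c, cnt in n_stats.items()) ** (1.0 / p)

def height_and_width(children, root):
    """Height ht(t) = max node depth, and width wid(t) = max level size,
    computed by a breadth-first traversal from the root."""
    depth = {root: 0}
    level_sizes = Counter({0: 1})
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for w in children[v]:
            depth[w] = depth[v] + 1
            level_sizes[depth[w]] += 1
            queue.append(w)
    return max(depth.values()), max(level_sizes.values())

# A small plane tree: root 0 with children 1, 2; node 1 has one child, 3.
t = {0: [1, 2], 1: [3], 2: [], 3: []}
n = degree_statistics(t)
print(n, norm(n, 1), height_and_width(t, 0))
```

Note that `norm(n, 1)` returns 3 for this four-node tree, matching the identity $|n_t|_1 + 1 = |v(t)|$.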
We prove the following non-asymptotic tail bounds on the height of a randomly sampled node in a random plane tree with given degree statistics. For a finite set S we write $X \in_u S$ to mean that X is a uniformly random element of the set S.
Theorem 1.1. Fix degree statistics $n = (n(c), c \ge 0)$ and let $(T, V) \in_u T^\bullet_n$. Then for all $\beta \ge 17^{3/2}$,
and if n(1) = 0 then for all ℓ ≥ 1,
A related bound, recently proved by Marzouk [16, Proposition 5], strengthens the first of the two bounds stated in the preceding theorem, up to constant factors. These bounds, interesting in their own right, also have several consequences for the family trees of branching processes, which are summarized in our other main theorems, below. In order to state the theorems, we need a little more terminology.
Given a tree t, write $t^{\le k}$ for the subtree of t consisting of all nodes u ∈ v(t) with |u| ≤ k.
Let µ be a probability distribution with support N (by which we mean that µ(N) = 1). By a Bienaymé tree with offspring distribution µ, we mean the family tree T of a branching process with offspring distribution µ.¹ The law of T is uniquely determined by the property that for any plane tree t of height at most k,
$$P\{T^{\le k} = t\} = \prod_{v \in v(t)\,:\,|v| < k} \mu(d_t(v)).$$
In the preceding formula and below, we write µ(k) = µ({k}) for readability.
For n ∈ N, if $P\{|v(T)| = n\} > 0$ then we define a Bienaymé tree conditioned to have size n in the natural way: this is a random tree $T_n$ such that for any plane tree t with n vertices,
$$P\{T_n = t\} = P\{T = t \mid |v(T)| = n\}.$$
Finally, for a measure µ on R, for p > 0 we write $|\mu|_p := (\int_{\mathbb{R}} |x|^p\, \mu(dx))^{1/p}$. This agrees with the above notation $|n|_p$ for sequences $n = (n(c), c \ge 0)$, by interpreting the sequence as the discrete measure assigning mass n(c) to each non-negative integer c.

Theorem 1.2. Fix a probability distribution µ with support N, with $|\mu|_1 \le 1$ and $|\mu|_2 = \infty$. For n ∈ N, let $T_n$ be a Bienaymé tree with offspring distribution µ, conditioned to have size n, and let $V_n$ be a uniformly random node of $T_n$. Then $\mathrm{wid}(T_n)/n^{1/2} \to \infty$, $|V_n|/n^{1/2} \to 0$, and $\mathrm{ht}(T_n)/(n^{1/2} \log^3 n) \to 0$. All convergence results hold both in probability and in expectation, as n → ∞.
Theorem 1.3. Fix a probability distribution µ with support N, with $|\mu|_1 < 1$ and with $\sum_{c \ge 0} e^{tc} \mu(c) = \infty$ for all t > 0. For n ∈ N let $T_n$ be a Bienaymé tree with offspring distribution µ, conditioned to have size n, and let $V_n$ be a uniformly random node of $T_n$. Then $\mathrm{wid}(T_n)/n^{1/2} \to \infty$, $|V_n|/n^{1/2} \to 0$, and $\mathrm{ht}(T_n)/(n^{1/2} \log^3 n) \to 0$. All convergence results hold both in probability and in expectation, as n → ∞.
The results of Theorems 1.2 and 1.3 are close analogues of Conjectures 21.5 and 21.6 and Problems 21.7 and 21.8 from [13], but those conjectures are stated for the slightly more general model of simply generated trees. In Section 5 we define simply generated trees, state the aforementioned conjectures and problems precisely, and explain how to use Theorem 1.1 to prove Conjecture 21.6 and solve Problem 21.8 from [13], and to nearly prove Conjecture 21.5 and nearly solve Problem 21.7 from the same paper.² The key fact about simply generated trees is that, like conditioned Bienaymé trees, they are uniformly random conditional on their degree statistics, which allows us to apply Theorem 1.1 to them.
We also prove height and width bounds for conditioned Bienaymé trees which hold without any assumptions on the offspring distribution at all, aside from the requirement that the resulting family trees have both leaves and branch points.
Theorem 1.4. There exists a constant C > 0 such that the following holds. Fix a probability distribution µ with support N with µ(0) + µ(1) < 1. For n ∈ N, let $T_n$ be a Bienaymé tree with offspring distribution µ, conditioned to have size n, and let $V_n$ be a uniformly random node of $T_n$. Then

1.1. Discussion. There is a substantial amount of past work on the heights and widths of random Bienaymé trees and random combinatorial trees [1-3, 14, 15], and bounds on these quantities, particularly the height, often feature in scaling limit theorems for random trees and associated objects [4, 5, 8, 17, 19]. The works [1-3] all bound the height via the study of the depth-first exploration process of the tree. This technique gives bounds which are frequently tight, up to constant factors, for trees whose offspring distributions are sufficiently light tailed (e.g. with finite variance). However, it does not appear well-suited to studying trees with heavy-tailed degrees (in which case the depth-first queue length is a poor proxy for the height).

² "Nearly prove" and "nearly solve" rather than "prove" and "solve" due to the presence of a log³ n factor in two of our bounds.
For critical conditioned Bienaymé trees with finite variance ($|\mu|_1 = 1$, $|\mu|_2 < \infty$), sub-Gaussian tail bounds for $n^{-1/2}\,\mathrm{ht}(T_n)$ and $n^{-1/2}\,\mathrm{wid}(T_n)$ are known [3]. However, the authors of that paper state that they "are not aware of any results [for the height and width] that hold for arbitrary offspring distributions." As far as the authors of the current paper are aware, this is still the case, and this paper is the first work to provide such results.
We do not expect that the stretched exponential tail bound of Theorem 1.1 is tight. However, it is not completely clear what form an optimal bound ought to take. We now record some observations which limit how quickly the optimal bounds can decay, to help provide a sense of the potential complexities. These observations in particular show that the exponents 1/3 and 2/3 in Theorem 1.1 cannot be replaced by any values strictly greater than 1, which means that one cannot hope for sub-Gaussian tail bounds like those we prove for degree statistics with n(1) = 0 to hold in general. (The computations underlying the observations are not fully spelled out here but are not too complicated.)

First, fix α ∈ (1, 2), and suppose that $|\mu|_1 = 1$ and $\mu(k, \infty) = (1 + o(1))\,c k^{-\alpha}$ as k → ∞, so µ is a critical offspring distribution in the domain of attraction of an α-stable law. In this setting, it is known [11, Theorem 1.5] that as first n → ∞, then c → ∞. Such a tree $T_n$ will typically have $\Theta(n k^{-\alpha - 1})$ nodes of degree k for $k \le n^{1/\alpha}$ and no nodes of degree much larger than $n^{1/\alpha}$, and so will satisfy $|n_{T_n}|_2^2 \asymp n^{2/\alpha}$. Since α can be taken arbitrarily close to 1, comparing the upper bound from Theorem 1.1 with (1.1) shows that the exponent 2/3 in Theorem 1.1 cannot be replaced with anything strictly greater than 1.
Second, consider degree statistics of the form $n = (k, k, 0, \ldots, 0, 1, 0, \ldots)$, corresponding to a tree with n = 2k + 1 nodes, with a single node of degree k, and k nodes each of degrees 0 and 1. For such degree statistics, $|n|_1/(|n|_2^2 - n(1))^{1/2} = \Theta(1)$. Moreover, it is not hard to see that with high probability a random tree with these degree statistics has height Θ(log n), so there is δ > 0 such that the probability that a randomly sampled node has height at least δ log n is at least (δ log n)/n. Combining these observations shows that neither the exponent 1/3 nor the exponent 2/3 in the first bound in Theorem 1.1 can in general be replaced with anything greater than 1. (Marzouk's result [16, Proposition 5] shows that one can essentially replace both constants 1/3 and 2/3 by 1; the above observations show that this is then best possible.)

We conclude the discussion with a word about Theorem 1.4. The dependence on µ(0) and µ(1) in that theorem is necessary; if µ(0) + µ(1) = 1 then with probability one $T_n$ is a path with n vertices, which has width 1 and height n − 1. Moreover, the form of the dependence in the theorem is essentially optimal. To see this, suppose that µ(1) = 1 − ε and µ(0) = µ(2) = ε/2. Then with high probability $T_n$ will have (1 + o(1))(1 − ε)n vertices with exactly one child. Let $\tilde{T}_n$ be the tree obtained from $T_n$ by suppressing all vertices with exactly one child, so that $\tilde{T}_n$ has only nodes with 0 or 2 children, and $T_n$ can be recovered from $\tilde{T}_n$ by subdividing edges. Then $\tilde{T}_n$ has size (1 + o(1))εn with high probability, and is a uniform binary tree conditional on its size, so has height $\Theta((\epsilon n)^{1/2})$ and width $\Theta((\epsilon n)^{1/2})$ in probability. Each edge of $\tilde{T}_n$ is subdivided $\Theta(\epsilon^{-1})$ times on average in $T_n$, from which it is easy to believe (and not too hard to prove) that $T_n$ has height $\Theta((n/\epsilon)^{1/2}) = \Theta((n/(1 - \mu(0) - \mu(1)))^{1/2})$ and width $\Theta((\epsilon n)^{1/2}) = \Theta(((1 - \mu(0) - \mu(1))\,n)^{1/2})$ in probability and in expectation.
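The Θ(1) claim for the broom-like degree statistics above (a single node of degree k, plus k leaves and k nodes of degree 1) is a one-line computation; this snippet (our own illustration) evaluates the ratio exactly:

```python
def ratio(k):
    """|n|_1 / (|n|_2^2 - n(1))^(1/2) for the degree statistics with
    n(0) = k, n(1) = k, n(k) = 1 (a tree with n = 2k + 1 nodes)."""
    n1 = 1 * k + k * 1           # |n|_1 = sum_c c * n(c) = k + k = 2k
    n2_sq = 1 * k + k * k * 1    # |n|_2^2 = sum_c c^2 * n(c) = k + k^2
    return n1 / (n2_sq - k) ** 0.5  # subtracting n(1) = k leaves k^2

print([ratio(k) for k in (10, 100, 10000)])  # 2.0 for every k
```

The ratio is identically 2, so the quantity appearing in Theorem 1.1 is indeed Θ(1) along this family.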
1.2. Notation. For a sequence $(r_n, n \ge 1)$ of real numbers, we write $r_n = o_e(1)$ if there exists c > 0 such that $r_n \le e^{-cn}$ for all n sufficiently large. Given a sequence of events $(E_n, n \ge 1)$, we say that $E_n$ occurs with very low probability (and that $E_n^c$ occurs with very high probability) if $P\{E_n\} = o_e(1)$.
2. An overview of the proofs.

2.1. A sampler for the height of the marked node. Fix degree statistics n, and let $(T, V) \in_u T^\bullet_n$. The tool which unlocks all the results of the paper is a sampling procedure which generates a random variable with the same law as |V|. To describe the sampling procedure, some notation is needed.

Given degree statistics $n = (n(c), c \ge 0)$, we say a random vector
Finally, let $M = \min(i : A_i = 1)$.
In the above proposition, note that since $D_n = 0$, when i = n we have
Finally, let $\sigma = \inf(i : B_i = 1)$,
and if n(1) = 0 then for all ℓ ≥ 1,
Note that in Theorem 2.2, we have
Since $\frac{a+1}{b+1} \ge \frac{a}{b}$ whenever $\frac{a}{b} < 1$, it follows that the random variable $\sigma = \inf(i : B_i = 1)$ stochastically dominates the random variable $M = \min(i : A_i = 1)$. Thus, upper tail bounds for σ automatically apply to M. Since
in view of this observation, the bounds of Theorem 1.1 follow immediately from those of Theorem 2.2.

2.2. Moving from random marked trees to Bienaymé trees. Once Theorem 1.1 is established, the primary work in proving the other results of the paper is to understand the degree statistics of conditioned Bienaymé trees, under various assumptions on their offspring distributions. We prove the following bounds.

Proposition 2.5. Fix a probability distribution µ with support N and with µ(0) + µ(1) < 1. For n ∈ N let $T_n$ be a Bienaymé tree with offspring distribution µ, conditioned to have size n, and let $n_{T_n}$ be the degree statistics of $T_n$. Then for any ε > 0, with very high probability

In the remainder of this section, we explain how Theorems 1.2, 1.3 and 1.4 follow from Propositions 2.3, 2.4 and 2.5 together with the bound from Theorem 1.1.
Proof of Theorem 1.2. For n ≥ 1, let $V_n$ be a uniformly random node of $T_n$. Next, fix ε > 0 small, and let

Now fix any degree statistics n with $P\{n_{T_n} = n\} > 0$,
so for all $\beta \ge 17^{3/2}$ we have
where for the second equality we have used the bound from Theorem 1.1 together with the fact that $|n|_2 \le |n|_1$, so
We now use the bound
The first term is o(1) since $E_n$ occurs with very high probability by Proposition 2.3. To bound the second term we use that for any non-negative integer random variable X and any z > 0, we have
Using the bound from (2.2), it follows that
Taking β = 1/ε, for ε small the first integral is $O(e^{-\epsilon^{-1/3}/3})$ and the second is $O(e^{-\epsilon^{-2/3}/24})$, so (2.3) gives that
Since $\epsilon^2 e^{-\epsilon^{-1/3}/3} = O(\epsilon)$ and $\epsilon^2 e^{-\epsilon^{-2/3}/24} = O(\epsilon)$ for ε > 0 small, this implies that
and so, as ε > 0 can be chosen arbitrarily small, since $|n|_1 = n - 1$, it follows that $n^{-1/2}|V_n| \to 0$ in probability and in expectation.

Next, for any ε > 0, with $C = 1 + \epsilon^{-4}$ as above, taking $\beta = (6 \log n)^3$ in (2.2), we obtain that
On the other hand, since $V_n$ is a uniformly random node of $T_n$, for any positive integer h we have
Since this holds for any ε > 0, it follows that $\mathrm{ht}(T_n)/((n-1)^{1/2} \log^3 n) \to 0$ in probability, and since

Finally, fix ε > 0. For all n large enough that $E|V_n| \le \epsilon^2 n^{1/2}$, by Markov's inequality,
On the other hand, since $V_n$ is a uniformly random node of $T_n$,
Further, if $|\{u \in v(T_n) : |u| \ge \epsilon n^{1/2}\}| < n/2$ then there are more than n/2 nodes in the first $\epsilon n^{1/2}$ levels of the tree, so $\mathrm{wid}(T_n) \ge n^{1/2}/(2\epsilon)$. It follows that,
since ε > 0 was arbitrary, $n^{-1/2}\,\mathrm{wid}(T_n) \to \infty$ in probability and in expectation.
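The pigeonhole step in the final paragraph can be spelled out in one display (a restatement under the proof's notation, added for readability):

```latex
\[
  \bigl|\{u \in v(T_n) : |u| \ge \epsilon n^{1/2}\}\bigr| < \tfrac{n}{2}
  \;\Longrightarrow\;
  \operatorname{wid}(T_n)
  \;\ge\; \frac{\#\{u \in v(T_n) : |u| < \epsilon n^{1/2}\}}
               {\#\{\text{levels } k < \epsilon n^{1/2}\}}
  \;\ge\; \frac{n/2}{\epsilon n^{1/2}}
  \;=\; \frac{n^{1/2}}{2\epsilon},
\]
```

since the maximum level size is at least the average level size over the first $\epsilon n^{1/2}$ levels.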
Theorem 1.3 follows from Proposition 2.4 in exactly the same way as Theorem 1.2 follows from Proposition 2.3, so we omit the details. The proof of Theorem 1.4 from Proposition 2.5 is quite similar but not identical, so we do provide a (somewhat terser) explanation.
Proof of Theorem 1.4. Fix any degree statistics n and let $(T, V) \in_u T^\bullet_n$. Then integrating the tail bound from Theorem 1.1 over $\beta \ge 17^{3/2}$, in essentially the same way as in the proof of Theorem 1.2, it follows that
where C > 0 is a universal constant. Since $(|n|_2^2 - n(1))^{1/2} \le |n|_2 \le |n|_1 = n - 1$, using the tail bound from Theorem 1.1 with $\beta = (6 \log n)^3$, we also obtain that
the last bound holding whenever $|n|_1$ is sufficiently large. Since
with C > 0 again a universal constant. Now let $T_n$ be a Bienaymé tree with offspring distribution µ as in the statement of Theorem 1.4, and let $V_n$ be a uniformly random node of $T_n$. By Proposition 2.5, $E_n$ occurs with very high probability.

On the event $E_n$ we have
so by (2.4) we have
where in the second inequality we have used that $P\{E_n^c\} = o_e(1)$. The lower bound on $E[\mathrm{wid}(T_n)]$ follows from this upper bound on $E|V_n|$ just as in the proof of Theorem 1.2. Finally,
the second bound holding by (2.5) and since $P\{E_n^c\} = o_e(1)$. This establishes the requisite bound on $E[\mathrm{ht}(T_n)]$, and completes the proof.

3. Proof of Proposition 2.1
We begin with some combinatorial definitions and facts which we will require for the proof. A forest is an ordered sequence $f = (t_1, \ldots, t_a)$ of plane trees. The degree statistics of f is the sequence $n_f = (n_f(c), c \ge 0)$ where $n_f(c)$ is the number of nodes of f with c children.
Fix integers 1 ≤ a ≤ n and let $n = (n(c), c \ge 0)$ be a sequence of non-negative integers with $\sum_{c \ge 0} n(c) = n$ and $\sum_{c \ge 0} c\,n(c) = n - a$. Any forest with degree statistics n has n nodes and is composed of a trees. We write $T_n$ to denote the set of forests with degree statistics n. A single tree t can be interpreted as a forest f = (t), which makes this notation agree with and extend the previously introduced notation $T_n$ for the set of trees with given degree statistics (in which case a = 1, i.e., $|n|_1 = n - 1$). By [18, Exercise 6.2.1], it holds that
$$|T_n| = \frac{a}{n}\binom{n}{n(0), n(1), \ldots}. \qquad (3.1)$$
Next write $T^\bullet_n$ for the set of forests with degree statistics n with a marked node, and $T^{(1)}_n$ for the subset of $T^\bullet_n$ where the mark is in the first tree. For any forest $f \in T_n$ there are n ways to choose a node to mark, so $|T^\bullet_n| = n\,|T_n|$. Moreover, if $(f, v) \in T^\bullet_n$ with $f = (t_1, \ldots, t_a)$ and $v \in t_i$, then $((t_i, t_{i+1}, \ldots, t_a, t_1, \ldots, t_{i-1}), v) \in T^{(1)}_n$.
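The counting formula can be sanity-checked by brute force for small trees. The sketch below (our own illustration) enumerates all plane trees with n = 5 nodes as nested tuples, and compares the count in each degree-statistics class against $\frac{1}{n}\binom{n}{n(0), n(1), \ldots}$, the single-tree case a = 1:

```python
from collections import Counter
from itertools import product
from math import factorial

def compositions(m):
    """All ordered tuples of positive integers summing to m."""
    if m == 0:
        yield ()
        return
    for first in range(1, m + 1):
        for rest in compositions(m - first):
            yield (first,) + rest

def plane_trees(n):
    """All plane trees with n nodes, as nested tuples of subtrees."""
    if n == 1:
        yield ()
        return
    for sizes in compositions(n - 1):  # sizes of the root's subtrees
        for subtrees in product(*(list(plane_trees(s)) for s in sizes)):
            yield subtrees

def degree_stats(t):
    counts, stack = Counter(), [t]
    while stack:
        node = stack.pop()
        counts[len(node)] += 1
        stack.extend(node)
    return counts

n = 5
classes = Counter()
for t in plane_trees(n):
    classes[tuple(sorted(degree_stats(t).items()))] += 1

for stats, observed in classes.items():
    multinomial = factorial(n)
    for _, cnt in stats:
        multinomial //= factorial(cnt)
    assert observed == multinomial // n  # (1/n) * multinomial, a = 1 case

assert sum(classes.values()) == 14  # the Catalan number C_4
```

The final assertion checks that the class sizes sum to the Catalan number, i.e. the total number of plane trees with 5 nodes.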
It follows that
Some of the definitions of the coming paragraph are illustrated in Figure 1. For nodes x, y of a tree t, we write x ≺ y if x is an ancestor of y in t, and for a node $z \in v(t) \setminus \{r(t)\}$, we denote its parent by p(z). Given a marked tree (t, v), the spine S(t, v) of (t, v) is the subtree of t with vertices $\{w : p(w) \prec v\} \cup \{r(t)\}$. For 0 ≤ k ≤ |v|, write $v_k$ for the unique ancestor of v with $|v_k| = k$; so $v_0 = r(t)$ and $v_{|v|} = v$. If k ≤ |v| then the k-spine $S_k(t, v)$ of (t, v) is the subtree of t with vertices $\{w : p(w) \prec v_k\} \cup \{r(t)\}$, and the marked k-spine of (t, v) is the marked tree $(S_k(t, v), v_k)$. The spinal degree sequence of $S_k(t, v)$ is $(d_t(v_0), \ldots, d_t(v_{k-1}))$. Given a sequence $d = (d_0, \ldots, d_{k-1})$ of non-negative integers and degree statistics $n = (n(c), c \ge 0)$ satisfying $\sum_{c \ge 0} c\,n(c) = \sum_{c \ge 0} n(c) - 1$, write
Using these definitions, we establish the following combinatorial result, whose probabilistic corollary is then used to prove Proposition 2.1. (This result is closely related to the backbone decomposition of trees given in [8, Proposition 4 (a)] in order to prove convergence of large random trees with fixed degrees to the Brownian continuum random tree after rescaling, under suitable assumptions on the degree sequences.)

Proof. To describe an element (t, v) of $T^\bullet_n(d)$, it is necessary and sufficient to specify the marked k-spine $(S_k(t, v), v_k)$, the subtrees of t rooted at the leaves of $S_k(t, v)$, and the identity of the mark v, which must lie within the subtree of $S_k(t, v)$ rooted at $v_k$.
The number of marked k-spines with spinal degree sequence d is $\prod_{i=0}^{k-1} d_i$, since to specify such a tree it is necessary and sufficient to indicate which of the $d_i$ children of $v_i$ equals $v_{i+1}$, for each $0 \le i \le k - 1$.
The subtrees rooted at the leaves of $S_k(t, v)$ form a rooted forest with degree statistics n − w(d), with a marked vertex in a specific tree; by (3.2) the number of such marked forests is
The result follows.
For the next corollary we introduce the falling factorial notation $(m)_k = m(m-1)\cdots(m-k+1)$.
On the event that $(D_1, \ldots, D_k) = (d_1, \ldots, d_k)$, we have M ≥ k + 1 if and only if
it follows that
To prove the theorem, we construct a size-biasing of n using a Poissonization trick similar to one introduced in [9] in the context of inhomogeneous continuum random trees. Several of the definitions of the next two paragraphs are illustrated in Figure 2.
This implies that, writing $n' = n - n(0)$, the sequence $(D(1), \ldots, D(n))$ defined by
When parsing the definitions of the coming paragraph, Figure 2 will again be useful. For each 1 ≤ i ≤ n, let $r_i = l_i + \max(0, \cdots)$, so for each interval $I_i$ which contains at least one point from $U_1, \ldots, U_\ell$, the region $C_\ell$ contains a sub-interval of $I_i$ of length 1/(n − 1). Let
By the definition of the indices $(M(\ell), \ell \ge 1)$, we have $\tau \notin \{M(\ell), \ell \ge 1\}$, since the points $(U_{M(\ell)}, \ell \ge 1)$ are precisely those which, on their arrival, land in a previously empty interval, whereas $U_\tau$ falls in a subinterval of an interval which already contains one of $U_1, \ldots, U_{\tau-1}$.
Figure 2. The black dots represent the atoms $((S_i, U_i), i \ge 1)$ of the Poisson process N. The union of the striped blue regions is the "forbidden" region up to the stopping time $S_\tau$; the projection of the striped blue regions onto the y-axis is $C_{\tau-1}$.

It follows by induction that
proving the second inequality in Theorem 2.2. We now turn to proving the first inequality in Theorem 2.2; this bound is an immediate consequence of the next proposition.
The first bound of Theorem 2.2 follows from the proposition since
The proposition's proof is where the Poisson process setup comes into its own. Write $N(t) = N([0, t] \times [0, 1])$ for the number of points of N arriving by time t, and let $N_i(t) = N([0, t] \times [l_i, r_i))$ be the number of points arriving in the interval $[l_i, r_i)$ by time t. Note that for all i ∈ [n], if $N_i(t) \ge 2$ then $\tau \le N(t)$. Thus, letting $T = \inf\{t \ge 0 : \max_{i \in [n]} N_i(t) \ge 2\}$, we have $\tau \le N(T)$. It follows that for all h ∈ N, if N(t) ≤ h and T ≤ t then τ ≤ h, so
We control the first of these probabilities using standard Poisson tail estimates. The second requires a little more work. The random variables $(N_i(t), i \in [n])$ are independent and $N_i(t)$ is Poisson($t(r_i - l_i)$)-distributed. Note that $r_i - l_i = 0$ when $d_i \le 1$. Writing $p_i = r_i - l_i$, we have
$$P\{T > t\} = \prod_{i \in [n]} P\{N_i(t) \le 1\} = \prod_{i \in [n]} (1 + p_i t)\,e^{-p_i t}.$$
(4.4)
Combining this with (4.3) and the tail bound $P\{\mathrm{Poisson}(t) > h\} \le e^{-t((h/t)\log(h/t) - h/t + 1)}$, which holds for h ≥ t and can be found in, e.g., [7, Page 23], we obtain that
We next focus on proving bounds for the second term on the right-hand side of (4.5); our approach is based on that of Lemma 9 in [9].
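The quoted Poisson tail estimate is a standard Chernoff-type bound; the following snippet (our own illustration) checks it numerically at a few parameter pairs with h ≥ t:

```python
import math

def poisson_sf(t, h):
    """P{Poisson(t) > h}, computed from the series for the CDF."""
    cdf = math.exp(-t) * sum(t ** j / math.factorial(j) for j in range(h + 1))
    return 1.0 - cdf

def chernoff_bound(t, h):
    """exp(-t((h/t) log(h/t) - h/t + 1)), valid for h >= t."""
    r = h / t
    return math.exp(-t * (r * math.log(r) - r + 1))

for t, h in [(2.0, 5), (5.0, 10), (10.0, 25)]:
    assert poisson_sf(t, h) <= chernoff_bound(t, h)
```

For example, at t = 2 and h = 5 the bound evaluates to roughly 0.21 while the true tail probability is about 0.017, so the bound is loose but correct, which is all the proof requires.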
Proof. First, note that the sum over k ≥ 2 converges for $0 \le t < 1/p_{\max}$, since the inner sum has finitely many summands, each of which decreases geometrically in k. By a Taylor expansion of log(1 + x) around x = 0 and Tonelli's theorem, it follows that
$\log g(t, d)$
Next, note that
and thus
$\log g(t, d)$
Using that $p_i = d_i/(2(n-1))$ and $p_{\max} = d_{\max}/(2(n-1))$ and the definition of v, this is precisely the bound claimed in the lemma; this completes the proof.
Proof. For $t \le (n-1)/d_{\max}$ we have
and the bound in the lemma then gives
Proof of Proposition 4.1. First, if $d_{\max} = 1$ then v = 0 and the proposition asserts a non-negative upper bound on $P\{\tau > \infty\} = 0$, so clearly holds. We thus assume that $d_{\max} > 1$ for the rest of the proof.
$\le \frac{n-1}{d_{\max}}$, by Lemma 4.2 we have
and noting that
by (4.5) we obtain that

Now suppose that $d_{\max} > (\sum_{i : \cdots})$. By construction the entries of $(d_1, \ldots, d_n)$ are non-decreasing, so $d_n = d_{\max}$. For any positive real K ≥ 2, if at least two of the points $U_1, \ldots, U_{\lfloor K \rfloor}$ lie in the interval $[l_n, r_n)$ then τ ≤ K, so
where for the second inequality we have used the lower bound on K to deduce that $\lfloor K \rfloor - 1 > K/2$ and that
Since $(\lfloor K \rfloor - 1)\,d_{\max}$, the lower bound on $d_{\max}$ implies that $K d_{\max}/(n-1) \ge x$; since $2xe^{-x/4}$ is decreasing for x ≥ 4, the bound (4.7) then implies that

To finish the proof, we combine (4.6) and (4.8) to get a bound which does not depend on the value of $d_{\max}$. Take $\beta \ge 17^{3/2}$, let $C = \beta^{1/3} > 2$ and $x = \beta^{2/3} \ge 4$. Then $2C \le \beta$ and $xC = \beta$. Whatever the value of $d_{\max}$, one of (4.6) and (4.8) applies, so we obtain that
Finally, it is straightforward to check that $e^{-y/24} + 2ye^{-y/4} \le 2e^{-y/24}$ for y ≥ 17, which combined with the previous inequality yields the first bound of the proposition.
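The elementary inequality invoked in the last step can be verified numerically; the analytic reason it holds is that $2ye^{-y/4}/e^{-y/24} = 2ye^{-5y/24}$ is decreasing for $y > 24/5$ and is already below 1 at y = 17. This check is our own illustration:

```python
import math

def holds(y):
    """Check e^{-y/24} + 2*y*e^{-y/4} <= 2*e^{-y/24}."""
    return math.exp(-y / 24) + 2 * y * math.exp(-y / 4) <= 2 * math.exp(-y / 24)

# Scan y over [17, 1000] on a fine grid; the inequality holds throughout.
assert all(holds(17 + 0.01 * i) for i in range(98301))
```

The inequality fails for small y (e.g. y = 1), so the threshold y ≥ 17 genuinely matters.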
5. Proofs of the conjectures from [13] and of Propositions 2.3, 2.4 and 2.5

The sort of random trees considered by Janson [13] are called simply generated trees; they are defined as follows. Fix non-negative real weights $w = (w_k, k \ge 0)$ with $w_0 > 0$. For a finite plane tree t, the weight of t is $w(t) = \prod_{v \in v(t)} w_{d_t(v)}$. For positive integers n write $Z_n = Z_n(w)$ for the total weight of all plane trees with n vertices. Write $\Phi(z) = \Phi_w(z) = \sum_{k \ge 0} w_k z^k$ for the generating function of the sequence w, and $\rho = \rho_w$ for the radius of convergence of Φ. For t > 0 such that Φ(t) < ∞, define the function
$$\Psi(t) = \Psi_w(t) = \frac{t\,\Phi'(t)}{\Phi(t)},$$
and set $\Psi_w(\rho) := \lim_{t \uparrow \rho} \Psi_w(t)$; Ψ is strictly increasing on [0, ρ) by [13, Lemma 3.1(i)], so this limit exists. In all cases, write $\nu = \nu(w) = \Psi_w(\rho)$. Note that Ψ(t) ∈ (0, ∞] for all t > 0, so ν = 0 if and only if ρ = 0. The questions from [13] that we address in this paper concern exclusively weight sequences with ν ≤ 1, and we assume this is the case from now on. We define $\sigma^2 = \rho\,\Psi'(\rho)$; this is a slight simplification of the definition from [13, Theorem 7.1], made possible by the assumption that ν ≤ 1.
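As a concrete instance of these definitions (our own example, assuming Janson's form $\Psi(t) = t\Phi'(t)/\Phi(t)$), take the weight sequence $w_0 = 1$ and $w_k = k^{-3}$ for k ≥ 1, for which ρ = 1 and ν < 1:

```python
def phi(z, terms=100000):
    """Phi(z) = 1 + sum_{k>=1} z^k / k^3 (truncated series; rho = 1)."""
    return 1.0 + sum(z ** k / k ** 3 for k in range(1, terms))

def phi_prime(z, terms=100000):
    """Phi'(z) = sum_{k>=1} z^(k-1) / k^2 (truncated series)."""
    return sum(z ** (k - 1) / k ** 2 for k in range(1, terms))

def psi(t):
    """Psi(t) = t * Phi'(t) / Phi(t), increasing on [0, rho)."""
    return t * phi_prime(t) / phi(t)

nu = psi(1.0)
print(round(nu, 3))  # about 0.747 < 1: a subcritical weight sequence
```

Here $\Phi(1) = 1 + \zeta(3)$ and $\Phi'(1) = \zeta(2)$ are both finite, so $\nu = \zeta(2)/(1 + \zeta(3)) \approx 0.747$, placing this weight sequence in the regime ν < 1 treated by the conjecture below.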
The following conjecture summarizes Conjectures 21.5 and 21.6 and Problems 21.7 and 21.8 from [13].
Conjecture 1 ([13]). Let $w = (w_k, k \ge 0)$ be a weight sequence with $w_0 > 0$ and with $w_k > 0$ for some k ≥ 2, and for n ≥ 0 with $Z_n(w) > 0$ let $T_n$ be a simply generated tree of size n with weight sequence w.
In all four statements, the convergence is as n → ∞ along integers n such that Z n (w) > 0.
The results of this work establish points (2) and (4) of this conjecture, and establish (1) and (3) up to polylogarithmic factors. To make these deductions, we rely on the following result from [13] about the typical degree statistics of simply generated trees.
Theorem 5.1 ([13]). Let $w = (w_k, k \ge 0)$ be a weight sequence with $w_0 > 0$ and with $w_k > 0$ for some k ≥ 2. Whenever $Z_n(w) > 0$ let $T_n$ be a simply generated tree with weight sequence w and size n.
Then $\pi = (\pi(k), k \ge 0)$ is a probability distribution, with expectation ν and variance $\sigma^2 = \rho\,\Psi'(\rho)$, and the degree statistics $n_{T_n}$ satisfy that for every integer k ≥ 0 and real ε > 0,
This theorem is essentially a special case of [13, Theorem 11.4]. The error bounds stated above are not made explicit in the statement of that theorem, but are recorded in the course of its proof (see [13, page 164]).
We also require a version of Theorem 5.1 which addresses the case that ν = ρ = 0. Before stating this result, note that if $\rho_w = 0$ then the probability distribution π defined by (5.1) has π(0) = 1 and π(k) = 0 for k > 0.

Theorem 5.2. Let $w = (w_k, k \ge 0)$ be a weight sequence with $w_0 > 0$ and with $w_k > 0$ for some k ≥ 2. Whenever $Z_n(w) > 0$ let $T_n$ be a simply generated tree with weight sequence w and size n. Suppose that $\rho_w = 0$. Then the degree statistics $n_{T_n}$ satisfy that for every real ε > 0,
This theorem asserts that when the radius of convergence of Φ is zero, with very high probability $T_n$ has n − o(n) leaves.
Proof. Fix δ > 0 and integer L > 2. We claim that
We may assume that $w_0 = 1$. (We can achieve this by multiplying all weights by $w_0^{-1}$; this does not change the distribution of $T_n$.) Now fix K large enough that $K^\delta > 2(L + 1)$. Note that if $\limsup_{k \to \infty} (\log w_k)/k = r < \infty$, then $\rho_w \ge e^{-r} > 0$; since we assume $\rho_w = 0$, it follows that $\limsup_{k \to \infty} (\log w_k)/k = \infty$, so we may further choose an integer M > 2L such that
In what follows, given a sequence n which is the degree statistics of a tree (so $\sum_{c \ge 0} n(c) = 1 + \sum_{c \ge 0} c\,n(c)$), we write $w(n) = \prod_{c \ge 0} w_c^{n(c)}$, so that if t is a tree with degree statistics n then w(t) = w(n). Now, for such a sequence n, form degree statistics $\bar{n}$ as follows. For
Then $|\bar{n}|_1 = |n|_1$ and $\sum_{c \ge 0} \bar{n}(c) = \sum_{c \ge 0} n(c)$, so $\bar{n}$ is again the degree statistics of a tree with $|n|_1 + 1$ vertices. Since $w_0 = 1$, we also have
Note that for all i ∈ N we have $n(i) - M\,m(i) \le M - 1$, so $\sum_{0 < c \le L} \bar{n}(c) \le (M-1)L$, and thus if $n \ge (M-1)L/\delta$ then $\sum_{0 < c \le L} \bar{n}(c) \le \delta n$. For such n, if $\sum_{0 < c \le L} n(c) \ge 2\delta n$, then we also have $\sum_{0 < c \le L} M \lfloor n(c)/M \rfloor \ge (\sum_{0 < c \le L} n(c)) - (M-1)L \ge \delta n$, so it follows from the previous lower bound on $w(\bar{n})/w(n)$ that
To use this bound to complete the proof, it remains to control (a) the number of degree statistics n that can give rise to a given degree statistics $\bar{n}$, and (b), for a given pair of degree statistics n and $\bar{n}$, the relative numbers of trees with these degree statistics.
To control (a), fix a sequence $n'$ which is the degree statistics of a tree of size n. Then for any degree statistics n with $\bar{n} = n'$, there are non-negative integers $m_1, \ldots, m_L$ with $\sum_{0 < c \le L} c\,m_c < n$ such that $n'(c) = n(c) - M m_c$ for 0 < c ≤ L. Moreover, n may be reconstructed from $n'$ and the values $m_1, \ldots, m_L$. It follows that
To control (b), note that if n is the degree statistics of a tree of size n, then by the formula (3.1) for the number of trees with given degree statistics, we have
where in the last inequality we have used the fact that the final fraction is a multinomial coefficient and that $\sum_{c=0}^{L} n(c) \le \sum_{c \ge 0} n(c) = n$. To conclude, write $N_n$ (resp. $\bar{N}_n$) for the set of degree statistics n with $\sum_{c \ge 0} n(c) = n = |n|_1 + 1$ and such that $\sum_{1 \le c \le L} n(c) \ge 2\delta n$ (resp. such that $\sum_{1 \le c \le L} n(c) \le \delta n$), and note that if $n \in N_n$ then $\bar{n} \in \bar{N}_n$ provided that $n \ge (M-1)L/\delta$. We have
where we have used (5.3) and (5.4) for the final bound. For each $n' \in \bar{N}_n$, there are at most $n^L$ sequences $n \in N_n$ with $\bar{n} = n'$, so for $n \ge (M-1)L/\delta$ the above bound yields
Since $K^\delta > 2(L+1)$, the term $(n(M-1)!)^L ((L+1)/K^\delta)^n$ tends to zero as n → ∞, which establishes (5.2) and completes the proof.
The next corollary is the key takeaway from Theorems 5.1 and 5.2, for the purposes of this work. In the same way that Theorems 1.2 and 1.3 follow from Propositions 2.3 and 2.4, Corollary 5.3 implies that if ν < 1, or if ν = 1 and σ² = ∞, then $\mathrm{ht}(T_n)/(n^{1/2} \log^3 n) \to 0$ and $\mathrm{wid}(T_n)/n^{1/2} \to \infty$ in probability. This proves the second and fourth points of the above conjecture and nearly proves the first and third points, up to the polylogarithmic factors.

We now argue in three cases. First, suppose that 0 < ν < 1. In this case the support of w is infinite, so for any fixed K ∈ N, $\sum_{k \le K} k\,\pi(k) < \sum_{k \ge 0} k\,\pi(k) = \nu$. It follows by Theorem 5.1 that with very high probability $\sum_{k \le K} k\,n_{T_n}(k) \le \nu n - 1$. Since
Since $\bigcap_{k=2}^{K} E_n(k)$ occurs with very high probability, the result follows.

Proposition 2.3. Fix a probability distribution µ with support N with $|\mu|_1 \le 1$ and $|\mu|_2 = \infty$. For n ∈ N let $T_n$ be a Bienaymé tree with offspring distribution µ, conditioned to have size n, and let $n_{T_n}$ be the degree statistics of $T_n$. Then for any C > 0, with very high probability $|n_{T_n}|_2^2 \ge C\,|n_{T_n}|_1$.

Proposition 2.4. Fix a probability distribution µ with support N, with $|\mu|_1 < 1$ and with $\sum_{c \ge 0} e^{tc}\mu(c) = \infty$ for all t > 0. For n ∈ N let $T_n$ be a Bienaymé tree with offspring distribution µ, conditioned to have size n, and let $n_{T_n}$ be the degree statistics of $T_n$. Then for any C > 0, with very high probability $|n_{T_n}|_2^2 \ge C\,|n_{T_n}|_1$.

Figure 1. A visualization of a spine S(t, v) and of a k-spine $S_k(t, v)$, for k = 3.
$Z_n = Z_n(w) = \sum_{\text{plane trees } t\,:\,|v(t)| = n} w(t)$, and when $Z_n > 0$ define a random tree $T_n = T_n(w)$ by $P\{T_n = t\} = w(t)/Z_n$ for plane trees t with |v(t)| = n. Then $T_n$ is called a simply generated tree of size n with weight sequence w. If $\sum_{k \ge 0} w_k = 1$ then $T_n(w)$ is distributed as a Bienaymé tree with offspring distribution w conditioned to have n vertices.
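For completeness, conditioned Bienaymé trees can be simulated by naive rejection, resampling the family tree until it has exactly n nodes (our own sketch; practical only for small n, and not the method used in this paper):

```python
import random

def bienayme_degrees(mu_probs, rng, cap):
    """Degree sequence (in breadth-first order) of the family tree of a
    branching process with offspring distribution mu; returns None if
    the tree grows beyond `cap` nodes."""
    degrees, pending = [], 1
    while pending:
        pending -= 1
        c = rng.choices(range(len(mu_probs)), weights=mu_probs)[0]
        degrees.append(c)
        pending += c
        if len(degrees) + pending > cap:
            return None
    return degrees

def conditioned_tree_degrees(mu_probs, n, rng=random.Random(0)):
    """Rejection sampler for a Bienayme tree conditioned to have size n:
    resample until the family tree has exactly n nodes."""
    while True:
        d = bienayme_degrees(mu_probs, rng, cap=n)
        if d is not None and len(d) == n:
            return d

mu = [0.5, 0.0, 0.5]            # critical binary: mu(0) = mu(2) = 1/2
d = conditioned_tree_degrees(mu, 9)
assert len(d) == 9 and sum(d) == 8   # n nodes, hence n - 1 edges
```

For this offspring distribution only odd sizes occur, and each trial is cut off once it exceeds n nodes, so the rejection loop terminates quickly for small n.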