The Horton-Strahler Number of Conditioned Galton-Watson Trees

The Horton-Strahler number of a tree is a measure of its branching complexity; it is also known in the literature as the register function. We show that for critical Galton-Watson trees with finite variance conditioned to be of size $n$, the Horton-Strahler number grows as $\frac{1}{2}\log_2 n$ in probability. We further define some generalizations of this number. Among these are the rigid Horton-Strahler number and the $k$-ary register function, for which we prove asymptotic results analogous to the standard case.


Introduction
Rooted trees, i.e., connected acyclic graphs with one node distinguished as the root, are among the most important structures in graph theory and computer science. Many functions can be defined on them, one of which is the Horton-Strahler number. It was originally conceived by geologists to classify real-world river networks and has since been applied in multiple fields; for instance, it is known as the register function in computer science. The asymptotics of this number for various families of trees have received considerable attention.

Background. The Horton-Strahler number was introduced in 1945 by Robert E. Horton [15] and redefined by Arthur N. Strahler [31] in the context of hydrogeomorphology. This field represents a river network as a tree with a planar embedding, where the point furthest downstream corresponds to the root and junctions between two streams correspond to nodes in the tree. In his original work [15], Horton described a geometric decay of the number of branches of increasing Horton-Strahler order in a large river basin. Empirical findings from classical geological studies showed that, in fact, many other key physical characteristics of river networks (e.g., basin area, stream width and length, flow velocity) can be modelled using the Horton-Strahler number [27,28].

In computer science, the Horton-Strahler number is known as the register function or register number [10], modulo the value at the leaf. It is equal to $H(T) + 1$, where $H(T)$ denotes the Horton-Strahler number of the tree $T$, and corresponds to the minimum number of CPU registers needed to evaluate an expression tree. The probability and theoretical computer science communities have mostly devoted their attention to the register function of random equiprobable binary trees, also called Catalan trees. Already in 1966, Shreve [30] made some conjectures about its value based on simulations in a random topology model equivalent to a uniform distribution on planar binary trees. Flajolet et al. [13], Kemp [17] and Meir et al. [25] independently found the register function of a Catalan tree with $n$ leaves to be $\log_4 n + O(1)$. Later, Devroye and Kruszewski [7] offered a simple probabilistic proof of this result. As for other families of trees, Flajolet and Prodinger showed similar asymptotics for Motzkin trees [12].

Figure 1. A visualization of a tree with Horton-Strahler number 4, where all nodes in the tree, other than the root and the nodes forming an embedded complete binary tree, are collapsed into edges. In fact, the Horton-Strahler number of a tree is equal to the height of the largest embedded complete binary tree.
Many quantities related to the Horton-Strahler number have been studied, mostly in the Catalan tree setting. Moon and others investigated the behaviour of the bifurcation ratio, i.e., the ratio of the numbers of branches with successive Horton-Strahler numbers [26,34,35]. The Horton-Strahler numbers have also been related to the self-similar (fractal) structure of trees. They are connected to the Horton pruning operation, which iteratively erases a tree; Burd et al. [5] studied this pruning operation for critical binary Galton-Watson trees. Other references on this topic can be found in the review by Kovchegov and Zaliapin [22].
In our work, we consider a generalization of the Horton-Strahler number to general rooted trees. Another such definition for trees with any number of children was given by Auber et al. [2]. Drmota and Prodinger [8] showed that the distribution of this number for a uniformly chosen $t$-ary tree is also highly concentrated around $\log_4 n$.
Galton-Watson processes. These processes were first studied in the context of the disappearance of family names in 1845 by Bienaymé [3] and in 1874 by Galton and Watson [14]. A Galton-Watson tree [1] with offspring distribution $\xi$ is a rooted ordered tree in which each node reproduces independently according to $\xi$, i.e., has $i$ children with probability $p_i = \mathbf{P}\{\xi = i\}$. Excluding the degenerate distribution with $p_1 = 1$, it is well known that these trees are finite with probability one if and only if $\mathbf{E}\{\xi\} \le 1$. At the same time, the expected size of a tree with $\mathbf{E}\{\xi\} = 1$ is infinite. We will consider these critical trees, with mean $\mu := \mathbf{E}\{\xi\} = 1$ and variance $\sigma^2 := \mathbf{V}\{\xi\} \in (0, \infty)$.
Let $T$ denote a $\xi$-Galton-Watson tree, which from now on we will call an unconditional Galton-Watson tree. We distinguish this type of tree from the trees we study in this paper, which are conditioned to have size $|T| = n$. We will denote such a conditional tree by $T_n$. Conditional Galton-Watson trees [18] are an especially interesting structure to study, as certain offspring distributions have been shown to correspond to families of "simply-generated trees" [24], such as $k$-ary trees, Motzkin trees and planted plane trees. Picking a tree uniformly at random from such a family is thus equivalent to generating a corresponding conditional Galton-Watson tree.
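The distinction between unconditional and conditional trees can be made concrete with a small simulation. The following is a minimal sketch, not the construction used in the proofs: it assumes the tree is encoded by its preorder sequence of offspring counts, and it conditions on the size by naive rejection sampling, which is only practical for small $n$. The function names `sample_gw_degrees` and `sample_conditioned` are hypothetical.

```python
import random

def sample_gw_degrees(probs, rng, max_size):
    """Draw offspring counts in preorder until the tree closes.

    probs[i] is the assumed probability of having i children. Returns the
    degree sequence, or None if the tree is still open after max_size nodes.
    """
    degrees = []
    open_slots = 1  # nodes already announced by a parent but not yet drawn
    while open_slots > 0:
        if len(degrees) == max_size:
            return None
        d = rng.choices(range(len(probs)), weights=probs)[0]
        degrees.append(d)
        open_slots += d - 1
    return degrees

def sample_conditioned(probs, n, rng):
    """Rejection sampling: retry until the unconditional tree has exactly n nodes."""
    while True:
        degrees = sample_gw_degrees(probs, rng, max_size=n)
        if degrees is not None and len(degrees) == n:
            return degrees

# Offspring law p_0 = 1/4, p_1 = 1/2, p_2 = 1/4 (critical, variance 1/2).
catalan = [0.25, 0.5, 0.25]
degrees = sample_conditioned(catalan, 15, random.Random(2024))
```

Any tree on $n$ nodes has $n - 1$ edges, so an accepted sequence always sums to $n - 1$; this is a quick sanity check on the sampler.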
In this paper, we will show that for a critical conditional Galton-Watson tree $T_n$ with variance $\sigma^2 \in (0, \infty)$, the Horton-Strahler number of the root satisfies
$$\frac{H(T_n)}{\log_2 n} \to \frac{1}{2}$$
in probability as $n \to \infty$. This expression unifies all previously known first order results; however, higher order concentration information is not presented here. Furthermore, we present other definitions of possible Horton-Strahler numbers (see Section 5), and offer partial or full results about these numbers. For instance, included in these definitions is a $k$-ary register function, which corresponds to a computational model in which each register in a computer takes $k$ inputs to produce an output in one step. We show that, for $k \ge 3$, the $k$-ary register function of a critical conditional Galton-Watson tree grows as $\frac{\log_2 \log_2 n}{\log_2 (k/2)}$ in probability as $n \to \infty$.

Unconditional Galton-Watson Trees
We begin by determining the distribution of the Horton-Strahler number of a Galton-Watson tree with no size conditioning. These results, particularly Theorem 2, will be crucial to later proofs of the upper and lower bounds in Sections 3 and 4. Indeed, unconditional Galton-Watson trees are a part of the construction of Kesten's limit tree [19], which we will introduce and heavily use in the next section.
Let us first define some notation for an unconditional Galton-Watson tree with mean $\mu = 1$ and variance $0 < \sigma^2 < \infty$. Let the generating function of the offspring distribution $\xi$ be $f(s) = \sum_{i=0}^{\infty} p_i s^i$ on $0 \le s \le 1$, where we recall that $p_i = \mathbf{P}\{\xi = i\}$. Furthermore, let us define, for $i \in \mathbb{N}$, the probability $q_i := \mathbf{P}\{H(T) = i\}$ that the Horton-Strahler number of the root is $i$, as well as the partial sums $q_i^+$ and $q_i^-$ of these probabilities, which are involved in the recursion for $q_i$ and other elements of proofs in this section. Our first lemma formalizes the intuition that nodes with one child are irrelevant to the Horton-Strahler number, as these nodes simply pass on the number of their only child. We will show that, in fact, altering the offspring distribution by removing the probability of having one child preserves the distribution of the original Horton-Strahler number.

Lemma 1. Let $\xi$ be an offspring distribution with $\mu = 1$ and $0 < \sigma^2 < \infty$. Let $\zeta$ be an altered distribution defined by $\mathbf{P}\{\zeta = 1\} = 0$ and, for $i \ne 1$, $\mathbf{P}\{\zeta = i\} = p_i/(1 - p_1)$. Then, defining $H$ and $H'$ to be respectively the Horton-Strahler numbers of an unconditional $\xi$- and $\zeta$-Galton-Watson tree, we have $\mathbf{P}\{H = i\} = \mathbf{P}\{H' = i\}$ for all $i \in \mathbb{N}$.

This can be proven via induction on $i \in \mathbb{N}$. The details of the proof are relatively tedious and therefore relegated to Appendix A. Armed with this lemma, we will be able to simplify proofs by removing single-child nodes from any offspring distribution we are given without changing the distribution of the Horton-Strahler number; this altered distribution still has $\mu = 1$ and $0 < \sigma^2 < \infty$.
We now come to the main theorem regarding the Horton-Strahler number of unconditional Galton-Watson trees: the probability that this number equals $x$ decreases exponentially in $x$. This has already been shown for Catalan trees; see, for instance, Devroye and Kruszewski [7]. We note that a random Catalan tree can be generated as a Galton-Watson tree with offspring distribution $p_0 = 1/4$, $p_1 = 1/2$, $p_2 = 1/4$.

A Simple Proof for Catalan Trees. We can prove very simply by induction that for a Catalan tree $T$ and all integers $x \ge 0$, $\mathbf{P}\{H(T) = x\} = 2^{-(x+1)}$.

Proof. First, we have from Lemma 1 that the Horton-Strahler number of a Catalan tree is distributed identically to that of a full binary tree generated via the distribution $p_0 = 1/2$, $p_1 = 0$, $p_2 = 1/2$. The base case is trivial: $\mathbf{P}\{H(T) = 0\} = p_0 = 1/2$. Then, supposing that $\mathbf{P}\{H(T) = x - 1\} = 2^{-x}$, the recursion for $p := \mathbf{P}\{H(T) = x\}$ can be simplified to $p = \frac{1}{2}(2^{-x})$, completing the proof.
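The induction for Catalan trees can be checked with exact rational arithmetic. The sketch below is an illustration under an assumption about the recursion, which is not displayed above: a root with two children has Horton-Strahler number $x \ge 1$ if either both children have number $x - 1$, or exactly one child has number $x$ and the other has a smaller number. Writing $q_x = \mathbf{P}\{H(T) = x\}$ and $Q_x = \sum_{j < x} q_j$, this gives $q_x = p_2 (q_{x-1}^2 + 2 q_x Q_x)$, which is solved for $q_x$ in the code. The function name is hypothetical.

```python
from fractions import Fraction

def hs_law_critical_binary(x_max):
    """Exact P{H(T) = x} for x = 0..x_max with p_0 = p_2 = 1/2 and p_1 = 0.

    Assumed recursion: q_x = p_2 * (q_{x-1}^2 + 2 * q_x * Q_x), where
    Q_x = q_0 + ... + q_{x-1}; solving for q_x gives the update below.
    """
    p2 = Fraction(1, 2)
    q = [Fraction(1, 2)]  # a leaf has Horton-Strahler number 0
    for _ in range(x_max):
        below = sum(q)  # Q_x, the probability of a value strictly below x
        q.append(p2 * q[-1] ** 2 / (1 - 2 * p2 * below))
    return q

law = hs_law_critical_binary(20)
```

Under this assumed recursion, every entry `law[x]` equals `Fraction(1, 2 ** (x + 1))`, in agreement with the closed form $2^{-(x+1)}$.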
Curiously, trees generated from all other critical offspring distributions give rise to very similar formulas for their Horton-Strahler numbers!

Theorem 2. Let $T$ be an unconditional Galton-Watson tree with offspring distribution $\xi$ with $\mu = 1$ and $0 < \sigma^2 < \infty$. Then, with $x \in \mathbb{N}$, The main ingredient of the proof of this theorem is the recursion () obtained in the proof of Lemma 1. Without loss of generality, assuming that $p_1 = 0$, the probability that the Horton-Strahler number of the root is $i \in \mathbb{N}$ satisfies This yields an inequality for $q_i$ involving $q_i^+$ and $q_{i-1}$, and the result then follows from some computations. The details are relatively technical and provide little intuition; we thus include the proof in Appendix A.
We can also show that the Horton-Strahler number of any critical unconditional Galton-Watson tree (including those with infinite variance) has an exponentially decreasing upper bound on its tail probabilities.

Lower Bound via Kesten's Limit Tree
In order to prove the lower bound, we will use the notion of Kesten's limit tree [19]. This limit tree $T^\infty$ is an infinite tree consisting of a central spine and unconditional trees hanging off the spine. To define how this tree and its spine are generated, we define a new size-biased random variable $\zeta$ by $\mathbf{P}\{\zeta = i\} = i p_i$, where the $p_i$ correspond to our original offspring distribution $\xi$. This is a valid probability distribution since we are considering distributions $\xi$ with mean $\mathbf{E}\{\xi\} = \sum_{i=1}^{\infty} i p_i = 1$. The spine of Kesten's tree thus consists of one node on each level that reproduces according to $\zeta$; note that $\zeta \ge 1$, making this tree infinite. One of the children of each spine node, picked uniformly at random, is assigned to be the spine node of the next level, and all others are roots of independent unconditional Galton-Watson trees with offspring distribution $\xi$.

There is an important way in which conditional Galton-Watson trees converge to this infinite tree. For any tree $T$, denote by $\tau(T, h)$ the finite tree obtained by cutting $T$ off after level $h$. We have that for all fixed heights $h > 0$ and all trees $t$,
$$\lim_{n \to \infty} \mathbf{P}\{\tau(T_n, h) = t\} = \mathbf{P}\{\tau(T^\infty, h) = t\}.$$
In the case where the variance of $\xi$ is finite, Benedikt Stufler proved that this convergence does not in fact require the truncation height $h$ to be a constant: it can also depend on the size $n$ of the tree. Theorem 5.2 of Stufler's paper [32] states that the sequence of truncation heights $h_n$ must then satisfy $h_n = o(\sqrt{n})$.

Intuitive "proof". The view of the conditional Galton-Watson tree converging to a Kesten tree gives us the intuition for the Horton-Strahler number of the root being $\frac{1}{2}\log_2 n$. It is well known [11] that conditional Galton-Watson trees have expected height $O(\sqrt{n})$. Then, as an approximation, let us consider a Kesten tree cut off at height $\sqrt{n}/\sigma$, denoted $\tau(T^\infty, \sqrt{n}/\sigma)$. This tree has a spine of length $\sqrt{n}/\sigma$, and each spine node indexed $i = 1, \ldots, \sqrt{n}/\sigma$ has $\zeta_i - 1$ unconditional Galton-Watson trees hanging from it.
We can define the $j$-th unconditional tree hanging from spine node $i$ as $T_{ij}$, $j = 1, \ldots, \zeta_i - 1$. The Horton-Strahler number of the root then satisfies max Using Wald's identity [33] with $\mathbf{E}\{\zeta_i\} = \sigma^2 + 1$, and noting that the $T_{ij}$ are all i.i.d. and distributed as $T$, which tends to zero if $x = (1/2 + \epsilon)\log_2 n$ for some $\epsilon > 0$. For the lower bound, the following is slightly incorrect, as it assumes that each spine node has at least one hanging tree. We present it here to illustrate the main idea; see the proof of Theorem 4 for the rigorous statement.
since the unconditional trees $T_{ij}$ are i.i.d. and distributed as $T$. Then, applying Theorem 2 yields which tends to zero if $x = (1/2 - \epsilon)\log_2 n$ for $\epsilon > 0$.
We thus have that the Horton-Strahler number of Kesten's limit tree truncated at level $\sqrt{n}/\sigma$ behaves as $\frac{1}{2}\log_2 n$. Intuitively, since conditional Galton-Watson trees converge to this limit tree as $n \to \infty$, in the sense of (), the Horton-Strahler number of our conditional trees should behave in the same way as $n \to \infty$. Indeed, the lower bound for conditional Galton-Watson trees can be proven using the same method as the one we have just used in this intuitive proof.
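Two facts about the size-biased variable $\zeta$ are used in this section: its law $i p_i$ sums to one when $\mu = 1$, and $\mathbf{E}\{\zeta\} = \mathbf{E}\{\xi^2\} = \sigma^2 + 1$. The following sketch checks both with exact arithmetic for one example offspring law; the helper names and the choice of example are assumptions made only for illustration.

```python
from fractions import Fraction

def size_biased(p):
    """Law of zeta, with P{zeta = i} = i * p_i, for an offspring law p."""
    return [i * p_i for i, p_i in enumerate(p)]

def mean(law):
    """Expectation of a law given as a list indexed by the value."""
    return sum(i * w for i, w in enumerate(law))

# Example critical law: p_0 = 1/4, p_1 = 1/2, p_2 = 1/4.
p = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
zeta = size_biased(p)
variance = sum(i * i * w for i, w in enumerate(p)) - mean(p) ** 2
```

Here `zeta` puts no mass on 0, which reflects $\zeta \ge 1$, and `mean(zeta)` equals `variance + 1`, so each spine node carries $\sigma^2$ hanging trees on average.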
Theorem 4. Given a critical conditional Galton-Watson tree with offspring distribution $\xi$ with variance satisfying $0 < \sigma^2 < \infty$, then

Proof. Recall some notation: $T$ denotes an unconditional Galton-Watson tree, and $T^\infty$ denotes Kesten's limit tree. For some integer $\ell$, we can cut off $T^\infty$ by taking all the nodes on the spine up to and including the node at distance $\ell$ from the root, but no further. To this, we can add all unconditional trees hanging from these $\ell + 1$ spine nodes. This forms a finite tree that we denote $T^\infty_\ell$; a diagram is shown in Figure 2.

Figure 2. A visualization of $T^\infty_\ell$: Kesten's limit tree $T^\infty$ rooted at a node $\rho$, with its spine truncated after the spine node on level $\ell$. Each triangle represents a hanging unconditional Galton-Watson tree.
Let h(T ) denote the height of a tree T , and let T n denote the tree T conditioned to have size |T | = n.
For some $x \ge 1$, define the three probabilities Let us start with the two terms that do not depend on $x$. As discussed earlier in this section, Stufler [32] showed that Now for the first term, let $\ell = \sqrt{n}/\log n$. We recall our notation of $\zeta_i$ as the number of children of the $i$-th node on the spine, and further define $T_{ij}$ to be the $j$-th Galton-Watson tree hanging from this $i$-th spine node. These unconditional trees are i.i.d. and distributed as $T$. We thus have and by Wald's identity [33], as $\mathbf{E}\{\zeta\} = \sigma^2 + 1$, Finally, using Kolmogorov's estimate [20,23], this grows as which approaches zero as $n \to \infty$.
For the third term, note that the first and second events included in the probability imply that the truncated Kesten limit tree $T^\infty_\ell$ at $\ell = \sqrt{n}/\log n$ is completely included in our conditional Galton-Watson tree $T_n$. This inclusion implies that $H(T^\infty_\ell) \le H(T_n)$, which yields Note that this is now exactly the form of what we had in the intuitive proof! We can thus follow the steps outlined in the derivation of (). Let $T_{ij}$ again denote the $j$-th unconditional Galton-Watson tree hanging from the $i$-th spine node. Let $N = \sum_{i=0}^{\ell} (\zeta_i - 1)$ be the number of hanging trees, which has mean $\mathbf{E}\{N\} = (\ell + 1)\sigma^2$. Note that the hanging trees are i.i.d. and distributed as $T$, and that the number $N$ is a sum of $\ell + 1$ i.i.d. random variables. Therefore, using the law of large numbers, we can bound We then have which tends to zero as $n \to \infty$ for $x = (1/2 - \epsilon)\log_2 n$, $\epsilon \in (0, 1/2)$.
Thus, modulo some details regarding the convergence of the conditional Galton-Watson tree to Kesten's limit tree, the intuitive proof idea miraculously works to show the lower bound of our result. However, the upper bound cannot be shown following this proof sketch; the contributions of terms underneath any given cutoff cannot be ignored. We will instead offer a proof based on the construction of rotationally invariant events.

Upper Bound via a Rotationally Invariant Event
For the upper bound, we note that in order for a tree to have a Horton-Strahler number equal to k, we must be able to embed a complete binary tree of height k in the original tree (see Figure 1).
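Since a complete binary tree of height $k$ has $2^{k+1} - 1$ nodes, we therefore immediately have a deterministic upper bound of $H_n \le \log_2 \frac{n+1}{2}$ for the Horton-Strahler number $H_n$ of any tree of size $n$. We seek to do better than this.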
Random walk view of a Galton-Watson tree. Numbering the nodes of a Galton-Watson tree $T$ in preorder, node $i$ has a number of children $\xi_i$, and the $\xi_i$ are independent and distributed as $\xi$. This sequence of random variables defines a tree of size
$$|T| = \min\Big\{t \ge 1 : \sum_{i=1}^{t} (\xi_i - 1) = -1\Big\}.$$
Thus, for a tree of size $|T| = n$, the event
$$A := \Big\{\sum_{i=1}^{n} (\xi_i - 1) = -1\Big\}$$
must be true, and furthermore, the random walk must stay nonnegative until the last time step, where it reaches $-1$, i.e., for all $t < n$, $\sum_{i=1}^{t} (\xi_i - 1) \ge 0$.
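The random walk view translates directly into a procedure for reading the size of the first tree off a degree sequence. The sketch below is a minimal illustration, assuming the sequence lists the numbers of children in preorder; the function name `first_tree_size` is hypothetical.

```python
def first_tree_size(degrees):
    """Size of the tree encoded by a preorder degree sequence.

    The walk with steps (degree - 1) starts at 0; the tree is complete at the
    first time the walk hits -1. Returns None if that never happens, i.e. if
    the sequence ends while the tree is still unfinished.
    """
    position = 0
    for t, degree in enumerate(degrees, start=1):
        position += degree - 1
        if position == -1:
            return t
    return None

size_a = first_tree_size([2, 0, 0])        # root with two leaf children
size_b = first_tree_size([2, 1, 0, 0, 5])  # trailing entries are ignored
size_c = first_tree_size([2, 2, 0])        # the tree is left unfinished
```

In the second example the tree closes after four nodes, so the fifth entry belongs to the next tree of the forest; in the third, the walk never reaches $-1$.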
Rotationally invariant events. Any event B on a tree T of size |T | = n is determined by the degree sequence ξ 1 , . . . , ξ n of the tree. We say that this event B satisfies rotation invariance if it remains true when applied to ξ i , . . . , ξ n , ξ 1 , . . . , ξ i−1 for all i. We have a powerful tool to deal with such events on a conditional Galton-Watson tree T . Letting A be the event defined in () and using Dwass' cycle lemma [9], it can easily be shown [4] that Note that a rotation of these random variables ξ i , . . . , ξ n , ξ 1 , . . . , ξ i−1 defines a forest in which the last tree is unfinished -let us call it the i-forest. Each of the trees in the forest is obtained as follows: for a tree starting at index i, we simply pick the first index j > i for which the degree sequence ξ i , . . . , ξ j defines a tree, i.e., satisfies () for an appropriate tree size. Denote this tree size by |T (ξ i , ξ i+1 , . . . )|. If there is no such index j ∈ {1, · · · , n}, then the tree starting at i is undefined.
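The cycle lemma behind this argument states that if a sequence of $n$ degrees sums to $n - 1$, then exactly one of its $n$ rotations encodes a tree of size $n$. The following sketch illustrates this on a small example by brute force; it is not part of the proof, and the function names are hypothetical.

```python
def first_tree_size(degrees):
    """First time the walk with steps (degree - 1) hits -1, or None."""
    position = 0
    for t, degree in enumerate(degrees, start=1):
        position += degree - 1
        if position == -1:
            return t
    return None

def tree_rotations(degrees):
    """Start indices i (0-based) whose rotation encodes one tree using all entries."""
    n = len(degrees)
    return [
        i for i in range(n)
        if first_tree_size(degrees[i:] + degrees[:i]) == n
    ]

# Six degrees summing to five, so the walk ends at -1 for every rotation.
example = [0, 2, 0, 1, 2, 0]
starts = tree_rotations(example)
```

For this example the only valid rotation starts at index 1, giving the sequence 2, 0, 1, 2, 0, 0; every other rotation closes its first tree before using all six entries.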
In wielding (), we are aided by the fact that we have an exact asymptotic limit for P{A} due to Kolchin [21]. Letting the period of In order to make use of () and () in our current setting, we must define a rotationally invariant event that is related to the Horton-Strahler number. Given some i.i.d. degree sequence ξ 1 , . . . , ξ n , for each i ∈ {1, . . . , n}, let T i be the first tree in the i-forest. Define η i to be the Horton-Strahler number of this tree: a rotationally invariant quantity. Since H(T n ) = η 1 given |T | = n, this is an upper bound to the Horton-Strahler number: The upper bound we seek to show will follow from the following theorem linking the Horton-Strahler number of a conditional Galton-Watson tree to the η 1 we just defined.
Theorem 5. Given a critical conditional Galton-Watson tree with offspring distribution $\xi$ and $0 < \sigma^2 < \infty$, for some constant $c$,

Proof. We have from () that If $\eta_i \ge x$, then there must exist some $j$ with $\eta_j \ge x - 2$ such that the tree size determined by $\xi_j, \xi_{j+1}, \ldots$ is at most $n/4$. This is because a tree with Horton-Strahler number at least $x$ contains two disjoint subtrees with number at least $x - 1$, each of which contains two disjoint subtrees with number at least $x - 2$; there are thus at least four disjoint subtrees with Horton-Strahler number greater than or equal to $x - 2$, and the smallest of these subtrees must have size $|T(\xi_j, \xi_{j+1}, \ldots)|$ at most $n/4$. We thus define for all $i$. Thus, the numerator of () satisfies Next, we define the cumulative sums $S_0 = 0$, and for $1 \le i \le n$, We define $\eta_i^*$ for $i \le n/4$ as follows:
i) if the first tree in the $i$-forest is defined within the nodes $\xi_i, \ldots, \xi_{n/2-1}$, let $\eta_i^* = \eta_i$;
ii) if the first tree in the $i$-forest is unfinished, let $\eta_i^*$ be the maximal Horton-Strahler number for any subtree of node $i$, i.e., for any tree occurring in the forest defined by $\xi_{i+1}, \ldots, \xi_{n/2-1}$.
Note that we again consider the Horton-Strahler number of any unfinished tree in this forest to be zero. As such, the tree defined by $\eta_i^*$ has size less than $n/2$, and for all $1 \le i \le n/4$, Then, defining the events the inequality () becomes We must now analyze the event $D_i$: where the third equality holds by independence of the $\xi_i$'s. In order to bound this, we make use of Rogozin's inequality [29], which we recall states that if $X_1, \ldots, X_n$ are i.i.d. random variables and p = sup for some universal constant $\alpha$. In our case, we consider offspring distributions $\xi$ satisfying $\mu = 1$ and $0 < \sigma^2 < \infty$, which guarantees $p_0 > 0$. We therefore have $p < 1$, and arrive at for some constant $c'$. Further defining the event where the last line follows from a rotational argument on $\xi_1, \ldots, \xi_{i-1}$. Then, by Cauchy-Schwarz, Thus, considering the two cases $i = 1$ and $i > 1$, Therefore, returning to the numerator of (), we have for some constant $c''$. Finally, we have from () that there exists another constant $c$ such that completing the proof.
Everything we require follows from this theorem. The following corollary gives us the upper bound of the classical Horton-Strahler number.
Corollary 6. For a critical conditional Galton-Watson tree $T$ with $0 < \sigma^2 < \infty$,

Proof. We have that where $H(T(\xi_1, \xi_2, \ldots))$ is the Horton-Strahler number of the first tree in the infinite sequence $\xi_1, \xi_2, \ldots$, i.e., the Horton-Strahler number of an unconditional Galton-Watson tree. Note that we have the inequality since there is a possibility for the first tree to be unfinished, in which case $\eta_1 = 0$. Recall that we had from Theorem 2 that as $x \to \infty$ for an unconditional Galton-Watson tree $T$. Thus, by Theorem 5, there exists a constant $c$ such that which tends to zero if $x = (1/2 + \epsilon)\log_2 n$, for any $\epsilon > 0$.

Generalizations of the Horton-Strahler Number
Our definition () is not the only possible one. In this definition, the number increments at each river branching where two rivers attain the same maximal flow. We can define various generalizations of this number for non-binary trees, ranging from less to more strict. We will discuss three additional natural definitions. All of them will be recursively defined from the values of all subtrees, and leaf nodes u with subtree size |T [u]| = 1 will always have the value 0.
i) The French Horton-Strahler number, where French refers to its source, Auber et al. [2]. If the root of the tree $T$ has $k$ children with subtrees taking values $F_1 \ge F_2 \ge \cdots \ge F_k \ge 0$ (sorted in decreasing order), then the tree has French Horton-Strahler number
ii) The Canadian Horton-Strahler number. If the root of the tree $T$ has $k$ children with subtrees taking values $C_1 \ge C_2 \ge \cdots \ge C_k \ge 0$ (sorted in decreasing order), and we have $r$ children with the maximal value $C_1 = \cdots = C_r > C_{r+1} \ge \cdots$, then the root has Canadian Horton-Strahler number $C(T) := C_1 + (r - 1)$ = max
iii) The (standard) Horton-Strahler number studied earlier in this paper was given in (). Following similar notation as given in this list, for $k$ children with subtrees taking values $H_1 \ge H_2 \ge \cdots \ge H_k \ge 0$, the Horton-Strahler number of the root is
iv) The rigid Horton-Strahler number. Again, with the same notation of $k$ children with subtrees taking values $R_1 \ge R_2 \ge \cdots \ge R_k \ge 0$, we have

Note that all these definitions coincide for binary trees. We also have the following ordering:

Lemma 7. For any tree $T$, the different Horton-Strahler numbers are ordered according to The proof proceeds by induction on the height of the tree, and is given in Appendix B.

From this lemma, we immediately get that $(1/2)\log_2 n$ is a universal lower bound for both the French and the Canadian Horton-Strahler numbers $F(T_n)$ and $C(T_n)$ of any critical conditional Galton-Watson tree $T_n$ with $0 < \sigma^2 < \infty$. Indeed, the French Horton-Strahler number $F(T_n)$ for a uniformly random $k$-ary tree $T_n$ of size $n$ was shown to satisfy in probability by Drmota and Prodinger [8]. They in fact show that $F(T_n)$ is quite concentrated about $(1/2)\log_2 n$, regardless of the value of $k \ge 2$. We recall that a uniformly random $k$-ary tree of size $n$ is a conditional Galton-Watson tree with offspring $\xi \sim \mathrm{Binomial}(k, 1/k)$.
Therefore, from what we have shown in this paper, its (standard) Horton-Strahler number also scales as (1/2) log 2 n. One may then be tempted to believe that () holds for the French Horton-Strahler number of conditional Galton-Watson trees T n generated from any offspring distribution ξ with finite variance σ 2 , but that is false. The definition of F(T n ) is quite sensitive to the degree distribution: it is easy to see that if M n is the maximal degree of any node in T n , then F(T n ) ≥ M n − 1.
Maximal degrees of conditional Galton-Watson trees are well understood; see for example Janson's complete treatment [16]. If ξ has a polynomial tail, then the maximal degree M n grows at a polynomial rate as well. For exponential tails, M n grows as a constant multiple of log n. Thus, for general critical offspring distributions, a (1/2) log 2 n upper bound for the French Horton-Strahler number does not hold. However, it seems plausible that for distributions with bounded degree or exhibiting a faster-than-exponential decrease in the tail, () would remain true.
The Canadian Horton-Strahler number $C(T_n)$ is much less sensitive than $F(T_n)$. Just like the French number, it satisfies the lower bound for all $\epsilon > 0$; but $C(T_n)$ can still be much larger than $(1/2)\log_2 n$.
Finally, from Lemma 7, the rigid Horton-Strahler number is bounded from above by the standard Horton-Strahler number, and hence asymptotically by $(1/2)\log_2 n$. We can further study it using the tools developed in this paper. We will find that it grows as either $\log_2 \log_2 n$ or $\log_2 n$, modulo constant multiplicative factors. Our results are presented in Section 6.
Another possible generalization of the Horton-Strahler can be given from the structural view of the number. We will recall the structural definition of the standard Horton-Strahler number (i.e., the register function) and define the k-ary register function for any tree T .
i) The register function (i.e., the standard Horton-Strahler number) $H(T)$ is the height of the largest complete binary tree that can be embedded in $T$.
ii) Similarly, we define the $k$-ary register function $K(T)$ for any given $k \ge 2$ to be the height of the largest complete $k$-ary tree that can be embedded in $T$.

The definition can also be written recursively. First, set the value of a leaf node $u$ with $|T[u]| = 1$ to be 0. Then, if the root of the tree $T$ has $\ell \ge k$ children with values $K_1 \ge K_2 \ge \cdots \ge K_\ell$ (sorted in decreasing order), the tree has $k$-ary register function $K(T) = \max\{K_1, K_k + 1\}$. If the root has $\ell < k$ children, then $K(T) = K_1$.
Note that as stated in the introduction, the register function corresponds to H(T ) + 1 in the literature (which amounts to letting the leaves have value 1). We omit this difference in our discussion for clarity of notation. The definitions of the regular register function and the k-ary register function coincide for k = 2. We also have that K(T ) ≤ H(T ) for any k. However, K(T ) does not fit cleanly into the chain of inequalities in Lemma 7; its relationship to the rigid Horton-Strahler number depends on the specific offspring distribution.
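The recursive view of the register functions is easy to turn into code. The sketch below is an illustration only: it assumes the recursion $K(T) = \max\{K_1, K_k + 1\}$ for a root with at least $k$ children (which follows from the embedding definition, since a complete $k$-ary tree of height $v + 1$ needs $k$ children of value at least $v$), and it represents a tree as a nested list of child subtrees, with a leaf written as the empty list. Setting $k = 2$ gives the standard Horton-Strahler number. All names are hypothetical.

```python
def k_ary_register(tree, k=2):
    """k-ary register function of a tree given as a nested list of children.

    Leaves have value 0. With child values sorted in decreasing order, a root
    with at least k children gets max(K_1, K_k + 1); otherwise it gets K_1.
    """
    values = sorted((k_ary_register(child, k) for child in tree), reverse=True)
    if not values:
        return 0
    if len(values) < k:
        return values[0]
    return max(values[0], values[k - 1] + 1)

def complete_tree(arity, height):
    """Complete tree in which every internal node has `arity` children."""
    if height == 0:
        return []
    return [complete_tree(arity, height - 1) for _ in range(arity)]

binary = complete_tree(2, 3)   # 15 nodes
ternary = complete_tree(3, 2)  # 13 nodes
h_binary = k_ary_register(binary, k=2)
k3_binary = k_ary_register(binary, k=3)
h_ternary = k_ary_register(ternary, k=2)
k3_ternary = k_ary_register(ternary, k=3)
```

The complete binary tree of height 3 has standard value 3 but ternary register function 0, since no node has three children; this is one instance of the inequality $K(T) \le H(T)$.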
The asymptotic behaviour of the k-ary register function for a conditional Galton-Watson tree can be determined quite simply using the tools developed in this paper. The result will be presented in section 7. We prove a lemma regarding the unconditional tree, and then the theorem follows by the same proof as for the rigid Horton-Strahler number.

The Rigid Horton-Strahler Number
We begin with analogs of Lemma 1 and Theorem 2 regarding unconditional Galton-Watson trees for the rigid Horton-Strahler number. Note that we only need to deal with trees satisfying $\mathbf{P}\{\xi > 2\} > 0$, since all the definitions of the Horton-Strahler number coincide for binary trees.

Lemma 8. Let $\xi$ be an offspring distribution with $\mu = 1$ and $0 < \sigma^2 < \infty$. Consider the altered distribution $\zeta$ defined in Lemma 1 with the probability of one child set to zero. Then, defining $R$ and $R'$ to be respectively the rigid Horton-Strahler numbers of an unconditional $\xi$- and $\zeta$-Galton-Watson tree, we have This lemma is once again proved via induction, with details laid out in Appendix C. It is used to show the following analog of Theorem 2 for the rigid Horton-Strahler number.

()
Otherwise, if $d > 2$, then there exist constants $\alpha_i > 0$ such that The proof of this theorem proceeds similarly to that of Theorem 2, and is included in Appendix C. Note that for binary critical trees $T$, we have $p_0 = p_2$, implying $p_1 = 1 - 2p_2$, and $\sigma^2 = p_1 + 4p_2 - 1 = 2p_2$. Therefore, $\sigma^2/(2p_2) = 1$ and, as expected, the rigid Horton-Strahler number is equal to the regular Horton-Strahler number: We can now derive asymptotics for the rigid Horton-Strahler number just as we did in Sections 3 and 4. As shown in the preceding theorem, the parameter $d$ determines whether the growth scales as $\log_2 n$ or $\log_2 \log_2 n$. The results are formalized below.
Theorem 10. Consider a critical Galton-Watson tree $T_n$ conditioned to be of size $|T| = n$, and define $d$ as in the previous theorem. If $d > 2$, we have in probability as $n \to \infty$. On the other hand, if $d = 2$, letting $\gamma = 1 + \sigma^2/(2p_2)$, in probability as $n \to \infty$.
Proof. Let us begin with the d > 2 case. The upper bound can be proven very simply.
where $T$ is an unconditional Galton-Watson tree. We can then bound it using Theorem 9: there exist constants $\alpha_i > 0$ such that This tends to zero for $x = (1 + \epsilon)\frac{\log_2 \log_2 n}{\log_2 (d/2)}$. The lower bound can be proven following the outline of the "intuitive proof" from Section 3, using the same method as Theorem 4. We have the same decomposition as in (): where I, II and III are exactly as defined in (), except with $H$'s switched for $R$'s in the definition of the third term. We showed in () and () that both I and II are $o(1)$. To upper bound III, we can once again consider the truncated Kesten limit tree $T^\infty_\ell$ at $\ell = \sqrt{n}/\log n$ depicted in Figure 2, with unconditional hanging trees $T_{ij}$ that are i.i.d. and distributed as $T$. Recall from () that the number of hanging trees $N$ satisfies We can thus bound for some $\alpha, \beta > 0$. As we wished to show, this tends to zero for $x = (1 - \epsilon)\frac{\log_2 \log_2 n}{\log_2 (d/2)}$ for any $\epsilon > 0$.
For the $d = 2$ case, we note that the form of $\mathbf{P}\{R(T) = x\}$ in () is identical to that of $\mathbf{P}\{H(T) = x\}$, except that the base of the exponent changes from 2 to $\gamma$. The proofs of the upper and lower bounds for the regular Horton-Strahler number thus translate to this case exactly. We have which tends to zero for $x = \left(\frac{1}{2\log_2 \gamma} - \epsilon\right)\log_2 n$ for any $\epsilon > 0$, completing the lower bound. For the upper bound, there exists $c$ such that which tends to zero for $x = \left(\frac{1}{2\log_2 \gamma} + \epsilon\right)\log_2 n$ for any $\epsilon > 0$.

The k-ary Register Function
The $k$-ary register function $K(T)$ was defined in () as the height of the largest complete $k$-ary tree that can be embedded in $T$. We can show that the $k$-ary register function of a critical conditional Galton-Watson tree, divided by $\log_2 \log_2 n$, converges to $(\log_2 (k/2))^{-1}$ in probability. Recall that the asymptotic behaviour of the rigid Horton-Strahler number for the unconditional tree (Theorem 9) was quite tedious to prove. In contrast, we present a relatively simple proof of the analogous result for the $k$-ary register function, albeit with an extra restriction on the moments of the offspring distribution.
Before proving this theorem, note that this is exactly the same as the tail bounds of the rigid Horton-Strahler number when d > 2; see Theorem 9, with k taking the place of d. Therefore, with minor modifications, Theorem 10 gives us the asymptotic behaviour of K(T ) for a conditional Galton-Watson tree: Corollary 12. Let k ≥ 3 and let ξ be as specified in the previous theorem. Then, letting T n denote a conditional Galton-Watson tree of size n, as n → ∞ in probability.
We can now proceed to the proof of the result about unconditional Galton-Watson trees.
Proof of Theorem 11. Let us begin by defining $q_x = \mathbf{P}\{K(T) = x\}$, as well as $\{p_i\}$, $q_x^+$ and $q_x^-$ analogously to how they were defined in previous sections. We can first solve for a finite value of $q_0$. Then, for $x > 0$, by multiple uses of the inclusion-exclusion formula, Noting that $\mathbf{E}\binom{\xi}{1} = 1$, $\mathbf{E}\binom{\xi}{2} = \sigma^2/2$ and, for any $\ell \le k$, $\mu_\ell := \mathbf{E}\binom{\xi}{\ell} < \infty$, we have for $k \ge 3$, It is easy to see that $q_x \to 0$ as $x \to \infty$. Therefore, for any $\epsilon > 0$, we can find $x^*$ such that for all $x \ge x^*$, $q_x \le \epsilon$ and $q_x^+ \le (1 + \epsilon) q_x$. We then have Picking an $\epsilon > 0$ and corresponding $x^*$ such that the term in parentheses belongs to $[\sigma^2/3, \sigma^2/2)$, for every $x \ge x^*$, i.e., Similarly, we have for the lower bound from which we deduce for $\epsilon > 0$ and corresponding $x^{**}$ chosen such that for all $x \ge x^{**}$, $\mu_k (1 + \epsilon)^k \epsilon^{k-2} \le \sigma^2/2$ and $(k + 1)\mu_{k+1}\epsilon \le \frac{\mu_k}{2}$, Therefore for all $x \ge x^{**}$. The theorem statement is obtained by taking $x \ge \max\{x^*, x^{**}\}$ and combining the two estimates () and ().
A related result. Cai and Devroye showed that the height $H_n$ of the maximal complete $k$-ary tree occurring as a terminal element in a critical Galton-Watson tree $T_n$ satisfies $H_n/\log_2\log_2 n \to 1/\log_2 k$ in probability (see Lemma 4.2, [6]). These elements are called fringe subtrees. They also showed the same behaviour for the height $H'_n$ of the maximal complete $k$-ary non-fringe tree, which is allowed to occur as a non-terminal element in $T_n$ (see Lemma 5.7, [6]).
In this paper, we allow the complete $k$-ary tree to be embedded in $T_n$ rather than to occur as an element of it. We show that asymptotically, its maximal height is still a constant multiple of $\log_2\log_2 n$. The constant is now larger than in the case analyzed by Cai and Devroye: $(\log_2(k/2))^{-1}$ rather than $(\log_2 k)^{-1}$.
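As a quick numerical illustration of how the two constants compare, the sketch below tabulates both for a few values of $k$. It assumes the embedded constant is read as $1/\log_2(k/2)$; the function names are hypothetical.

```python
import math


def embedded_constant(k: int) -> float:
    """1 / log2(k/2): constant for complete k-ary trees embedded in T_n."""
    return 1.0 / math.log2(k / 2)


def fringe_constant(k: int) -> float:
    """1 / log2(k): constant in the Cai-Devroye result for subtrees of T_n."""
    return 1.0 / math.log2(k)


if __name__ == "__main__":
    for k in (3, 4, 8, 16):
        print(k, round(embedded_constant(k), 4), round(fringe_constant(k), 4))
```

For $k = 4$, for instance, the embedded constant is $1$ while the fringe constant is $1/2$; the gap narrows as $k$ grows.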

Conclusion and Future Work
In this work, we considered the setting of critical conditional Galton-Watson trees. We showed that their Horton-Strahler number scales as $\Theta(\log_2 n)$ in probability. This result was proven using the convergence of a conditional Galton-Watson tree to Kesten's limit tree, as well as the construction of a rotationally invariant event using the random walk view of a tree.
We then defined several generalizations of the Horton-Strahler number to non-binary trees, including the rigid Horton-Strahler number and the $k$-ary register function. For the rigid Horton-Strahler case, we identified a key parameter $d$ denoting the first integer $i \ge 2$ for which the offspring distribution has nonzero probability of having $i$ children. We then used the same methods introduced earlier in the paper to prove that the $k$-ary register function and the rigid Horton-Strahler number both scale as $\Theta(\log_2\log_2 n)$, when $k \ge 3$ and $d \ge 3$ respectively.
Our main result from sections 3 and 4 generalizes all previously known first order results for the regular Horton-Strahler number. However, higher order concentration information is not presented here. It seems plausible that the variance of $H(T_n)$ is $O(1)$; such a result would be very desirable.
For the $\xi$ distribution, we have $p_1 = 0$. Thus, for the root to have a Horton-Strahler number of zero, the tree must be a path, yielding $q_0 = p_0 + p_1 p_0 + p_1^2 p_0 + \cdots = \frac{p_0}{1 - p_1}$.
Now suppose that $q'_j = q_j$ for all $j < i$. For the root to have Horton-Strahler number $i > 0$, either it has one child with this same number, or it has $\ell \ge 2$ children. In the second case, there are two further possibilities, where either the Horton-Strahler number does not change from the maximal number of the root's children, or it increases by one, with $r \ge 2$ children with Horton-Strahler number $i - 1$. We define a function $\psi(q_i, q^-_{i-1}, q_{i-1}, q^-_{i-2}, \{p_\ell\})$ to encapsulate the probability of the root having Horton-Strahler number $i$ given these two possibilities: Then, and since $q'_i$ has $\Pr\{\zeta = 1\} = 0$, where the second line was obtained from the inductive hypothesis. Thus, we have shown that $q'_i = q_i$. By induction, this holds for all $i \in \mathbb{N}$.
Proof of Theorem 2. By Theorem 1, we can without loss of generality assume that $p_1 = 0$. We have a recursion from (): Rearranging gives us and we can use the binomial theorem and the definition of the generating function $f(s)$ of $\{p_i\}$ to obtain
We now show (). From $q_i \le q_{i-1}\gamma$, we have that Then, we have giving us For the upper bound, we have that and thus Similarly to the lower bound, this gives us Now consider the map $\gamma \mapsto$ ; it has a fixed point at $\gamma = 1/2$ since $\sqrt{(1 - \sqrt{1/4})/2} = 1/2$. More precisely, let $\gamma$ be the solution of Then $\gamma = 1/2 + g(\epsilon)$ for some $g(\epsilon) > 0$, $g(\epsilon) \to 0$ as $\epsilon \to 0$. Therefore, recalling the lower bound (), we have for all $i \ge n_0(\epsilon)$, and we have shown ().

B. Proofs for Alternate Horton-Strahler Numbers
Proof of Lemma 7. We proceed by induction on the height of the tree to show (). Consider a tree $T$ whose root has $k$ children, and consider all the required orderings of the French, Canadian, standard and rigid Horton-Strahler numbers ($F_i$, $C_i$, $H_i$ and $R_i$ for $i = 1, \dots, k$) of these children. Note that for a leaf node with subtree size $|T| = 1$, the base case holds: $F(T) = C(T) = H(T) = R(T) = 0$.
i) To show $F(T) \ge C(T)$, suppose that $C_i \le F_i$ for each child $1 \le i \le k$. Suppose $C_1 = \cdots = C_r$ for some $r \in \{1, \dots, k\}$. Then, $C = C_r + (r - 1) \le F_r + (r - 1) \le F$, and we are done.
ii) To show $C(T) \ge H(T)$, suppose that $H_i \le C_i$ for each child $1 \le i \le k$. Then, in the case where $C(T) > C_1$, we are done, as $H(T) \le H_1 + 1 \le C_1 + 1 \le C(T)$. Otherwise, $C(T) = C_1$ and $r = 1$, so we have the strict ordering $C_1 > C_2 \ge \cdots$. This leads to the two cases:
• if $H_1 < C_1$, then we are again done since $H(T) \le H_1 + 1 \le C_1 = C(T)$.
All of these were shown at the root. Thus, the inequality holds by induction.
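The ordering can also be checked numerically on small trees. The sketch below is only an illustration under assumed update rules inferred from the proofs in these appendices: with $r$ the number of children attaining the maximal child value, the Canadian number is the maximum plus $r - 1$, the standard number increases by one when $r \ge 2$, and the rigid number increases by one only when there are at least two children and all of them attain the maximum. The French number is omitted because its definition does not appear in this section. The helper names are hypothetical.

```python
import random
from typing import List, Tuple

Tree = List["Tree"]  # a node is the list of its child subtrees; [] is a leaf


def numbers(tree: Tree) -> Tuple[int, int, int]:
    """Return (C, H, R) at the root: Canadian, standard and rigid numbers.

    Assumed rules, with r = number of children attaining the max child value:
      C = max + (r - 1)
      H = max + 1 if r >= 2, else max
      R = max + 1 if there are >= 2 children and all attain the max, else max
    """
    if not tree:
        return (0, 0, 0)
    kids = [numbers(child) for child in tree]
    c_vals = [kid[0] for kid in kids]
    h_vals = [kid[1] for kid in kids]
    r_vals = [kid[2] for kid in kids]
    c = max(c_vals) + c_vals.count(max(c_vals)) - 1
    h = max(h_vals) + (1 if h_vals.count(max(h_vals)) >= 2 else 0)
    all_equal = len(r_vals) >= 2 and r_vals.count(max(r_vals)) == len(r_vals)
    r = max(r_vals) + (1 if all_equal else 0)
    return (c, h, r)


def random_tree(rng: random.Random, depth: int) -> Tree:
    """Small random tree: each node has 0 to 3 children, depth is capped."""
    if depth == 0:
        return []
    return [random_tree(rng, depth - 1) for _ in range(rng.randint(0, 3))]


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(200):
        c, h, r = numbers(random_tree(rng, 5))
        assert c >= h >= r  # the ordering C >= H >= R claimed in the lemma
    print("ordering C >= H >= R held on all sampled trees")
```

A passing run is of course no substitute for the induction; it only confirms that the assumed rules are consistent with the stated ordering on the sampled trees.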

C. Proofs for the Rigid Horton-Strahler Number
Proof of Lemma 8. We prove this lemma by induction. Define $q_i$ and $q'_i$ as and recall the definitions of $q^-_i$ and $q^+_i$ offered at the start of section 2. Further recall that we defined $p_i = P\{\xi = i\}$ and $p'_i = P\{\zeta = i\}$, with $p'_1 = 0$ and The base case: we have by the same argument as in the proof of Lemma 1 that $q'_0 = \frac{p_0}{1 - p_1}$ and $q_0 = p_0 + p_1 p_0 + p_1^2 p_0 + \cdots = \frac{p_0}{1 - p_1}$.
Then suppose for induction that $q'_j = q_j$ for all $j < i$. For the root to have Horton-Strahler number $i > 0$, either it has one child with the same number, or $\ell \ge 2$ children. In this second case, either all children have the same Horton-Strahler number $i - 1$, or some number $r \in \{1, \dots, \ell - 1\}$ of children have Horton-Strahler number $i$. Defining $\psi(q_i, q_{i-1}, q^-_{i-1}, \{p_\ell\})$ as Then, we have that for the modified distribution, from the inductive hypothesis. Then, we note that since only terms $p_i$ for $i \ge 2$ are involved, completing the proof.
Proof of Theorem 9. Let $q_k = P\{R(T) = k\}$. We once again assume by Lemma 8 that $p_1 = 0$. Note that $\sigma^2$ will be involved in the proof and the results, and when the offspring distribution is changed from $\xi$ to $\zeta$ as in Lemma 8, the variance changes by a factor of $(1 - p_1)^{-1}$. However, we will find that the final form of the result is such that this change in distribution does not matter.
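The effect of passing from $\xi$ to $\zeta$ on the first two moments can be checked with exact arithmetic. The sketch below assumes that $\zeta$ is obtained by removing the mass at 1 and renormalising, i.e. $p'_1 = 0$ and $p'_i = p_i/(1 - p_1)$ for $i \ne 1$ (consistent with $q'_0 = p_0/(1 - p_1)$ above); the example offspring law and the helper names are hypothetical.

```python
from fractions import Fraction
from typing import Dict, Tuple

Law = Dict[int, Fraction]


def remove_single_child(p: Law) -> Law:
    """Assumed map xi -> zeta: drop the mass at 1 and renormalise the rest."""
    scale = 1 - p.get(1, Fraction(0))
    return {i: (Fraction(0) if i == 1 else pi / scale) for i, pi in p.items()}


def mean_and_variance(p: Law) -> Tuple[Fraction, Fraction]:
    """Exact mean and variance of a finitely supported offspring law."""
    mean = sum(i * pi for i, pi in p.items())
    second = sum(i * i * pi for i, pi in p.items())
    return mean, second - mean * mean


# Hypothetical critical offspring law: P{0} = 3/8, P{1} = 1/4, P{2} = 3/8.
xi: Law = {0: Fraction(3, 8), 1: Fraction(1, 4), 2: Fraction(3, 8)}
zeta = remove_single_child(xi)

if __name__ == "__main__":
    (m, v), (m2, v2) = mean_and_variance(xi), mean_and_variance(zeta)
    print("xi:   mean", m, "variance", v)
    print("zeta: mean", m2, "variance", v2)
    print("variance ratio", v2 / v, "vs 1/(1 - p_1) =", 1 / (1 - xi[1]))
```

In this example the mean stays equal to 1, so criticality is preserved, while the variance is multiplied by $1/(1 - p_1) = 4/3$.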
Using the same notation as in the proof of Theorem 2, we have that By the Taylor series with remainder, for some $\theta, \theta', \theta'' \in [0, 1]$, we have approximations of the terms in (): Recall that $q_k \to 0$ as $k \to \infty$ and $f^{(i)}(0) = i!\,p_i$ for all $i \ge 0$. Then, since $f$ and all its derivatives are continuous, increasing and convex on $[0, 1]$, for any $\epsilon > 0$, there is some $n_0(\epsilon)$ such that for all $k \ge n_0(\epsilon)$, for all $r \ge 1$, $p_r \le \frac{1}{r!} f^{(r)}(q_k) \le p_r(1 + \epsilon)$.
Furthermore, since $f'(1) = 1$ and $f''(1) = \sigma^2$, we also have These two facts can be used to simplify respectively the first two and the third equations in (). Then, plugging the terms back into our original form () gives an upper bound for all $k \ge n_0(\epsilon)$ of