On joint properties of vertices with a given degree or label in the random recursive tree

In this paper, we study the joint behaviour of the degree, depth and label of and graph distance between high-degree vertices in the random recursive tree. We generalise the results obtained by Eslava and extend these to include the labels of and graph distance between high-degree vertices. The analysis of both of these properties of high-degree vertices is novel, in particular in relation to the behaviour of the depth of such vertices. In passing, we also obtain results for the joint behaviour of the degree and depth of and graph distance between any fixed number of vertices with a prescribed label. This combines several isolated results on the degree and depth of and graph distance between vertices with a prescribed label already present in the literature. Furthermore, we extend these results to hold jointly for any number of fixed vertices and improve them by providing more detailed descriptions of the distributional limits. Our analysis is based on a correspondence between the random recursive tree and a representation of the Kingman $n$-coalescent.


Introduction
The random recursive tree model has, since its introduction by Na and Rapoport [30], received a wealth of interest and many properties have been studied. This wide range of topics includes, among others, the degree distribution [20,26,27], the degree of vertices with a prescribed label [7,22], the maximum degree [1,3,8,17,34], the height of the tree [32], the insertion depth of the tree [7,24], and the graph distance between vertices [9,15]. Beyond these statistics, real-world applications of random recursive trees have been considered as well [16,28,31]. See also [10,25] for two surveys on random trees that include a more extensive overview of the research literature on random recursive trees.
Different approaches to studying the random recursive tree model have been considered throughout the literature. Among the most prevalent are using the recursive definition of the model and using the fact that the random recursive tree with n vertices is a uniform tree among all increasing trees with n vertices (labelled trees where the vertices on a path from the root to any vertex have increasing labels). Other methods include continuous-time embedding in Crump-Mode-Jagers branching processes, first introduced by Athreya and Karlin for Pólya urns in [2] and later used for a wide range of recursive tree models such as the random recursive tree (see e.g. [4,18,19,32]), Pólya urns [20], and a representation of Kingman's coalescent [1,12,32].
In most studies found in the literature regarding the random recursive tree model, statistics like those mentioned above are considered in isolation, rather than studying their joint behaviour. As far as the author is aware, only a handful of papers consider the joint behaviour of different statistics for the random recursive tree. In [12], Eslava studies the depth of high-degree vertices, Banerjee and Bhamidi study the label of the vertex attaining the maximum degree in [3], and the author studies the labels of high-degree vertices in the more general weighted recursive tree model [23], of which the random recursive tree model is a particular example.
The aim of this paper is to extend what is known about the joint behaviour of several statistics of the random recursive tree. We consider, in particular, two settings. First, we study the joint behaviour of the depth and label of and graph distance between any fixed number of vertices selected uniformly at random, conditionally on having a degree that exceeds a certain quantity. We combine, extend, improve and recover the results of the author [23] (in the particular case of the random recursive tree) and Eslava [12]. We also recover the results of Addario-Berry and Eslava [1] and Eslava, the author, and Ortgiese [14] (again, in the particular case of the random recursive tree).
Let T_n denote the random recursive tree with n vertices. Eslava considers in [12] the vector (d_i^n − ⌊log_2 n⌋, (h_i^n − µ log n)/√(σ² log n))_{i∈[n]}, where d_i^n and h_i^n denote the degree and depth of the vertex with the i-th largest degree (ties broken uniformly at random), respectively, and sets µ := 1 − 1/(2 log 2), σ² := 1 − 1/(4 log 2). Eslava shows this vector converges in distribution along suitable subsequences (n_t)_{t∈N} to a marked point process on (Z ∪ {∞}) × R, where the marks are independent standard normal random variables. The author proves a similar result for the vector (d_i^n − ⌊log_2 n⌋, (ℓ_i^n − µ log n)/√((1 − σ²) log n))_{i∈[n]} in [23], where ℓ_i^n denotes the label of the vertex with degree d_i^n (ties broken uniformly at random). Again, along suitable subsequences, this vector converges in distribution to a marked point process on (Z ∪ {∞}) × R, where the marks are independent standard normal random variables. Our results here combine these to show that the vector (d_i^n − ⌊log_2 n⌋, (h_i^n − µ log n)/√(σ² log n), (ℓ_i^n − µ log n)/√((1 − σ²) log n))_{i∈[n]} converges along suitable subsequences to a marked point process on (Z ∪ {∞}) × R², where the marks are i.i.d. copies of (M√(1 − µ/σ²) + N√(µ/σ²), M), with M, N two i.i.d. standard normal random variables. This recovers both results and, additionally, reveals a novel and interesting dependence between the scaling limits of the depth and label of high-degree vertices. It describes exactly how large the largest degrees in the tree are, as well as where and when they appear in the tree. This natural extension of the current knowledge provides a rather complete picture of the behaviour of high-degree vertices.
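Reading the (partly garbled) mark pair above as (M√(1 − µ/σ²) + N√(µ/σ²), M), which is our reconstruction rather than the paper's verbatim formula, one can check that both coordinates are standard normal and compute their correlation in closed form:

```latex
\operatorname{Var}\!\Big(M\sqrt{1-\tfrac{\mu}{\sigma^2}}+N\sqrt{\tfrac{\mu}{\sigma^2}}\Big)
  = \Big(1-\tfrac{\mu}{\sigma^2}\Big)+\tfrac{\mu}{\sigma^2} = 1,
\qquad
\operatorname{Cov}\!\Big(M\sqrt{1-\tfrac{\mu}{\sigma^2}}+N\sqrt{\tfrac{\mu}{\sigma^2}},\, M\Big)
  = \sqrt{1-\tfrac{\mu}{\sigma^2}}
  = \sqrt{\tfrac{\sigma^2-\mu}{\sigma^2}}
  = \frac{1}{\sqrt{4\log 2-1}}
  \approx 0.751,
```

since σ² − µ = 1/(2 log 2) − 1/(4 log 2) = 1/(4 log 2) and σ² = (4 log 2 − 1)/(4 log 2). Under this reading, the limiting fluctuations of the depth and label of a high-degree vertex are positively correlated but not identical.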
Moreover, we also obtain the distributional convergence of the (properly rescaled) depth and label of and graph distance between any finite number of vertices selected uniformly at random, conditionally on their degrees growing infinitely large as n → ∞. The graph distance between such high-degree vertices has not been studied previously, and we are, in particular, able to characterise the limiting law of the graph distance in terms of the limiting law of the depth of these vertices.
Second, we study the joint behaviour of the degree and depth of and graph distance between any fixed number of vertices with a prescribed label. This combines, extends, improves and recovers a range of results on the degree [7,22] and depth [7,24] of and graph distance [9,15] between vertices with a prescribed label. Given any fixed k ≥ 2 vertices with labels (v i,n ) i∈ [k] such that v i,n diverges with n, we obtain the joint distributional convergence of the degree and depth of and graph distance between vertices v 1,n , . . . , v k,n . Again, we characterise the limiting law of the graph distances in terms of those of the depths of vertices v 1,n , . . . , v k,n , which is novel.
Our extensions of the aforementioned results arise mainly due to two contributions. First, we are able to analyse the joint behaviour of multiple statistics beyond what was known already in the literature. Second, we obtain these results for any finite number of vertices, whereas only a single vertex or single pair of vertices is considered in most results available to date. It is exactly the correlations that arise due to considering several statistics and many vertices at once that prove to be the most challenging aspects of the analysis. The improvement of the existing results is mostly due to the fact that considering the joint behaviour of several statistics allows us, in certain cases, to obtain more detailed descriptions of their limiting laws beyond what was known previously.
The analysis in this paper is based on the Kingman n-coalescent construction of the random recursive tree. This construction was first observed by Pittel in [32] and later recovered and used by Addario-Berry and Eslava [1], and Eslava [12,13]. This construction provides several advantages compared to the more common recursive construction of the random recursive tree. First, rather than in the recursive construction in which distinct vertices have different arrival times (which influence their degree, depth, label, and graph distance), the coalescent construction allows for a perspective in which all vertices are exchangeable. Second, the coalescent construction enables a more natural decoupling of the statistics of distinct vertices, which provides us with tools to tackle the correlations between these statistics in a more refined manner. Finally, in particular the degree, label and depth of a vertex can be expressed in terms of random numbers of coin flips, simplifying the analysis of these statistics. The degree of a vertex equals the length of the first streak of heads, the label equals the step at which the first tails occurs and the depth equals the total number of tails thrown.
Notation. Throughout the paper we use the following notation: we let N := {1, 2, . . .} denote the natural numbers, set N_0 := {0, 1, . . .} and let [t] := {i ∈ N : i ≤ t} for any t ≥ 1. For x ∈ R, we let ⌈x⌉ := inf{n ∈ Z : n ≥ x} and ⌊x⌋ := sup{n ∈ Z : n ≤ x}. For x ∈ R, k ∈ N, we let (x)_k := x(x − 1) · · · (x − (k − 1)) and (x)_0 := 1, and we use the notation d to denote a k-tuple d = (d_1, . . . , d_k) (the size of the tuple will be clear from the context), where d_1, . . . , d_k are either numbers or sets. For sequences (a_n, b_n)_{n∈N} such that b_n is positive for all n, we say that a_n = o(b_n), a_n = ω(b_n), a_n ∼ b_n, and a_n = O(b_n) if lim_{n→∞} a_n/b_n = 0, lim_{n→∞} |a_n|/b_n = ∞, lim_{n→∞} a_n/b_n = 1, and if there exists a constant C > 0 such that |a_n| ≤ Cb_n for all n ∈ N, respectively. For random variables X, (X_n)_{n∈N}, we let X_n →d X, X_n →P X and X_n →a.s. X denote convergence in distribution, convergence in probability, and almost sure convergence of X_n to X, respectively. Also, let Φ : R → (0, 1) denote the cumulative distribution function of a standard normal random variable.
We also provide a table with the most important symbols used throughout the paper and their definitions, in order of appearance.

Symbol: Definition
T_n: Random recursive tree on n vertices
d_{T_n}(u): In-degree of vertex u in T_n
dist_{T_n}(u, v): Graph distance between vertices u, v in T_n
h_{T_n}(u): Depth of vertex u in T_n (graph distance to the root, dist_{T_n}(u, 1))
v_j: j-th vertex in T_n, in decreasing order of in-degree
τ_k: The first coalescence of vertices 1, . . . , k
S_{n,1}(i): Truncated selection set of vertex i in T^{(n)}
S_{n,1}: (S_{n,1}(i))_{i∈[k]}
R_{n,1}: (R_{n,1}(i))_{i∈[k]}, where each element is an independent copy of S_{n,1}(1)
h_{n,1}(i): Truncated depth of vertex i in T^{(n)}
h_{n,2}(i): h_n(i) − h_{n,1}(i), the remaining depth

Definitions and main results
The random recursive tree model is defined as follows: Definition 2.1 (Random recursive tree model). Let (T n ) n∈N be a sequence of trees. Initialise T 1 by a root with label 1. For every n ∈ N, construct T n+1 from T n by adding a vertex with label n + 1 to T n and connecting it by a directed edge to a vertex v ∈ [n] which is selected uniformly at random.
Due to the temporal nature of the random recursive tree model, it is natural to think of the edges as directed towards the root. Throughout, for any n ∈ N and u, v ∈ [n], we write h_{T_n}(u) := dist_{T_n}(u, 1) for the depth of vertex u in T_n.
The graph distance between vertices u and v is the number of edges on the unique path between them. Here we do not take the direction of the edges into account; the direction only matters for the in-degree.
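Definition 2.1 translates directly into a short simulation: since vertex v + 1 attaches uniformly to one of the vertices 1, . . . , v, it suffices to store each vertex's parent. The following is a minimal sketch (the helper names are our own, not the paper's):

```python
import random

def random_recursive_tree(n, rng=None):
    """Parent map of a random recursive tree T_n: parent[v] is the vertex
    that v attached to on arrival; the root 1 has parent None."""
    rng = rng or random.Random()
    parent = {1: None}
    for v in range(2, n + 1):
        parent[v] = rng.randrange(1, v)  # uniform over the existing vertices [v - 1]
    return parent

def depth(parent, u):
    """h_{T_n}(u): number of edges on the path from u to the root 1."""
    d = 0
    while parent[u] is not None:
        u, d = parent[u], d + 1
    return d

def distance(parent, u, v):
    """dist_{T_n}(u, v): edges on the unique u-v path, ignoring edge directions."""
    anc = {}
    w, d = u, 0
    while w is not None:        # record u's ancestors and their distance to u
        anc[w] = d
        w, d = parent[w], d + 1
    w, d = v, 0
    while w not in anc:         # walk up from v until hitting a common ancestor
        w, d = parent[w], d + 1
    return d + anc[w]
```

In particular, distance(parent, u, 1) coincides with depth(parent, u), matching the convention h_{T_n}(u) = dist_{T_n}(u, 1) above.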
Addario-Berry and Eslava study the behaviour of high-degree vertices in the RRT in [1], and Eslava extends this to the joint convergence of the degree and depth of such high-degree vertices in [12]. We further extend this joint convergence by also including the rescaled label of the vertices in the following result.
Theorem 2.2 (Degree, depth and label of high-degree vertices in the RRT). Consider the random recursive tree (RRT) model as in Definition 2.1. Let v_1, v_2, . . . , v_n be the vertices in the RRT in decreasing order of their in-degree (where ties are split uniformly at random) and let (d_s^n, h_s^n, ℓ_s^n)_{s∈[n]} denote their in-degree, depth, and label, respectively. Fix ε ∈ [0, 1], define ε_n := log_2 n − ⌊log_2 n⌋, and let (n_t)_{t∈N} be a positive, diverging, integer-valued sequence such that ε_{n_t} → ε as t → ∞. Finally, let (P_s)_{s∈N} be the points of the Poisson point process P on R with intensity measure λ(dx) = 2^{−x} log 2 dx, ordered in decreasing order, let (M_s, N_s)_{s∈N} be two sequences of i.i.d. standard normal random variables and define µ := 1 − 1/(2 log 2) and σ² := 1 − 1/(4 log 2). Then, as t → ∞, Moreover, it provides the relation and dependence between the depth of a high-degree vertex and its label, which only becomes apparent in the second-order scaling and the limit.
Beyond studying the behaviour of vertices with 'near-maximum' degree, we are also interested in a more general setting. Here, we select k ∈ N many vertices uniformly at random from T n and condition on their degree. We can then provide the following detailed results on the joint behaviour of their depths, labels and the graph distances between them. The following result is instrumental in proving Theorem 2.2 as well.
where the (H_i)_{i∈[k]} are independent standard normal random variables. Additionally assume that, for all i ∈ [k], d_i diverges as n → ∞. Then, the tuple converges, where the (M_i, N_i)_{i∈[k]} are independent standard normal random variables.
(ii) When a i = 0 for all i ∈ [k], we obtain the behaviour of the insertion depth of k uniform vertices, as well as the graph distance between them.
(iii) The conditional convergence of the tuple in (2.1) recovers, improves, and extends the result of Eslava in [12, Theorem 1.1]. When we omit the distance between the vertices v_i, v_j and set , we obtain [12, Theorem 1.1]. Our result allows for greater freedom in the choice of the degrees d_i than the parametrised setting used by Eslava. We extend Eslava's result even further by including the graph distance between any pair of vertices and, in (2.2), by also including the labels of the vertices v_1, . . . , v_k. The latter also allows for a more precise description of the limiting distribution of the depth compared to [12, Theorem 1.1]. We observe that the scaling of the graph distance suggests that the graph distance between vertices v_i and v_j, for any distinct i, j ∈ [k], is asymptotically the sum of their depths. Though this sum is a trivial upper bound, we show that it is of the correct order by using the fact that the depth of the last common ancestor of v_i and v_j, LCA_{i,j}, forms a tight sequence of random variables (in n ∈ N).
Besides conditioning on the degree of vertices selected uniformly at random, we also have the following result on the degree and depth of and graph distance between vertices with a fixed label. Though the marginal convergence of the degree and depth of a vertex, and of the graph distance between a pair of vertices, with fixed labels has been studied previously (see [22,7,24,9,15]), we combine, extend, and improve these results by considering the joint convergence and by allowing for any number of (pairs of) vertices.
Theorem 2.6. Consider the random recursive tree model as in Definition 2.1. Fix k ∈ N and let (v_{i,n})_{i∈[k]} ∈ [n]^k be k distinct integer-valued sequences such that v_{i,n} increases with n, diverges as n → ∞ and such that the limits (c_{i,j})_{1≤i<j≤k} exist. Let (N_i)_{i∈[k]} be k independent standard normal random variables. We also define, for each i ∈ [k], , and let (Z_i)_{i∈[k]} be k independent random variables (also independent of (N_i)_{i∈[k]}) such that, for

Remark 2.7. (i) The theorem partially recovers a result from Feng, Lui, and Su [15, Theorem 1], which covers the distance between vertices i_n and n for any integer sequence (i_n)_{n∈N} with i_n ∈ [n − 1]. In our setting, we require the labels v_{i,n} to be increasing in n and to diverge with n, as we are unable to characterise the limiting distributions of the depth and degree otherwise. We also recover the less general results (compared to Feng et al.) of Dobrow [9, Theorems 3 and 4] on the graph distance between vertices i_n and n with i_n = n − 1 or i_n = ⌊λn⌋, λ ∈ (0, 1). Moreover, we are able to provide a more detailed description of the scaling limit of the distance between the vertices v_{1,n}, . . . , v_{k,n} in relation to their depths, which is not present in [15] or [9].
(ii) The theorem recovers the results of Devroye [7] and Mahmoud [24] on the insertion depth. (v) The constraint that all v_{i,n} are increasing in n arises due to a technicality, which we illustrate with the following example. Suppose k = 2 and v_{1,n} = ⌊n/2⌋ 1_{{n is even}} + ⌊n/3⌋ 1_{{n is odd}}, v_{2,n} = ⌊n/3⌋ 1_{{n is even}} + ⌊n/2⌋ 1_{{n is odd}}.
In this case, c_{1,2} = c_{2,1} = 1/√2 both exist, so that the limiting law of the graph distance can be obtained, but the limiting laws of d*_{T_n}(v_{1,n}) and d*_{T_n}(v_{2,n}) do not exist. Indeed, it holds that . Such cases are circumvented when the v_{i,n} are increasing with n. When omitting the degree, any diverging sequences (v_{i,n})_{i∈[k]} such that the (c_{i,j})_{1≤i<j≤k} exist can be considered.
The main approach to proving Theorems 2.2, 2.4, and 2.6 is to use a 'reversed-time' construction or coalescent construction of the random recursive tree, known as the Kingman n-coalescent construction (see Section 3). This construction has several advantages compared to the construction in Definition 2.1. First, the depth, degree, and label of vertices in the Kingman n-coalescent are exchangeable, which simplifies the analysis of their joint behaviour. Second, the coalescent construction simplifies dealing with correlations that appear when considering the depth, degree, and label of multiple vertices at once. In particular, it provides an elegant way to decouple the degree, depth, and label of distinct vertices. Finally, the size of the depth, degree, and label of a vertex can be understood in terms of sums of independent indicator random variables and independent fair coin flips. As a result, standard central limit theorem results can be applied to obtain the desired results.

Outline of the paper
The paper is organised as follows: We first provide some theoretical preparations, necessary to prove the theorems stated in Section 2. We provide a perspective on Theorem 2.2 in terms of marked point processes, and provide a construction of the random recursive tree, called the Kingman n-coalescent construction, that aids in the analysis of the properties of interest here. In particular, we rephrase Theorems 2.4 and 2.6 in terms of the Kingman n-coalescent in Theorems 3.5 and 3.7, respectively. Section 4 is then dedicated to developing some preliminary results based on the Kingman n-coalescent construction. These preliminary results are used in Sections 5 and 6 to obtain intermediate results on the behaviour of high-degree vertices and vertices with a given label, respectively. Finally, these intermediate results are used in Section 7 to prove Theorem 2.2 and in Section 8 to prove Theorems 2.4 and 2.6.
3. The degree, depth, and label of high-degree vertices in the random recursive tree: theoretical preparations

In this section we provide a new perspective on Theorem 2.2, alongside a different construction of the random recursive tree compared to Definition 2.1. The latter will be of aid in proving all results presented in Section 2.
To prove Theorem 2.2, we use the convergence of marked point processes. Recall that d_s^n, h_s^n and ℓ_s^n denote the degree, depth, and label of the vertex with the s-th largest degree in the random recursive tree, respectively, with s ∈ [n], where ties are split uniformly at random. Let µ := 1 − 1/(2 log 2) and σ² := 1 − 1/(4 log 2). We view the rescaled tuples as the points of a marked point process and let (χ_x, ξ_x)_{x∈P} be independent standard normal random variables. For ε ∈ [0, 1], we define the ground process P_ε on Z* and the marked process MP_ε on Z* × R², where δ is a Dirac measure. Similarly, we define

(3.2)
We then let M#_{Z*} and M#_{Z*×R²} be the spaces of boundedly finite measures on Z* and Z* × R², respectively, and observe that P^{(n)}, P_ε and MP^{(n)}, MP_ε are elements of M#_{Z*} and M#_{Z*×R²}, respectively. Theorem 2.2 is then equivalent to the weak convergence of MP^{(n_t)} to MP_ε in M#_{Z*×R²} along suitable subsequences (n_t)_{t∈N}, as we can order the points in the definition of MP^{(n)} (resp. MP_ε) in decreasing order of their degrees (resp. of the points x ∈ P). We remark that the weak convergence of P^{(n_t)} to P_ε in M#_{Z*} along subsequences has been established by Addario-Berry and Eslava in [1] (later generalised to weighted recursive trees by Eslava, the author, and Ortgiese in [14] and extended to marked point processes by the author in [23]), and that Eslava established in [12] the weak convergence along subsequences of the restriction of MP^{(n_t)} in which each mark is reduced to its first element (i.e. not considering the label). We extend these results here to the tuple of degree, depth, and label, which also reveals an interesting dependence between the limits of the rescaled depths and rescaled labels.
Recall the Poisson point process P used in the definition of P_ε in (3.1) and enumerate its points in decreasing order. That is, P_v denotes the v-th largest point of P (ties broken uniformly at random). We observe that this is well-defined, since P([x, ∞)) < ∞ almost surely for any x ∈ R. Also, let (M_v, N_v)_{v∈N} be two sequences of i.i.d. standard normal random variables. To prove the weak convergence of the marked point process MP^{(n)}, we define, for s ∈ Z and B ∈ B(R²), the counting measures in (3.3). Moreover, a similar statement holds when s_1, . . . , s_K = o(√(log n)) and a_m = 1/log 2 for all m ∈ [K]. As the counting measures defined in (3.3) are sums of indicator random variables, their factorial moments can be expressed in terms of probabilities involving k ∈ N distinct vertices selected uniformly at random. The first probability on the right-hand side is studied by Addario-Berry and Eslava in [1], and the latter is the subject of Theorem 2.4. This can in turn be used to prove Proposition 3.1, which finally leads to Theorem 2.2. We provide more details alongside the proof of Proposition 3.1 and Theorem 2.2 in Section 7.
3.1. The Kingman n-coalescent. We now provide an alternative construction of the random recursive tree (RRT), which we use to prove Theorems 2.2, 2.4 and 2.6.
This alternative construction of the RRT, (a variant of) the Kingman n-coalescent construction, was first discussed by Pittel in [32] and recovered and used by Addario-Berry and Eslava to study high degrees in RRTs [1]. Later, Eslava extended this to the joint convergence of the depth and degree of vertices with large degree [12] and also provided a more general coupled recursive construction of a tree T and a permutation σ on the labels of the vertices of T, coined Robin-Hood pruning [13]. Here, we further extend Eslava's results from [12] on the depth and degree of high-degree vertices to also include the label of and graph distance between such high-degree vertices. We also obtain results on the joint behaviour of the degree and depth of and graph distance between vertices with a given label, which combine, extend and improve several known results from the literature on the degree [22] and depth [7] of a vertex with a given label and the graph distance between vertices n and i_n, for any sequence i_n [15].
The variant of the Kingman n-coalescent we use here is a process which starts with n trees, each consisting of only a single root. At every step j = n, n − 1, . . . , 2 (counting backwards), a pair of roots is selected uniformly at random and, independently of this selection, a directed edge is formed between the two roots, each direction being equiprobable. This reduces the number of trees by one and, after completing step 2, yields a directed tree. It turns out that a particular relabelling of this directed tree yields a tree equal in law to the random recursive tree. Moreover, using the Kingman n-coalescent construction simplifies the analysis of degrees, depths, and labels in the RRT model, among other reasons because the degree, depth, and label of the vertices are exchangeable random variables in the Kingman n-coalescent.
We now formally introduce the Kingman n-coalescent construction of the random recursive tree. Let CF_n := {f : V(f) = [n]} denote the set of all forests with exactly n vertices. An n-chain is a sequence (f_n, . . . , f_1) of elements of CF_n, where for each integer 1 < j ≤ n, f_{j−1} is obtained from f_j by adding a directed edge between the roots of two trees in f_j. We write f_j = {t_j^{(1)}, . . . , t_j^{(j)}}, ordering the trees in increasing order of their smallest-labelled vertex. In particular, f_n consists of n trees, each of which is a root with no edges, and f_1 consists of exactly one tree. Also, we let r(T) denote the root of the tree T and write F_j = {T_j^{(1)}, . . . , T_j^{(j)}}.

Definition 3.2 (Kingman n-coalescent). For each 1 < j ≤ n, choose a pair {a_j, b_j} ⊆ {{a, b} : 1 ≤ a < b ≤ j} independently and uniformly at random; also let (ξ_j)_{1<j≤n} be a sequence of independent Bernoulli(1/2) random variables. Initialise the coalescent by F_n: a forest of n trees, each consisting of a root and no edges. For 1 < j ≤ n, F_{j−1} is obtained from F_j as follows: add an edge e_{j−1} between the roots r(T_j^{(a_j)}) and r(T_j^{(b_j)}), directed according to ξ_j.

See Figure 1 for an example of the process. When at step j the edge e_j = v_j u_j is directed towards u_j, we say that the associated random variable ξ_j (which we can interpret as flipping a fair coin) favours the root u_j. Similarly, we might also say that ξ_j favours w, or that the associated coin flip at step j favours w, where w is any vertex in the tree that contains u_j.
The link between the final tree in the coalescent and the RRT is as follows. Let us define the mapping σ_C : V(T^{(n)}) → [n] by σ_C(r(T^{(n)})) := 1 and, for each edge e_j = (v_j, u_j) directed from v_j to u_j, σ_C(v_j) := j + 1. As all edges are directed towards the root, v_j ≠ v_{j′} for all j ≠ j′ ∈ [n − 1], so that σ_C is well-defined. σ_C is the relabelling of T^{(n)} into an increasing tree. If we let I_n denote the set of all increasing trees on n vertices, then it is clear that the RRT is a uniform element in I_n. The most important attribute of the n-chain in the Kingman n-coalescent is that it has a uniform distribution over all possible n-chains and that the relabelling of T^{(n)} by σ_C yields a uniform element of I_n, as outlined in the following proposition.

Proposition 3.3 ([12]). The Kingman n-coalescent C is uniformly random in CF_n, the set of n-chains. Moreover, for each C = (f_n, . . . , f_1) ∈ CF_n, relabel the vertices in f_1 with σ_C to obtain a tree φ(C) ∈ I_n. Then the law of φ(C) is that of a random recursive tree of size n.
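The coalescent and the relabelling σ_C can be sketched as follows. We use one concrete reading of σ_C, chosen as an assumption so that the relabelled tree is increasing: the root of the final tree receives label 1, and the vertex that becomes a child via the edge added at step j receives label j (the function names below are our own).

```python
import random

def kingman_coalescent(n, rng=None):
    """Run the Kingman n-coalescent and return the parent map of the final
    tree T^(n) together with a relabelling sigma consistent with sigma_C."""
    rng = rng or random.Random()
    parent = {i: None for i in range(1, n + 1)}
    roots = list(range(1, n + 1))
    sigma = {}
    # Step j merges two of the j remaining roots by adding edge e_{j-1};
    # a fair coin picks the direction of the edge.
    for j in range(n, 1, -1):
        a, b = rng.sample(roots, 2)
        tail, head = (a, b) if rng.random() < 0.5 else (b, a)
        parent[tail] = head   # edge directed from tail towards head
        roots.remove(tail)    # tail is no longer a root
        sigma[tail] = j       # the new child at step j receives label j
    sigma[roots[0]] = 1       # the surviving root is relabelled 1
    return parent, sigma
```

Relabelling the parent map through sigma yields an increasing tree: the parent of any labelled vertex w ≥ 2 carries a smaller label, consistent with Proposition 3.3.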
Recall that d_{T_n}(u), h_{T_n}(u) and dist_{T_n}(u, v) denote the in-degree and depth of vertex u ∈ [n] and the graph distance between vertices u, v ∈ [n] in the random recursive tree T_n, respectively. Similarly, for a realisation of the final tree T^{(n)} in the coalescent C, let d_{T^{(n)}}(i), h_{T^{(n)}}(i) and dist_{T^{(n)}}(i, j) denote the in-degree and depth of vertex i and the graph distance between i and j, respectively, and let ℓ_{T^{(n)}}(i) := σ_C(i) denote the relabelling of vertex i, i ∈ [n]. That is, ℓ_{T^{(n)}}(i) denotes the label that vertex i in C obtains in the random recursive tree φ(C). We can then formulate the following corollary.
Moreover, jointly for all i, j ∈ N and all sets B ⊆ [n], the corresponding quantities in the RRT and in the relabelled coalescent are equal in law. In what follows, we replace the subscript T^{(n)} with n for ease of writing, since we work with the coalescent from now on instead of the RRT. As a direct result of Corollary 3.4, Theorem 2.4 follows from the following result, which is a reformulation of Theorem 2.4 in terms of the Kingman n-coalescent.

Theorem 3.5 ([12]).
where the (H_i)_{i∈[k]} are independent standard normal random variables. Additionally assume that for all i ∈ [k], d_i diverges as n → ∞. Then, the tuple converges, where the (M_i, N_i)_{i∈[k]} are independent standard normal random variables.
with an almost identical proof.
Moreover, Theorem 3.5 can be used to prove Proposition 3.1. By Corollary 3.4, we can redefine the random variables X as in (3.7). We can also reformulate Theorem 2.6 in terms of the Kingman n-coalescent. As is the case with Theorem 2.4, combining Corollary 3.4 with the following theorem immediately implies Theorem 2.6.
Theorem 3.7. Consider the Kingman n-coalescent as in Definition 3.2. Fix k ∈ N and let (ℓ_i)_{i∈[k]} ∈ [n]^k be k distinct integer-valued sequences such that ℓ_i increases with n, diverges as n → ∞ and such that, where the Z_i are independent and also independent of the (N_i)_{i∈[k]}. The tuple is conditioned on the relabelling of the vertices, despite this not being the case in Theorem 2.6. Since vertices 1, . . . , k in the Kingman n-coalescent obtain a random label in the relabelled tree φ(C) (which is equal in law to the random recursive tree by Proposition 3.3), the need to condition on their relabelling ℓ_n(i) = ℓ_i, i ∈ [k], arises.
In the next sections we analyse the Kingman n-coalescent construction to prove Theorems 3.5 and 3.7 and Proposition 3.1.

Preliminary results
In this section we provide some important intermediate results related to the Kingman n-coalescent construction, provided in Section 3. We focus on two things in this section. First, we study the evolution of the degree, depth, and label of vertices 1, . . . , k in the Kingman n-coalescent, which is an important first step in proving the theorems in Section 3. Second, we investigate the correlations between the steps j ∈ [2, n] at which vertices 1, . . . , k are selected in the coalescent.
Though the theorems presented in Section 3 are concerned with the graph distance between vertices 1, . . . , k as well as their degree, depth, and label, we do not include this in our analysis yet. While the latter quantities are easier to explicitly understand in terms of the Kingman n-coalescent, the graph distance does not lend itself to an equally elegant analysis. As it turns out, though, there is a close relation between the depth of and graph distance between the vertices 1, . . . , k which allows us to infer the scaling limit of the graph distances from the results on the depth. We make use of this relation in Section 8 when proving Theorems 3.5 and 3.7.

4.1.
Analysis of the Kingman n-coalescent. We start by introducing some notation related to the Kingman n-coalescent. For an n-chain C = (f_n, . . . , f_1) and some i, j ∈ [n], let T^{(j)}(i) denote the tree in f_j that contains vertex i. For i ∈ [n], let s_{i,j} be the indicator of the event that r(T^{(j)}(i)) ∈ {r(T_j^{(a_j)}), r(T_j^{(b_j)})}, and let h_{i,j} be the indicator that the edge e_j is directed outwards from r(T^{(j)}(i)), 2 ≤ j ≤ n. That is, s_{i,j} equals one if i is part of one of the two trees selected to merge at step j, and h_{i,j} is one if s_{i,j} is one and if the new edge e_j causes vertex i to increase its depth by one; see Figure 2.
Since the trees selected to be merged at every step are independent and uniformly distributed, the variables (s_{i,j})_{2≤j≤n} are independent Bernoulli random variables for any fixed i ∈ [n], with E[s_{i,j}] = 2/j. Similarly, since the direction of the edge e_j depends only on ξ_j, the variables (h_{i,j})_{2≤j≤n} are also independent Bernoulli random variables for any fixed i ∈ [n], with E[h_{i,j}] = 1/j. We let S_n(i) := {2 ≤ j ≤ n : s_{i,j} = 1} and set S_n(i) := |S_n(i)|. We refer to S_n(i) as the selection set of vertex i. We can express the quantities d_n(i), h_n(i) and ℓ_n(i) in terms of S_n(i) and the indicator variables (h_{i,j})_{j∈S_n(i)} as in (4.1), where we set h_{i,1} = 1 for all i ∈ [n], so that max{j ∈ [n] : h_{i,j} = 1} = 1 if there is no 2 ≤ j ≤ n such that h_{i,j} = 1 (which corresponds to vertex i being the root of T^{(n)}, so that its relabelling with σ_C as in (3.4) yields ℓ_n(i) = 1). Note that there is always a unique vertex i for which h_{i,j} = 0 for all 2 ≤ j ≤ n, so that ℓ_n(i) ≠ ℓ_n(i′) whenever i ≠ i′. Explaining (4.1) in words: the degree of a vertex i equals the length of the first streak of zeros of the indicators (h_{i,j})_{j∈S_n(i)} (traversed in decreasing order of j, i.e. in coalescent order), the relabelling of vertex i in the RRT equals the first step directly after this streak at which h_{i,j} = 1, and the depth equals the total number of steps j for which h_{i,j} = 1.

Figure 2 (from [12]): For i ∈ [n] and 2 ≤ j ≤ n, let r_j := r(T^{(j)}(i)) denote the root of the tree in f_j that contains vertex i, and suppose that s_{i,j} = 1. If e_j is directed towards r_j, then the degree of r_j increases by one in F_{j−1}. If e_j is directed outwards from r_j, then the depth of each v ∈ T^{(j)}(i) increases by one in F_{j−1}.
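The verbal description of (4.1) can be made concrete: given the steps j in the selection set of vertex i together with the coin outcomes h_{i,j}, the degree, label and depth of i are read off as below. The encoding as (j, h) pairs is our own; steps are processed in coalescent order, i.e. in decreasing j, using the conventions that the label of the root is 1 and that the label equals max{j : h_{i,j} = 1} otherwise.

```python
def degree_label_depth(flips):
    """flips: (j, h) pairs for the steps j in the selection set of a vertex i;
    h = 1 iff the edge added at step j points away from the root of i's tree.
    Returns (degree d_n(i), label l_n(i), depth h_n(i))."""
    degree, label, depth = 0, 1, 0  # label 1 is the convention for the root
    first_one_seen = False
    for j, h in sorted(flips, reverse=True):  # decreasing j = coalescent order
        if h == 1:
            depth += 1                        # every outward edge deepens vertex i
            if not first_one_seen:
                label = j                     # first step directly after the streak
                first_one_seen = True
        elif not first_one_seen:
            degree += 1                       # i is still a root gaining a child
    return degree, label, depth
```

For example, the flips [(9, 0), (7, 0), (5, 1), (3, 0), (2, 1)] give degree 2 (the streak of zeros at steps 9 and 7), label 5 and depth 2; the zero at step 3 contributes nothing, since i is no longer the root of its tree by then.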
We are interested in the behaviour of the degree, depth, and label of the vertices $1, \ldots, k$ for any fixed $k \in \mathbb{N}$. While these quantities are easily expressed in terms of the selection sets $(\mathcal{S}_n(i))_{i \in [k]}$ and the associated coin flips, as in (4.1), considering $k$ vertices introduces some additional difficulties in terms of correlations between the selection sets of these $k$ vertices. The main issue is the following: whenever two distinct vertices $i, i' \in [k]$ are both selected at the same step, say step $\lambda_{i,i'}$, there is a dependence between the outcomes of the associated coin flips of vertices $i$ and $i'$.
As these correlations between the vertices $1, \ldots, k$ are difficult to handle, we define
(4.2) $\tau_k := \max\{2 \le j \le n : \{a_j, b_j\} \subseteq [k]\}$.
Since the trees in the Kingman $n$-coalescent are ordered based on their smallest-labelled vertex, $\tau_k$ is the first step at which two vertices $i, i' \in [k]$ are both selected (in the sense that the roots of the trees they belong to are selected), and thus up to step $\tau_k$ the vertices $1, \ldots, k$ are contained in disjoint trees. As a result, the sets $[\tau_k + 1, n] \cap \mathcal{S}_n(1), \ldots, [\tau_k + 1, n] \cap \mathcal{S}_n(k)$ are disjoint, and since the associated coin flips of these disjoint sets are independent, the evolutions of the degree, depth, and label of the vertices $1, \ldots, k$ up to step $\tau_k$ are independent. This avoids the correlations and simplifies the analysis. Eslava (implicitly) shows in the proof of [12, Lemma 3.2] that $(\tau_k)_{n \in \mathbb{N}}$ is a tight sequence of random variables. As a result, for any integer-valued sequence $(t_n)_{n \in \mathbb{N}}$ which diverges to infinity as $n \to \infty$, we know that $\mathbb{P}(\tau_k < t_n) = 1 - o(1)$. This justifies, for $t_n \le n$, the definition of the sets $\mathcal{S}_{n,1}(i) := \mathcal{S}_n(i) \cap [t_n, n]$ and $\mathcal{H}_{n,1}(i) := \{j \in \mathcal{S}_{n,1}(i) : h_{i,j} = 1\}$ for each $i \in [n]$, and we let $S_{n,1}(i) := |\mathcal{S}_{n,1}(i)|$ and $h_{n,1}(i) := |\mathcal{H}_{n,1}(i)|$, $h_{n,2}(i) := h_n(i) - h_{n,1}(i)$. We refer to the sets $(\mathcal{S}_{n,1}(i))_{i \in [n]}$ as the truncated selection sets, to $h_{n,1}(i)$ as the truncated depth of vertex $i$, and to $(t_n)_{n \in \mathbb{N}}$ as the truncation sequence. Though $\mathcal{S}_{n,1}(i)$, $h_{n,1}(i)$ and $h_{n,2}(i)$ depend on $t_n$, we omit this from the notation for ease of writing. The truncated depth $h_{n,1}(i)$ and the remainder $h_{n,2}(i)$ can be described similarly to $h_n(i)$ in (4.1), as $h_{n,1}(i) = \sum_{j \in \mathcal{S}_{n,1}(i)} h_{i,j}$ and $h_{n,2}(i) = \sum_{j=2}^{t_n - 1} h_{i,j}$. The following lemma uses (4.1) to provide a description of the relation between the joint distribution of $d_n(1)$, $h_{n,1}(1)$ and $\ell_n(1)$ and the truncated selection set $\mathcal{S}_{n,1}(1)$. Since the vertices are exchangeable, as follows from Corollary 3.4, the lemma also holds for any vertex $i \in [n]$.
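The disjointness property described above can be observed in simulation: the sketch below (illustrative code, with our own naming) computes $\tau_k$ as the chronologically first step at which two trees containing vertices of $[k]$ are selected together, and then verifies that the parts of the selection sets beyond $\tau_k$ are pairwise disjoint.

```python
import random

def tau_and_selection_sets(n, k, seed=5):
    # Sketch: tau_k = chronologically first step (i.e. the largest j) at which
    # two distinct trees each containing a vertex of [k] merge.
    rng = random.Random(seed)
    trees = [{i} for i in range(1, n + 1)]       # vertex sets only
    sel = {i: set() for i in range(1, k + 1)}    # S_n(i) for i in [k]
    tau = 1
    for j in range(n, 1, -1):
        ia, ib = rng.sample(range(len(trees)), 2)
        ta, tb = trees[ia], trees[ib]
        for i in range(1, k + 1):
            if i in ta or i in tb:
                sel[i].add(j)
        if tau == 1 and min(ta) <= k and min(tb) <= k:
            tau = j                              # both merged trees meet [k]
        merged = ta | tb
        trees = [t for m, t in enumerate(trees) if m not in (ia, ib)]
        trees.append(merged)
    return tau, sel

tau, sel = tau_and_selection_sets(200, 3)
# Beyond step tau_k the selection sets of 1, ..., k are pairwise disjoint.
for a in range(1, 4):
    for b in range(a + 1, 4):
        assert not ({j for j in sel[a] if j > tau} & {j for j in sel[b] if j > tau})
```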
Remark 4.3. The constraint $\ell \ge t_n$ ensures that the events $\{\ell_n(1) \ge \ell\}$ and $\{\ell_n(1) = \ell\}$, as in (4.4) and (4.5), respectively, can be determined by step $t_n$ of the Kingman $n$-coalescent. In what follows, we let $t_n$ grow sufficiently slowly so that this constraint is satisfied for any choice of $\ell$ that is of interest.
Proof. Let us start by proving (4.4). We define $E_n := \{h_{n,1}(1) \le h,\ \ell_n(1) \ge \ell,\ d_n(1) \ge d\}$. If we condition on the event $\{\mathcal{S}_{n,1}(1) = J\}$ for some set $J \in \Omega_1$, then we can express the occurrence and probability of the event $E_n$ in terms of $J$: that is, $X_{n,\ell,1} + X_{n,\ell,2} \le h$.
Combining all of the above, we can then write the desired identity, where we remark that we can omit the conditioning due to the fact that the coin flips are independent of everything else.
We now prove (4.5). Let us set $E_n := \{h_n(1) \le h,\ \ell_n(1) = \ell,\ d_n(1) \le d\}$. Again, we express the occurrence and the probability of the event $E_n$ in terms of $J$: the relevant count, relative to the selection set $J$, is at most $h - 1$ (since the height of 1 equals one after step $\ell$); that is, $X_{n,\ell,2} \le h - 1$.
Combining this, we can write the probability of $E_n$ accordingly; we remark that in the last step, as in the proof of (4.4), we can omit the conditional event. We now extend this result to multiple vertices, which we can do with relative ease as long as the truncated selection sets of the vertices $1, \ldots, k$ are disjoint. For ease of writing, we define $\bar{\mathcal{S}}_{n,1} := (\mathcal{S}_{n,1}(i))_{i \in [k]}$ and $\bar{J} := (J_i)_{i \in [k]}$.

Lemma 4.4. Fix $k \in \mathbb{N}$ and consider a truncation sequence $(t_n)_{n \in \mathbb{N}}$ such that $t_n \le n$ for all $n$, and let $\bar{J} \in \Omega_1^k$ be such that the $(J_i)_{i \in [k]}$ are pairwise disjoint. Then, (4.6) holds. If, additionally, we let $\ell_i \in \Omega_1$ for all $i \in [k]$, an analogous identity holds.

Proof. The first result follows from [12, Lemma 3.1]. We prove (4.6); the proof of the last result follows an analogous approach.
The proof is similar to that of [12, Lemma 3.1]: the events under consideration depend on disjoint sets of independent random variables, from which (4.6) follows. A similar reasoning proves the final result.
To end the first part of this section, we recall a result from Addario-Berry and Eslava on the degree of vertices 1, . . . , k in the Kingman n-coalescent.
4.2. Truncated selection sets. As we have seen in the first part of this section, we can obtain explicit formulations for the probability of events related to the size of the degree, depth, and label of vertices $1, \ldots, k$ in the Kingman $n$-coalescent, under certain conditions on the truncated selection sets $\bar{\mathcal{S}}_{n,1}$. In this part of the section, we formalise these conditions and show that they are met with high probability. We also introduce some other properties of the truncated selection sets that are useful in the analysis that follows in Sections 5 through 8.
Recall $\Omega_1$ from (4.3) and recall that we write $\bar{\mathcal{S}}_{n,1} = (\mathcal{S}_{n,1}(i))_{i \in [k]}$, $\bar{J} = (J_i)_{i \in [k]}$. For $\delta \in (0, 2)$ and $\bar{d} = (d_i)_{i \in [k]} \in \mathbb{Z}^k$, define the sets $A_{\bar d}$ and $B_{n,\delta}$ as in (4.7). $A_{\bar d}$ consists of all possible outcomes of the truncated selection sets that enable the event $\{d_n(i) \ge d_i, i \in [k]\}$, and $B_{n,\delta}$ consists of all truncated selection sets which enable the decoupling of the depth, label and degree of the vertices $1, \ldots, k$, as follows from Lemma 4.4.
We now present some results related to the sets $A_{\bar d}$ and $B_{n,\delta}$, which are based on several results from [12]. Though we defined the truncated selection sets and the truncated depth in terms of a general truncation sequence $t_n$, it suffices to consider only the case $t_n = (\log n)^2$ in the following lemmas (as we will mostly use this choice of $t_n$ in what follows).
We have already discussed that $\tau_k < t_n$ with high probability when the truncation sequence $t_n$ diverges as $n \to \infty$. The concentration of the size of $\mathcal{S}_{n,1}(i)$ around $2 \log n$ for any $i \in [k]$ when $t_n = (\log n)^2$ (or, more generally, when $t_n = o(n)$; this follows from a direct application of Bernstein's inequality, see also [12, (32)] for a more formal statement) yields the following result:

Lemma 4.7 (Lemma 3.2, [12]). Fix an integer $k \in \mathbb{N}$ and $\delta \in (0, 2)$ and let $t_n = (\log n)^2$. Then,

We also know that the elements of $\bar{\mathcal{S}}_{n,1}$ are asymptotically independent for any $k \in \mathbb{N}$, uniformly over the set $B_{n,\delta}$. Let $\bar{\mathcal{R}}_{n,1} := (\mathcal{R}_{n,1}(1), \ldots, \mathcal{R}_{n,1}(k))$ be a tuple of $k$ independent copies of $\mathcal{S}_{n,1}(1)$. Then, we have the following result:

Lemma 4.8 (Lemma 3.2, [12]). Fix an integer $k \in \mathbb{N}$ and $\delta \in (0, 2)$ and let $t_n = (\log n)^2$. Uniformly over $\bar{J} \in B_{n,\delta}$, $\mathbb{P}(\bar{\mathcal{S}}_{n,1} = \bar{J}) = (1 + o(1))\, \mathbb{P}(\bar{\mathcal{R}}_{n,1} = \bar{J})$.
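Since $|\mathcal{S}_n(i)|$ is a sum of independent Bernoulli$(2/j)$ variables, its concentration around $2\log n$ is easy to probe numerically. The following sketch uses our own illustrative parameter choices ($n = 5000$, $\delta = 1$, and the untruncated set $\mathcal{S}_n(i)$ rather than $\mathcal{S}_{n,1}(i)$; truncation at $t_n = (\log n)^2$ only removes a mean contribution of order $\log\log n$):

```python
import math
import random

def selection_set_size(n, rng):
    # |S_n(i)| is a sum of independent Bernoulli(2/j) variables, j = 2..n.
    return sum(1 for j in range(2, n + 1) if rng.random() < 2.0 / j)

rng = random.Random(0)
n, runs, delta = 5000, 1000, 1.0
samples = [selection_set_size(n, rng) for _ in range(runs)]

mean_exact = sum(2.0 / j for j in range(2, n + 1))   # = 2(H_n - 1) ~ 2 log n
emp_mean = sum(samples) / runs
# fraction of runs deviating from 2 log n by more than delta * log n
dev = sum(1 for x in samples
          if abs(x - 2 * math.log(n)) > delta * math.log(n)) / runs
```

For these parameters the exact mean $2(H_n - 1)$ sits within one unit of $2\log n$, and only a small fraction of samples leaves the window $(2 \pm \delta)\log n$, in line with the concentration behind Lemma 4.7.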
The following lemma provides bounds for the decay of the tail distribution of $\tau_k$, conditionally on certain events.

Lemma 4.9. Fix $k \in \mathbb{N}$ and recall $\tau_k$ from (4.2). The sequence $(\tau_k)_{n \in \mathbb{N}}$ is tight. Furthermore, fix $c \in (0, 2)$ and let $(d_i)_{i \in [k]} \in \mathbb{N}_0^k$ be such that $d_i \le c \log n$ for all $i \in [k]$. Then, (4.8) holds. Furthermore, let $(\ell_i)_{i \in [k]} \in [n]^k$ be distinct such that $\ell_i$ diverges as $n \to \infty$ for all $i \in [k]$. Then, (4.9) holds.

Proof. We first prove the tightness of $(\tau_k)_{n \in \mathbb{N}}$. Fix $\varepsilon > 0$ and set $K_\varepsilon := 2 + \lceil k^2/\varepsilon \rceil$. We recall that in Definition 3.2, $\{a_j, b_j\}$ denotes the two trees selected at step $j$ in the Kingman $n$-coalescent, for $2 \le j \le n$. Also, the trees are ordered by their smallest-labelled vertex, so that $\tau_k < K_\varepsilon$ is implied by $\{a_j, b_j\} \not\subseteq [k]$ for all $K_\varepsilon \le j \le n$. Since the selection of the trees is independent at each step, we obtain a product formula for $\mathbb{P}(\tau_k < K_\varepsilon)$, which we then bound from below. As a result, $\mathbb{P}(\tau_k \ge K_\varepsilon) \le \varepsilon$ for all $n \in \mathbb{N}$, from which the tightness follows.
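The tightness of $\tau_k$ can be illustrated numerically. The sketch below tracks only which trees contain a vertex of $[k]$, and compares the empirical tail of $\tau_k$ with the union bound $\mathbb{P}(\tau_k \ge K) \le k(k-1)\,(1/(K-1) - 1/n)$, which is our own back-of-the-envelope version of the estimate in the proof; all names and parameters are illustrative, and the tolerance of $0.1$ accounts for Monte Carlo error.

```python
import random

def tau_k_sample(n, k, rng):
    # Track only whether each current tree contains a vertex of [k];
    # tau_k is the first (chronological) step where two such trees merge.
    marked = [True] * k + [False] * (n - k)
    tau = 1
    for j in range(n, 1, -1):
        a, b = rng.sample(range(j), 2)           # j trees remain at step j
        if tau == 1 and marked[a] and marked[b]:
            tau = j
        m = marked[a] or marked[b]
        for idx in sorted((a, b), reverse=True):
            marked.pop(idx)
        marked.append(m)
    return tau

rng = random.Random(1)
n, k, runs = 200, 3, 500
samples = [tau_k_sample(n, k, rng) for _ in range(runs)]

def tail(K):
    return sum(1 for t in samples if t >= K) / runs

def union_bound(K):
    # P(tau_k >= K) <= sum_{j=K}^{n} C(k,2)/C(j,2) = k(k-1)(1/(K-1) - 1/n)
    return k * (k - 1) * (1.0 / (K - 1) - 1.0 / n)

assert tail(11) <= union_bound(11) + 0.1
assert tail(61) <= union_bound(61) + 0.1
```

Note that the ordering of the trees by their smallest label is irrelevant for this statistic, since the merging pair is chosen uniformly over unordered pairs.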
We then prove (4.8) and set $s_n := (\log n)^2$ for ease of writing. Using Bayes' theorem, the bound in (4.10) and Proposition 4.5, we obtain a first estimate. Conditionally on $\{\tau_k < s_n\}$, we know that all these coin flips occur at different steps for all vertices $1, \ldots, k$, so that they are independent. Moreover, they are independent of the selection sets, so that we obtain the lower bound (4.11). Again, the last step uses the conditional event, on which all $\mathcal{S}_n(i) \cap [s_n, n]$ are disjoint, so that $|\mathcal{S}_n(i) \cap [s_n, n]| \ge d_i$ for all $i \in [k]$ is equivalent to the cardinality of the union of all these sets being at least the sum of the $d_i$. We also know, conditionally on $\{\tau_k < s_n\}$, that for every $j \in [s_n, n]$ at most one $s_{i,j}$ can equal one among all $i \in [k]$. Hence, if we let $(\tilde{s}_j)_{j = s_n}^{n}$ be independent indicator random variables such that $\mathbb{P}(\tilde{s}_j = 1) = 2k/(j + k - 1)$, we can write a stochastic bound, conditionally on $\{\tau_k < s_n\}$. Again using that $d_i \le c \log n$ for each $i \in [k]$ and all $n$ sufficiently large, where $c < 2$, we obtain the required estimate for some $c' \in (0, 2 - c)$ by using Chebyshev's inequality, which, combined with (4.11), completes the proof of (4.8).
We now prove (4.9) and set $t_n = \min_{i \in [k]} \log \ell_i$; note that $t_n$ diverges with $n$. As in the proof of (4.8), we first condition appropriately. Here, omitting the conditional event for $j \ne \ell_i$, $i \in [k]$, yields a lower bound. Indeed, for any two distinct $i, i' \in [k]$, if $j > \max\{\ell_i, \ell_{i'}\}$, then $\{a_j, b_j\} = \{i, i'\}$ cannot occur conditionally on $\{\ell_n(i) = \ell_i, i \in [k]\}$. Furthermore, we isolate the steps $\ell_1, \ldots, \ell_k$, since the conditional event prescribes that vertex $i$ is selected at step $\ell_i$. Bounding the corresponding factor for any $j \in [k]$ as well, we obtain the desired estimate, where the last step follows from (4.10), and which concludes the proof.
Beyond the sets $A_{\bar d}$ and $B_{n,\delta}$ and the random variable $\tau_k$, we also want to control the probability of the events $\{\ell_n(i) = \ell_i, i \in [k]\}$ and $\{d_n(i) \ge d_i, i \in [k]\}$ conditionally on the truncated selection sets $\bar{\mathcal{S}}_{n,1}$. To this end, we formulate the following results for $\bar{\ell} := (\ell_i)_{i \in [k]}$. First, (4.12) holds. Also, when the truncation sequence $t_n$ diverges with $n$, (4.13) holds. Finally, let $(d_i)_{i \in [k]} \in \mathbb{N}_0^k$ and let $t_n = (\log n)^2$; if $\bar{J} \in A_{\bar d}$, then (4.14) holds.

Proof. The first result in (4.12) follows from Corollary 3.4, as each vertex obtains a uniform label from $[n]$ after the relabelling of the final tree $F_1$ in the Kingman $n$-coalescent and all $\ell_i$ are distinct.
To prove (4.13), we write the probability of interest as a sum, where the last step follows from (4.12). It thus remains to argue that the probability on the right-hand side is $o(1)$. For $\bar{\mathcal{S}}_{n,1} \in B_n^c$ to hold, the truncated selection sets should overlap at some step $t_n \le j \le n$, i.e.\ $\tau_k \ge t_n$ should hold. Conditionally on the event $\{\ell_n(i) = \ell_i, i \in [k]\}$, however, the truncated selection sets in $\bar{\mathcal{S}}_{n,1}$ cannot overlap at certain steps. Namely, for $j > \max_{i \in [k]} \ell_i$, $j \in \mathcal{S}_{n,1}(i)$ can hold for at most one $i \in [k]$. Indeed, if this were not the case, i.e.\ $j \in \mathcal{S}_{n,1}(i)$ and $j \in \mathcal{S}_{n,1}(i')$ for some distinct $i, i' \in [k]$, then one of the vertices $i, i'$, say vertex $i$, would lose the associated coin flip at step $j$ and hence its label in the random recursive tree would be $j > \ell_i$. This clearly contradicts the conditional event. As a result, on the conditional event, the truncated selection sets can only overlap at steps $j \le \max_{i \in [k]} \ell_i$. Hence, by Lemma 4.9 and since $t_n$ diverges with $n$, the probability is $o(1)$. The final result in (4.14) is proved in [12, Lemma 3.2].
In Lemma 4.4 we saw that, as long as the truncated selection sets $(\mathcal{S}_{n,1}(i))_{i \in [k]}$ are pairwise disjoint, the events involving the truncated depth, label and degree of the individual vertices $1, \ldots, k$ are independent, conditionally on $\bar{\mathcal{S}}_{n,1}$. Furthermore, when the truncation sequence $t_n$ diverges as $n \to \infty$, we already observed that the event $\{\tau_k < t_n\}$ holds with high probability by Lemma 4.9, which implies that the $(\mathcal{S}_{n,1}(i))_{i \in [k]}$ are disjoint.
On the other hand, we use the truncated depths $(h_{n,1}(i))_{i \in [k]}$ merely for technical reasons, and are really interested in the depths $(h_n(i))_{i \in [k]}$. As a result, choosing a truncation sequence $(t_n)_{n \in \mathbb{N}}$ that diverges with $n$ 'too quickly' may lead to different behaviour of $h_{n,1}(1)$ compared to $h_n(1)$. In other words, if $t_n$ grows 'too quickly', then $h_{n,2}(1) = h_n(1) - h_{n,1}(1)$ might become 'too large'. In the following lemma we make this informal statement more precise and provide constraints on $t_n$ to avoid such discrepancies between $h_n(1)$ and $h_{n,1}(1)$.

Lemma 4.11 (Partially from Lemma 2.7, [12]). Fix $k \in \mathbb{N}$ and $c \in (0, 2)$. If $d_i \le c \log n$ for all $i \in [k]$ and $t_n = (\log n)^2$, then (4.15) holds for any $j \in [k]$ and any $\varepsilon > 0$.

To prove (4.16), we consider $j = 1$ only, by the exchangeability of the vertices. We first note that $t_n \le \min_{i \in [k]} \ell_i$ by the assumption on $t_n$ and since the $\ell_i$ diverge with $n$. As a result, the event $\{\ell_n(i) = \ell_i, i \in [k]\}$ depends solely on the truncated selection sets $\bar{\mathcal{S}}_{n,1}$ and their associated coin flips, whereas $h_{n,2}(1)$ is determined by the set $\mathcal{S}_n(1) \cap [2, t_n - 1]$ and its associated coin flips. It thus follows that $h_{n,2}(1)$ is independent of this event. The result then follows from Markov's inequality and the assumption on $t_n$, since $\mathbb{E}[h_{n,2}(1)] = \sum_{j=2}^{t_n - 1} 1/j \le \log t_n = 2 \log \log n$, which concludes the proof.

Joint properties of high-degree vertices
In this section we use the preliminary results proved in Section 4 to study the joint behaviour of the depth and label of high-degree vertices.
When, instead, $d$ diverges as $n \to \infty$ such that $\lim_{n \to \infty} d/\log n = a$, the second statement of Proposition 5.1 applies. (ii) Combined with Lemma 4.11 and Remark 4.12, we obtain that the results in Proposition 5.1 also hold when we substitute $h_n(1)$ for $h_{n,1}(1)$.
Proof. We first prove (5.3) and, at the end, briefly discuss how to prove (5.2) using [12, Lemma 2.5]. In the setting of (5.3), we recall that $d$ diverges as $n \to \infty$ and that $h$, $\ell$ and $t_n$ are as in (5.1).
To prove that the expected value has the desired limit, we start by rewriting the binomial random variables $X_{n,\ell,1}$ and $X_{n,\ell,2}$. Here, we set $X_{n,\ell,1} = 0$ if $Q_n - d \le 0$ and $X_{n,\ell,2} = 0$ if $\tilde{Q}_n = 0$. Notice that $Q_n$ and $\tilde{Q}_n$ are independent, that they can be determined from $\mathcal{S}_{n,1}(1)$, and that the values of the $I^n_j, \tilde{I}^n_j$ are independent of $\mathcal{S}_{n,1}(1)$, so that conditioning on $\mathcal{S}_{n,1}(1)$ is equivalent to conditioning on $Q_n, \tilde{Q}_n$. We can then write the expected value in the statement of the proposition accordingly. The second line follows from the fact that, by changing the upper limits of the second and third sums in the probability on the first line from $Q_n - d$ to $(Q_n - d)\mathbf{1}_{\{Q_n - d \ge 1\}}$, we can remove the indicator in the expected value. Indeed, if $Q_n \le d$, then $\mathbf{1}_{\{Q_n - d \ge 1\}} = 0$, and hence the second event in the probability cannot occur almost surely, so that the probability is zero. As a result, the indicator in the expected value is redundant. Combining the resulting expression with (5.4), it remains to show that the first two terms yield the desired limit and that the last term is negligible compared to the first two. Let us start with the former and tackle the product of the two probabilities on the first line. It follows from the Lindeberg conditions [11, Theorem 3.4.5] that a central limit theorem holds for $Q_n$ and $\tilde{Q}_n$, with $N, N' \sim \mathcal{N}(0, 1)$ independent standard normal random variables, as we recall that $Q_n$ and $\tilde{Q}_n$ are sums of independent Bernoulli random variables. It is readily checked, by the choice of $\ell$ in (5.1) and since $d$ diverges with $n$, and by the choice of $\ell$, $d$ and $t_n$, that the remaining quantities are of order $\log \log n$.
By (5.6) and (5.7) we thus obtain (5.8), which converges to $\Phi(x)$, where we recall that $\Phi : \mathbb{R} \to (0, 1)$ denotes the cumulative distribution function of a standard normal distribution. By Skorokhod's representation theorem [5, Theorem 6.7] there exists a probability space and a coupling of $(Q_n)_{n \in \mathbb{N}}, (\tilde{Q}_n)_{n \in \mathbb{N}}$ and $(I^n_j)_{j \in [n], n \in \mathbb{N}}, (\tilde{I}^n_j)_{j \in [n], n \in \mathbb{N}}$ such that the collections $(I^n_j)_{j \in \mathbb{N}}, (\tilde{I}^n_j)_{j \in \mathbb{N}}$ are independent of $Q_n$ and $\tilde{Q}_n$ and the convergence in (5.6) is almost sure rather than in distribution; in particular, $Q_n/d$ converges almost surely, as in (5.10). Combining this with the Skorokhod representation, the fact that $d/\log n \to a$ and (5.8) yields a limit in which $N_1, N_2$ are independent standard normal random variables. Combining this with (5.9) and using that $h = \log n - d/2 + y\sqrt{\log n - d/4}$, we obtain (5.11), where $N$ is again a standard normal random variable. This deals with the second term of (5.5).
For the first term, we observe the corresponding convergence as $n \to \infty$ by (5.9), and similarly for $z \ge 0$ as $n \to \infty$. Hence, for $x \in \mathbb{R}$ fixed, let us define a random variable $M_x$ in terms of a standard normal random variable $M$. It then follows that $\mathbb{P}(M_x = 0) = \Phi(x)$. If we let $N, N', N''$ be i.i.d.\ standard normal random variables, independent of $M_x$, and use similar steps as in (5.12) and (5.10) (in particular using the Skorokhod representation for the random variables $(\hat{I}^n_j)_{j \in [n]}$, $O_n$, $(Q_n - d)\mathbf{1}_{\{Q_n - d \ge 1\}}$ and that $d/\log n \to a$), this converges in distribution to the desired limit. Combining this with (5.11) in (5.5) yields (5.13). By intersecting the event in the first probability on the right-hand side with the complementary events $\{M_x = 0\}$ and $\{M_x > 0\}$, and using that $M_x$ is independent of $N$, we arrive at (5.14). By the definition of $M_x$, the event $\{M_x > 0\}$ is equivalent to $\{M > x\}$, where we recall that $M$ is a standard normal random variable. Moreover, the claimed identity holds on the event $\{M_x > 0\} = \{M > x\}$, as desired. Finally, we show (5.15). By splitting the expected value into the cases where $Q_n$ is at most $d + 1 + d^{1/2 - \eta}$ and at least $d + 1 + d^{1/2 - \eta}$, respectively, for some $\eta \in (0, 1/2)$, we obtain an upper bound. Since $d^{1/2 - \eta} = o\big(\sqrt{\mathrm{Var}(Q_n)}\big)$ (see (5.7)), it follows from (5.6) that the probability in the last line converges to zero. This proves (5.15), and combining this with the limit (5.14) of the left-hand side of (5.13) in (5.5) yields the desired result and concludes the proof of (5.3).
We now discuss the proof of (5.2). We recall that now $L := \limsup_{n \to \infty} d < \infty$. Also, conditionally on $\mathcal{S}_{n,1}(1)$, let $X_n = X_n(d) \sim \mathrm{Bin}(|\mathcal{S}_{n,1}(1)| - d, 1/2)$ (where we set $X_n = 0$ when $|\mathcal{S}_{n,1}(1)| - d \le 0$) and let us define $h' := \log n + y\sqrt{\log n}$. Note that $(h - h')/\sqrt{\log n} = o(1)$ since $L < \infty$, so that using $h'$ instead of $h$ yields the same result. Again using Lemma 4.1 and Proposition 4.5, we obtain an expression for the probability of interest. Notice that, for any realisation of $\mathcal{S}_{n,1}(1)$, both the indicator as well as the probability are decreasing functions of $d$. As a result, we can bound the expected value from above by setting $d = 0$ in the indicator and using $X_n(0)$ in the probability. The upper bound has the desired limit by [12, Lemma 2.5] with $a = b = 0$. Similarly, we can bound the expected value from below by setting $d = L$ in the indicator and using $X_n(L)$ in the probability. The result then follows from [12, Lemma 2.5] with $a = 0$, $b = L$, which yields a matching lower bound.
To finish this section, we use the results related to the truncated selection sets developed in Section 4 to extend Proposition 5.1 to the case of multiple vertices.
$\in \mathbb{R}^k$, and set $t_n = (\log n)^2$. Then, (5.16) holds. If, additionally, $d_i$ diverges as $n \to \infty$ for all $i \in [k]$, let $M$ and $N$ be independent standard normal random variables; then (5.17) holds.

Proof. We provide a proof for (5.17); the proof of (5.16) uses the same steps.

(5.18)
For the first term on the right-hand side, we use that the truncated selection sets are pairwise disjoint by the definition of $B_{n,\delta}$ in (4.7) and that, as a result, $f_n(\bar{J}) = g_n(\bar{J})$ for all $\bar{J} \in B_{n,\delta}$ and $n$ sufficiently large by Lemma 4.4. Together with Lemma 4.8, recalling that $\bar{\mathcal{R}}_{n,1}$ is a tuple of $k$ independent copies of $\mathcal{S}_{n,1}(1)$, this yields the leading term up to a factor $(1 + o(1))$.

Joint properties of vertices with a given label
This section is devoted to studying the joint behaviour of the degree and depth of vertices with a given label. We use the preliminary results proved in Section 4 to obtain the required results.
The section is structured in the same way as Section 5.
We let $\ell \in [n]$ be increasing in $n$ such that $\ell$ diverges as $n \to \infty$, and set $d$, $h$ and $t_n$ as in (6.1). We then have the following result.
Proposition 6.1. Let $d$, $h$, $\ell$ and $t_n$ be as in (6.1) with $x, y \in \mathbb{R}$, and recall Pr from (6.2), with $y \in \mathbb{R}$, $\rho \in (0, 1)$. Then, the stated limits hold. We observe that we can divide the terms in the expected value into three parts which are pairwise independent: the exponent and the first indicator, the second indicator, and the conditional probability, respectively. Indeed, the exponent and first indicator depend only on $[\ell + 1, n] \cap \mathcal{S}_{n,1}(1)$, the second indicator only on the event $\{\ell \in \mathcal{S}_{n,1}(1)\}$, and the conditional probability only on $[t_n, \ell - 1] \cap \mathcal{S}_{n,1}(1)$. Since $\ell > \log \ell = t_n$ for all $n$ sufficiently large, these three parts depend on disjoint sets of independent random variables and are hence independent. As a result, we obtain (6.3). The first probability on the right-hand side equals $2/\ell$. The expected value on the right-hand side can be rewritten as follows. First, we sum over all possible truncated selection sets $\mathcal{S}_{n,1}(1)$. As $\ell$ diverges with $n$, $(\ell - 1)/(n - 1) = (1 + o(1))\ell/n$. Defining $\tilde{\mathcal{S}}_{n,1}(1)$ as a random subset of $\{\ell + 1, \ldots, n\}$ which includes each integer $j \in \{\ell + 1, \ldots, n\}$ independently with probability $1/j$, the double sum and triple product can be interpreted as an expectation with respect to $\tilde{\mathcal{S}}_{n,1}(1)$. By bounding the product from below by one and using (6.3), we obtain the lower bound (6.5), where the last step follows if we assume that the two probabilities in the first step are asymptotically equal to Pr and $\Phi(x)$, respectively. For an upper bound, we first expand the product in the expected value of (6.4). As $\mathbb{P}(j_t \in \tilde{\mathcal{S}}_{n,1}(1)) = 1/j_t \le 1/(j_t - 2)$, we arrive at an upper bound. Combining this with (6.6) and (6.3), and since $\ell$ diverges with $n$, we thus obtain a matching upper bound when we (again) assume that the first and last probabilities on the second line are asymptotically equal to Pr and $\Phi(x)$, respectively. As this upper bound matches the lower bound in (6.5), we arrive at the desired result.
It remains to prove that the two probabilities above are indeed asymptotically equal to Pr and $\Phi(x)$. For the former, note that $|\tilde{\mathcal{S}}_{n,1}(1)|$ is a sum of independent Bernoulli random variables with success probabilities $1/j$, $\ell + 1 \le j \le n$, so that its moment generating function equals $\prod_{j = \ell + 1}^{n}\big(1 + (e^t - 1)/j\big)$.
Since, for any $t \in \mathbb{R}$, the moment generating function (MGF) of $|\tilde{\mathcal{S}}_{n,1}(1)|$ converges to the MGF of $P(\rho)$, the desired convergence follows. Finally, when $\ell = n - o(n)$, using Markov's inequality yields the claim, as desired.
For the latter result in (6.7), we set $Q_n := |[t_n, \ell - 1] \cap \mathcal{S}_{n,1}(1)|$, let $(I^n_j)_{j \in [n]}$ denote independent Bernoulli random variables with success probability $1/2$, also independent of $Q_n$, and write the conditional probability accordingly. We then use a similar approach as in (5.10). In particular, we use the Skorokhod representation, which provides us with a coupling of the random variables $Q_n$ and $(I^n_j)_{j \in [n]}$ such that the convergence is almost sure, where $N$ is a standard normal random variable. As a result, recalling that $h := \log \ell + x\sqrt{\log \ell}$, the claim follows, as required, which concludes the proof.
To finish this section, we use the results related to the truncated selection sets developed in Section 4 to extend Proposition 6.1 to the case of multiple vertices. The choice of $t_n$ is crucial here, and so we define, for some $(\ell_i)_{i \in [k]} \in [n]^k$, the truncation sequence as in (6.8). We can then formulate the following result.
Proposition 6.3. Let $(\ell_i)_{i \in [k]}$ be $k$ distinct integer-valued sequences such that $\ell_i$ increases with $n$ and diverges as $n \to \infty$ for all $i \in [k]$. For $i \in [k]$, let $h_i := \log \ell_i + x_i\sqrt{\log \ell_i}$, and let $d_i := \log(n/\ell_i) + y_i\sqrt{\log(n/\ell_i)}$ if $\ell_i = o(n)$ and $d_i \in \mathbb{N}_0$ fixed otherwise, with $(x_i)_{i \in [k]}, (y_i)_{i \in [k]} \in \mathbb{R}^k$, and let $t_n$ be as in (6.8). Furthermore, recall the definition of Pr in (6.2). Then, the analogous joint limit holds.

Remark 6.4. As in Remark 5.2, it follows from Lemma 4.11 and Remark 4.12 that the result in Proposition 6.3 also holds when substituting $h_n(i)$ for $h_{n,1}(i)$.
Proof. The proof follows a similar approach as the proof of Proposition 5.3. We first write the probability of interest, where the last step follows from Lemma 4.10. We then define the corresponding functions. With similar steps as in (5.18) through (5.21), we then have (6.10). It follows from (4.13) in Lemma 4.10 that the first term is negligible. A similar argument as in the proof of (4.13) can be used to show that the same holds for the second term as well. As the elements of $\bar{\mathcal{R}}_{n,1}$ are i.i.d., the expectation factorises, and the product on the right-hand side can be evaluated by Proposition 6.1 and Lemma 4.10. Using this in (6.10) and combining the result with (6.9) then yields the desired result.

Recall the counting measures $X$; it remains to prove Theorem 2.2. We use the method of moments combined with Proposition 3.1 to achieve this:

Proof of Theorem 2.2 subject to Proposition 3.1. As discussed, it suffices to prove the weak convergence of $MP^{(n_t)}$ to $MP_\varepsilon$ along subsequences $(n_t)_{t \in \mathbb{N}}$ such that $\varepsilon_{n_t} \to \varepsilon$ (where $\varepsilon \in [0, 1]$) as $t \to \infty$. In turn, this is implied by the convergence of the FDDs, i.e.\ by the joint convergence of the counting measures $X_{\ge s}(B)$ over finite collections of disjoint subsets of $A$ (see (7.1)). We recall that the points $P_i$ in the definition of the variables $X_s(B), X_{\ge s}(B)$ in (3.3) are the points of the Poisson point process $\mathcal{P}$ with intensity measure $\lambda(\mathrm{d}x) := 2^{-x} \log 2\, \mathrm{d}x$, in decreasing order. As a result, as the random variables $(M_i, N_i)_{i \in \mathbb{N}}$ are i.i.d.\ and also independent of $\mathcal{P}$, the FDDs factorise accordingly. We also recall that $(n_\ell)_{\ell \in \mathbb{N}}$ is a subsequence such that $\varepsilon_{n_\ell} \to \varepsilon$ as $\ell \to \infty$. We now take $c \in (1/\log 2, 2)$ and for any $K \in \mathbb{N}$ consider any fixed non-decreasing integer sequence $(s_m)_{m \in [K]}$. By the choice of $c$ and the fact that the $s_m$ are fixed with respect to $n$, it follows that $s_1 + \log_2 n = \omega(1)$ and $s_K + \log_2 n < c \log n$ for all $n \ge 2$. Moreover, let $K' := \min\{m : s_{m+1} = s_K\}$ and let $(B_m)_{m \in [K]}$ be a sequence of sets in $\mathcal{B}(\mathbb{R}^2)$ such that $B_m \cap B_{m'} = \emptyset$ when $s_m = s_{m'}$ and $m \ne m'$. We can then, for any $(c_m)_{m \in [K]} \in \mathbb{N}_0^K$, obtain from Proposition 3.1 and since $s_1, \ldots$
, where the last step follows from the independence property of (marked) Poisson point processes and the choice of the sequences (s m , B m ) m∈ [K] . The method of moments [21, Section 6.1] then concludes the proof.
It remains to prove Proposition 3.1. We note that this construction implies that the first $c_1$ many $d_i$, $a_i$ and $A_i$ equal $\log_2 n + s_1$, $a_1$ and $B_1$, respectively, that the next $c_2$ many $d_i$, $a_i$ and $A_i$ equal $\log_2 n + s_2$, $a_2$ and $B_2$, respectively, and so on. Furthermore, $\lim_{n \to \infty} d_i/\log n = a_i$ for all $i \in [L]$. We then define the relevant events. The right-hand side of (7.2) then equals a double sum in which the product is independent of $S$ and $j$ and can therefore be taken out of the double sum. Now, recall the definition of the variables $X$, and note that $(n)_L := n(n - 1) \cdots (n - (L - 1)) = (1 + o(1))n^L$. We now recall that there are exactly $c_m$ many $d_i$, $a_i$ and $A_i$ that equal $\log_2 n + s_m$, $a_m$ and $B_m$, respectively, for each $m \in [K]$. Combined with (7.5), this finally yields the claim.

In this section we provide the final steps that build on Propositions 5.1 and 6.1 to prove Theorems 3.5 and 3.7. In particular, we show how to include the graph distance between vertices $1, \ldots, k$ in the Kingman $n$-coalescent. As mentioned at the end of Section 3, combining Theorems 3.5 and 3.7 with Corollary 3.4 then immediately implies Theorems 2.4 and 2.6, respectively.
Intuitively, the graph distance between vertices can be related to their (truncated) depth. By the definition of $\tau_k$, the largest common ancestor of any two distinct vertices $i, j \in [k]$ in the random recursive tree has label at most $\tau_k$, and hence the sums of the depths and of the truncated depths of vertices $i$ and $j$ form an upper and a lower bound, respectively, for the graph distance between these vertices in the Kingman $n$-coalescent. Since the depth and the truncated depth are asymptotically equal under certain constraints on the truncation sequence $t_n$ (see Lemma 4.11 and Remark 4.12), and since $(\tau_k)_{n \in \mathbb{N}}$ forms a tight sequence of random variables by Lemma 4.9, these bounds on the graph distance are sufficiently sharp. Using the largest common ancestor to provide a lower bound on the graph distance has previously been used by Munsonius and Rüschendorf for $b$-ary recursive trees [29] and by Ryvkina for random split trees [33].
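The two bounds above are deterministic and can be verified directly in simulation: the sketch below (our own illustrative code) replays a coalescent run, storing at each step the forest partition and per-vertex depths, and checks that while two vertices lie in distinct trees the sum of their current depths never exceeds their graph distance in the final tree, while the sum of their final depths always bounds that distance from above.

```python
import random

def coalescent_with_history(n, seed=9):
    # Sketch: run the coalescent, remembering each vertex's depth at every
    # step, plus parent pointers in the final tree.
    rng = random.Random(seed)
    trees = [{"root": i, "verts": {i: 0}} for i in range(1, n + 1)]
    parent = {}
    history = []                                  # (step, depths, partition)
    for j in range(n, 1, -1):
        ia, ib = rng.sample(range(len(trees)), 2)
        ta, tb = trees[ia], trees[ib]
        win, lose = (ta, tb) if rng.random() < 0.5 else (tb, ta)
        parent[lose["root"]] = win["root"]
        for i, d in lose["verts"].items():
            win["verts"][i] = d + 1
        trees = [t for m, t in enumerate(trees) if m not in (ia, ib)]
        trees.append(win)
        snap_depth = {i: d for t in trees for i, d in t["verts"].items()}
        snap_part = [frozenset(t["verts"]) for t in trees]
        history.append((j - 1, snap_depth, snap_part))
    return parent, history

def dist(parent, u, v):
    # graph distance in the final tree via the two root paths
    def path(x):
        p = [x]
        while p[-1] in parent:
            p.append(parent[p[-1]])
        return p
    pu, pv = path(u), path(v)
    common = set(pu) & set(pv)
    return min(pu.index(c) for c in common) + min(pv.index(c) for c in common)

parent, history = coalescent_with_history(60)
final_depth = history[-1][1]
for (_, depth_j, part_j) in history:
    for u, v in [(1, 2), (1, 3), (2, 3)]:
        in_distinct_trees = not any(u in blk and v in blk for blk in part_j)
        if in_distinct_trees:
            # while u, v lie in distinct trees: h_{F_j}(u) + h_{F_j}(v) <= dist
            assert depth_j[u] + depth_j[v] <= dist(parent, u, v)
        # trivial upper bound via the root: dist <= h_n(u) + h_n(v)
        assert dist(parent, u, v) <= final_depth[u] + final_depth[v]
```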
We formalise the above intuition in the remainder of the section, in which we prove Theorems 3.5 and 3.7.
It remains to obtain a matching lower bound. We make use of the following observation: in the Kingman $n$-coalescent process, assume two vertices $i_1, i_2$ are in distinct trees at step $j$ of the coalescent. Then, the sum of their depths at step $j$ is bounded from above by the graph distance between $i_1$ and $i_2$ in the final tree of the coalescent. That is, $h_{F_j}(i_1) + h_{F_j}(i_2) \le \mathrm{dist}_{F_1}(i_1, i_2)$ on the event that $i_1, i_2$ are in two distinct trees in the forest $F_j$. See Figure 1 for an example, where the graph distance between vertices 1 and 3 in $F_1$ is larger than the sum of the depths of 1 and 3 in $F_2$.
This observation allows us to use the truncated depths $h_{n,1}(i)$ to bound the graph distances between the vertices $1, \ldots, k$. Indeed, $h_{n,1}(i) = h_{F_{t_n}}(i)$ denotes the depth of vertex $i$ in the tree at the truncation time $t_n$. Recall that the event $\{\tau_k < t_n\}$ denotes that the vertices $1, \ldots, k$ are in distinct trees at step $t_n$, which holds with high probability by Lemma 4.9. For $h_i$, $\ell_i$, $L_{i,j}$ as in (8.3), the probability of interest is thus at most
$\mathbb{P}(h_{n,1}(i) \le h_i,\ \log \ell_n(i) \le \ell_i,\ i \in [k],\ h_{n,1}(i) + h_{n,1}(j) \le L_{i,j},\ 1 \le i < j \le k,\ \tau_k < t_n \mid D_k) + \mathbb{P}(\tau_k \ge t_n \mid D_k) \le \mathbb{P}(h_{n,1}(i) \le h_i,\ \log \ell_n(i) \le \ell_i,\ i \in [k],\ h_{n,1}(i) + h_{n,1}(j) \le L_{i,j},\ 1 \le i < j \le k \mid D_k) + \mathbb{P}(\tau_k \ge t_n \mid D_k)$.
The last term tends to zero with $n$ by Lemma 4.9. With the same approach as in (8.2), we can then take the limit of the remaining probability. Combined with the matching lower bound which follows from (8.4), this concludes the proof.
In a similar spirit, we prove Theorem 3.7. Again, combined with Corollary 3.4, this implies Theorem 2.6.
Proof of Theorem 3.7. The proof follows a similar approach to the proof of Theorem 3.5. Recall the random variables $(d^*_n(i))_{i \in [k]}$ and $(Z_i)_{i \in [k]}$ from (3.9) and set $t_n = \min_{i \in [k]} \log \ell_i$. Proposition 6.1 provides that the tuple $d^*_n(i)$, conditionally on the event $L_k := \{\ell_n(i) = \ell_i, i \in [k]\}$, converges in distribution to $(Z_i, N_i)_{i \in [k]}$, where the $N_i$ are i.i.d.\ standard normal random variables, also independent of the $Z_i$. By our choice of $t_n$, Lemma 4.11 and Remark 4.12 yield that the result also holds when $h_{n,1}(i)$ is substituted by $h_n(i)$. As in (8.1), we can use the trivial upper bound $\mathrm{dist}_n(i, j) \le h_n(i) + h_n(j)$, $i, j \in [n]$. We can thus write, similar to (8.2),
(8.6) $\frac{\mathrm{dist}_n(i,j) - (\log \ell_i + \log \ell_j)}{\sqrt{\log \ell_i + \log \ell_j}} \le \frac{h_n(i) - \log \ell_i}{\sqrt{\log \ell_i}} \sqrt{\frac{\log \ell_i}{\log \ell_i + \log \ell_j}} + \frac{h_n(j) - \log \ell_j}{\sqrt{\log \ell_j}} \sqrt{\frac{\log \ell_j}{\log \ell_i + \log \ell_j}}.$
Recall the limits $c_{i,j}, c_{j,i}$ of the two square-root terms on the right-hand side of (8.6) from (3.8). We thus obtain, for $(x_i)_{i \in [k]} \in \mathbb{R}^k$ fixed, by (8.6) and (8.5) (and the remark on the $h_n(i)$ below (8.5)) together with the continuous mapping theorem [5], an upper bound on the limit. We now use the same observation made after (8.4): on the event $\{\tau_k < t_n\}$, it holds that $\mathrm{dist}_n(i, j) \ge h_{n,1}(i) + h_{n,1}(j)$ for any two distinct vertices $i, j \in [k]$. We hence obtain the upper bound
$\mathbb{P}(h_{n,1}(i) \le h_i,\ i \in [k],\ h_{n,1}(i) + h_{n,1}(j) \le L_{i,j},\ 1 \le i < j \le k,\ \tau_k < t_n \mid L_k) + \mathbb{P}(\tau_k \ge t_n \mid L_k) \le \mathbb{P}(h_{n,1}(i) \le h_i,\ i \in [k],\ h_{n,1}(i) + h_{n,1}(j) \le L_{i,j},\ 1 \le i < j \le k \mid L_k) + \mathbb{P}(\tau_k \ge t_n \mid L_k)$.
The last term on the right-hand side tends to zero by Lemma 4.10. Using the right-hand side of (8.6) to rewrite the event $\{h_{n,1}(i) + h_{n,1}(j) \le L_{i,j}\}$, we thus obtain an upper bound for the limit superior which matches the lower bound in (8.7) and concludes the proof.