Clustering in a hyperbolic model of complex networks

In this paper we consider the clustering coefficient and clustering function in a random graph model proposed by Krioukov et al.~in 2010. In this model, nodes are chosen randomly inside a disk in the hyperbolic plane and two nodes are connected if they are at most a certain hyperbolic distance from each other. It has been shown that this model has various properties associated with complex networks, e.g. power-law degree distribution, short distances and non-vanishing clustering coefficient. Here we show that the clustering coefficient tends in probability to a constant $\gamma$ that we give explicitly as a closed form expression in terms of $\alpha, \nu$ and certain special functions. This improves earlier work by Gugelmann et al., who proved that the clustering coefficient remains bounded away from zero with high probability, but left open the issue of convergence to a limiting constant. Similarly, we show that $c(k)$, the average clustering coefficient over all vertices of degree exactly $k$, tends in probability to a limit $\gamma(k)$ which we give explicitly as a closed form expression in terms of $\alpha, \nu$ and certain special functions. We also extend this last result to sequences $(k_n)_n$ where $k_n$ grows as a function of $n$. Our results show that $\gamma(k)$ scales differently, as $k$ grows, for different ranges of $\alpha$. More precisely, there exist constants $c_{\alpha,\nu}$ depending on $\alpha$ and $\nu$, such that as $k \to \infty$, $\gamma(k) \sim c_{\alpha,\nu} \cdot k^{2 - 4\alpha}$ if $\frac{1}{2}<\alpha<\frac{3}{4}$, $\gamma(k) \sim c_{\alpha,\nu} \cdot \log(k) \cdot k^{-1} $ if $\alpha=\frac{3}{4}$ and $\gamma(k) \sim c_{\alpha,\nu} \cdot k^{-1}$ when $\alpha>\frac{3}{4}$. These results contradict a claim of Krioukov et al., which stated that the limiting values $\gamma(k)$ should always scale with $k^{-1}$ as we let $k$ grow.


Introduction and main results
In this paper, we will consider clustering in a model of random graphs that involves points taken randomly in the hyperbolic plane. This model was introduced by Krioukov, Papadopoulos, Kitsak, Vahdat and Boguñá [25] in 2010; we abbreviate it as the KPKVB model. We should however note that the model also goes by several other names in the literature, including hyperbolic random geometric graphs and random hyperbolic graphs. Krioukov et al. suggested this model as a suitable model for complex networks. It exhibits the three main characteristics usually associated with complex networks: a power-law degree distribution, a non-vanishing clustering coefficient and short graph distances.

KPKVB model
We start with the definition of the model. As mentioned, its nodes are situated in the hyperbolic plane H, which is a surface with constant Gaussian curvature −1. This surface has several convenient representations (i.e. coordinate maps), such as the Poincaré half-plane model, the Poincaré disk model and the Klein disk model. A gentle introduction to Gaussian curvature, hyperbolic geometry and these representations of the hyperbolic plane can be found in [36]. Throughout this paper we will be working with a representation of the hyperbolic plane using hyperbolic polar coordinates, sometimes called the native representation. That is, a point u ∈ H is represented as (r, θ), where r is the hyperbolic distance between u and the origin O and θ is the angle between the line segment Ou and the positive x-axis. Here, when mentioning "the origin" and the angle between the line segment and the positive x-axis, we think of H embedded as the Poincaré disk in the ordinary Euclidean plane.
The KPKVB model has three parameters: the number of vertices n, which we think of as going to infinity, and α > 1/2, ν > 0, which we think of as fixed. Given n, α, ν we define R = 2 log(n/ν). Then the hyperbolic random graph G(n; α, ν) is defined as follows:

• The vertex set is given by n i.i.d. points u_1, …, u_n, denoted in polar coordinates u_i = (r_i, θ_i), where the angular coordinate θ is chosen uniformly from (−π, π] while the radial coordinate r is sampled independently according to the cumulative distribution function

F_{α,R}(r) = (cosh(αr) − 1)/(cosh(αR) − 1), for 0 ≤ r ≤ R.

• Any two vertices u_i = (r_i, θ_i) and u_j = (r_j, θ_j) are adjacent if and only if d_H(u_i, u_j) ≤ R, where d_H denotes the distance in the hyperbolic plane. We will frequently use that, by the hyperbolic law of cosines, d_H(u_i, u_j) ≤ R is equivalent to

cosh(r_i) cosh(r_j) − sinh(r_i) sinh(r_j) cos(|θ_i − θ_j|_{2π}) ≤ cosh(R),

where |θ_i − θ_j|_{2π} = min(|θ_i − θ_j|, 2π − |θ_i − θ_j|) is the angular distance between the two points.

Figure 1 shows a computer simulation of G(n; α, ν). As observed by Krioukov et al. [25], and proved rigorously by Gugelmann et al. [21], the degree sequence of the KPKVB model follows a power law with exponent 2α + 1. Gugelmann et al. [21] also showed that the average degree converges in probability to the constant 8να²/(π(2α − 1)²), and they showed that the (local) clustering coefficient is non-vanishing in the sense that it is bounded below by a positive constant a.a.s. Here, and in the rest of the paper, for a sequence (E_n)_n of events, "E_n holds asymptotically almost surely (a.a.s.)" means that P(E_n) → 1 as n → ∞.
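For concreteness, the sampling procedure can be sketched in Python. This is a brute-force illustration, not the code from the paper's appendix; the inverse-CDF step assumes the (α, R)-quasi-uniform law F(r) = (cosh(αr) − 1)/(cosh(αR) − 1), and the adjacency test is the hyperbolic law of cosines stated above.

```python
import math
import random

def sample_kpkvb(n, alpha, nu, seed=0):
    """Brute-force sample of G(n; alpha, nu): n quasi-uniform points, O(n^2) edge checks."""
    rng = random.Random(seed)
    R = 2 * math.log(n / nu)
    pts = []
    for _ in range(n):
        theta = rng.uniform(-math.pi, math.pi)
        # Inverse-CDF sampling of the radius, assuming the (alpha, R)-quasi-uniform
        # law F(r) = (cosh(alpha*r) - 1) / (cosh(alpha*R) - 1) on [0, R].
        u = rng.random()
        r = math.acosh(1 + u * (math.cosh(alpha * R) - 1)) / alpha
        pts.append((r, theta))
    edges = []
    coshR = math.cosh(R)
    for i in range(n):
        r1, t1 = pts[i]
        for j in range(i + 1, n):
            r2, t2 = pts[j]
            dt = abs(t1 - t2)
            dt = min(dt, 2 * math.pi - dt)  # angular distance |theta_i - theta_j|_{2 pi}
            # hyperbolic law of cosines: connect iff d_H(u_i, u_j) <= R
            if math.cosh(r1) * math.cosh(r2) - math.sinh(r1) * math.sinh(r2) * math.cos(dt) <= coshR:
                edges.append((i, j))
    return pts, edges
```

The quadratic edge loop is only practical for small experiments; efficient generators for this model exist but are not needed to reproduce qualitative behaviour.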
Apart from the degree sequence and clustering, the third main characteristic associated with complex networks, "short distances", has also been established in the literature. In [1] it is shown that for α < 1 the largest component is what is called an ultra-small world: if we randomly sample two vertices of the graph then, a.a.s., conditional on them being in the same component, their graph distance is of order log log n. In [22] and [19] a.a.s. polylogarithmic upper and lower bounds on the graph diameter of the largest component are shown, and in [30] these were sharpened to show that log n is the correct order of the diameter.
Earlier work of the first and third authors with Bode [7] and of the first and third authors [17] has established the "threshold for a giant component": if α < 1 then there is a unique component of size linear in n no matter how small ν (i.e. the average degree); if α > 1 all components are sublinear no matter the value of ν; and if α = 1 then there is a critical value ν c such that for ν < ν c all components are sublinear and for ν > ν c there is a unique linearly sized component (all of these statements holding a.a.s.). Whether or not there is a giant component if α = 1 and ν = ν c remains an open problem. In [22] and [23], Kiwi and Mitsche considered the size of the second largest component and showed that for α ∈ (1/2, 1), a.a.s., the second largest component has polylogarithmic order with exponent 1/(α − 1/2).
In another paper of the first and third authors with Bode [8] it was shown that α = 1/2 is the threshold for connectivity: for α < 1/2 the graph is a.a.s. connected, for α > 1/2 the graph is a.a.s. disconnected and when α = 1/2 the probability of being connected tends to a continuous, non-decreasing function of ν which is identically one for ν ≥ π and strictly less than one for ν < π. Friedrich and Krohmer [5] studied the size of the largest clique as well as the number of cliques of a given size. Boguñá et al. [9] and Bläsius et al. [6] considered fitting the KPKVB model to data using maximum likelihood estimation. Kiwi and Mitsche [24] studied the spectral gap and related properties, and Bläsius et al. [4] considered the tree-width and related parameters of the KPKVB model. Recently Owada and Yogeshwaran [33] considered subgraph counts, and in particular established a central limit theorem for the number of copies of a fixed tree T in G(n; α, ν), subject to some restrictions on the parameter α.

Clustering
In this work we study the clustering coefficient in the KPKVB model. In the literature there are unfortunately two distinct, rival definitions of the clustering coefficient. One of those, sometimes called the global clustering coefficient, is defined as three times the ratio of the number of triangles to the number of paths of length two in the graph. Results for this version of the clustering coefficient in the KPKVB model were obtained by Candellero and the first author [10] and for the evolution of graphs on more general spaces with negative curvature by the first author in [16].
We will study the other notion of clustering, the one which is also considered by Krioukov et al. [25] and Gugelmann et al. [21]. It is sometimes called the local clustering coefficient, although we should point out that Gugelmann et al. actually call it the global clustering coefficient in their paper. For a graph G and a vertex v ∈ V(G) we define the clustering coefficient of v as

c(v) := |{ {u, w} ∈ E(G) : u, w adjacent to v }| / ( deg(v) choose 2 ), if deg(v) ≥ 2, and c(v) := 0 otherwise,

where E(G) denotes the edge set of G and deg(v) is the degree of vertex v. That is, provided v has degree at least two, c(v) equals the number of edges that are actually present between the neighbours of v divided by the number of edges that could possibly be present between the neighbours given the degree of v. The clustering coefficient of G is now defined as the average of c(v) over all vertices v:

c(G) := (1/|V(G)|) Σ_{v ∈ V(G)} c(v).

As mentioned above, Gugelmann et al. [21] have established that c(G(n; α, ν)) is non-vanishing a.a.s., but they left open the question of convergence. Theorem 1.1 below establishes that the clustering coefficient indeed converges in probability to a constant γ that we give explicitly as a closed form expression involving α, ν and several classical special functions.
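In code, these definitions read as follows (a minimal sketch; `adj` maps each vertex to its set of neighbours, and vertices of degree at most one contribute c(v) = 0):

```python
from itertools import combinations

def local_clustering(adj, v):
    """c(v): edges present between neighbours of v, divided by deg(v) choose 2."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention for vertices of degree at most one
    present = sum(1 for u, w in combinations(nbrs, 2) if w in adj[u])
    return present / (k * (k - 1) / 2)

def clustering_coefficient(adj):
    """c(G): the average of c(v) over all vertices of G."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)
```

For instance, on a 4-cycle with one diagonal, the two degree-2 vertices have c(v) = 1 and the two degree-3 vertices have c(v) = 2/3.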
In addition to the clustering coefficient, we shall also be interested in the clustering function. This assigns to each non-negative integer k the value

c(k) = c(k; G) := (1/N(k)) Σ_{v : deg(v) = k} c(v) if N(k) > 0, and c(k; G) := 0 otherwise,

where N(k) denotes the number of vertices of degree exactly k in G. In other words, the clustering function assigns to the integer k the average of the local clustering coefficient over all vertices of degree k. We remark that, while it might seem natural to consider c(k) to be "undefined" when N(k) = 0, we prefer to use the above definition for technical convenience. This way c(k; G(n; α, ν)) is a plain vanilla random variable and we can for instance compute its moments without any issues. Krioukov et al. state ([25], last sentence on page 036106-10) that as k tends to infinity, the clustering function decays as k^{−1}. This seems to be based on computations that were not included in the paper. Despite the attention the KPKVB model has generated since then, the behaviour of the clustering function in KPKVB random graphs has not been rigorously determined yet. In particular it has not been established whether it converges as n → ∞ to some suitable limit function. Theorems 1.2 and 1.3 below settle this question. Theorem 1.2 shows that for each fixed k, the value c(k; G(n; α, ν)) converges in probability to a constant γ(k) that we again give explicitly as a closed form expression involving α, ν and several classical special functions. Theorem 1.3 extends this result to growing sequences satisfying 1 ≪ k_n ≪ n^{1/(2α+1)}. Proposition 1.4 clarifies the asymptotic behavior of the limiting function γ(k), as k → ∞. This depends on the parameter α, and γ(k) only scales with k^{−1} when α > 3/4, which corresponds to the exponent of the degree distribution exceeding 5/2. So in particular our findings contradict the above-mentioned claim of Krioukov et al. [25].
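The clustering function, with the convention c(k) := 0 when N(k) = 0, can be sketched the same way (the helper is re-declared here so the snippet is self-contained):

```python
from itertools import combinations

def local_clustering(adj, v):
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    present = sum(1 for u, w in combinations(nbrs, 2) if w in adj[u])
    return present / (k * (k - 1) / 2)

def clustering_function(adj, k):
    """c(k; G): average of c(v) over vertices of degree exactly k, and 0 if N(k) = 0."""
    vs = [v for v in adj if len(adj[v]) == k]
    if not vs:
        return 0.0  # the convention adopted above when N(k) = 0
    return sum(local_clustering(adj, v) for v in vs) / len(vs)
```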

Notation
In the statement of our main results, and throughout the rest of the paper, we will use the following notations. We set ξ := 4αν/(π(2α − 1)).

The clustering coefficient
Our first main result shows the convergence of the local clustering coefficient in the KPKVB model and establishes the limit exactly.
In the above expression for γ, a factor α − 1 occurs in the denominator of each term, but we will see that this corresponds to a removable singularity. We have not been able to find a closed form expression in terms of standard functions in the case when α = 1, but in Section 3.2.4 we do provide an explicit expression involving integrals.
A plot of γ(k), together with the results of computer experiments, can be found in Figure 3. Again, we remark that the above expression for γ(k) appears to have a singularity at α = 1, but this will turn out to be a removable singularity. Again, we have not been able to find a closed form expression in terms of standard functions in the case when α = 1, but in Section 3.2.4 we do provide an explicit expression involving integrals. Theorem 1.2 in fact generalises to increasing sequences (k_n)_{n≥1}.

Theorem 1.3. Let α > 1/2, ν > 0 be fixed and let (k_n)_n be a sequence of non-negative integers satisfying 1 ≪ k_n ≪ n^{1/(2α+1)}. Then, writing G_n := G(n; α, ν), we have

The conclusion of Theorem 1.3 is slightly stronger than c(k_n; G_n)/γ(k_n) → 1 in probability as n → ∞, which might alternatively be written as c(k_n; G_n) = (1 + o(1))γ(k_n) a.a.s., using notation that is common in the random graphs community.

Scaling of γ(k)
To clarify the scaling behaviour of γ(k) we offer the following result.

Proposition 1.4. As k → ∞, we have γ(k) ∼ c_{α,ν} · k^{2−4α} if 1/2 < α < 3/4, γ(k) ∼ c_{α,ν} · log(k) · k^{−1} if α = 3/4, and γ(k) ∼ c_{α,ν} · k^{−1} if α > 3/4, where c_{α,ν} is a constant depending on α and ν.

Theorem 1.3 states that the clustering function of the KPKVB model scales as γ(k) as the number of vertices n → ∞, and Proposition 1.4 makes clear how γ(k) behaves as k grows. In particular, these results contradict the scaling claimed in [25] for α ≤ 3/4, and confirm it only for α > 3/4. We remark that, simultaneously and independently, Stegehuis, van der Hofstad and van Leeuwaarden [35] used a completely different technique to obtain a similar, though less detailed, result on the k → ∞ scaling of the clustering function in the KPKVB model.

Additional observations and results
There are a few additional remarks we would like to make regarding our results. The reader may already have observed that, with a power-law exponent of 2α + 1 for the probability mass function of the degree sequence, we would expect Θ(n · k^{−(1+2α)}) = o(1) vertices of degree exactly k whenever k ≫ n^{1/(1+2α)}. This is the reason why in Theorem 1.3 we restrict ourselves to sequences k_n with k_n ≪ n^{1/(1+2α)}. When k_n ≫ n^{1/(1+2α)} there are a.a.s. no vertices of degree exactly k_n, which in particular implies that the clustering function equals zero a.a.s. for any such sequence k_n.
As mentioned previously, Gugelmann et al. [21] gave a mathematically rigorous result on the degree sequence, which can of course be rephrased as a result on the number of nodes with degree exactly k. Their results allow k = k_n to grow with n, but unfortunately require that k_n ≤ n^δ with δ < min( (2α − 1)/(4α(2α + 1)), 2(2α − 1)/(2α + 1) ). For completeness we offer the following result, which extends that of Gugelmann et al. to the full range 1 ≤ k_n ≤ n − 1.
Theorem 1.5. Let α > 1/2, ν > 0, denote by N_n(k) the number of vertices with degree k in the KPKVB model G(n; α, ν) and consider a sequence of integers (k_n)_n with 0 ≤ k_n ≤ n − 1.
Transition in scaling at α = 3/4.
Proposition 1.4 demonstrates that there is a transition in the scaling of the local clustering function at α = 3/4. This corresponds to an exponent 5/2 for the probability mass function of the degree distribution. This transition is different from those often observed for networks with scale-free degree distributions, where transitions occur at integer values of the exponent. At this point, it is unclear what the underlying reason is for the appearance of the transition at this particular half-integer exponent. Interestingly, a similar transition point has also been observed for both majority vote models [11] and flocking dynamics [29] on networks with scale-free degree distributions.

Uniform convergence.
Our results for the local clustering function in fact imply uniform convergence of c(k; G_n) for all 2 ≤ k ≤ a_n, where a_n ≪ n^{1/(2α+1)}. To see this we first observe that, for fixed k, the statement c(k; G_n) → γ(k) in probability is given by Theorem 1.2, while Theorem 1.3 applies along any sequence (b_n)_n with b_n ≤ a_n ≪ n^{1/(2α+1)}.

Outline of the paper
In the next section we will recall some useful tools from the literature and define a series of auxiliary random graph models that will be used in the proofs. In particular, we relate in a series of steps the KPKVB model to an infinite percolation model G∞ that was used in previous work of the first and third authors [17] on the largest component of the KPKVB model. The limiting constant γ, respectively the limiting clustering function γ(k), corresponds to the probability that two randomly chosen neighbours of a "typical point" in this infinite model are themselves neighbours, respectively to the probability of this event conditional on the typical point having exactly k neighbours. These probabilities can be expressed as certain integrals, which we solve explicitly in Section 3. In the same section we also prove Proposition 1.4, on the asymptotics of γ(k). We then proceed to prove Theorems 1.1 and 1.2 by relating said probabilities for the typical point of the infinite model to the corresponding clustering coefficient/function in the original KPKVB random graph, using the Campbell-Mecke formula and some other, relatively straightforward considerations. In Section 5 we prove Theorem 1.5, which also doubles as a warm-up for the proof of Theorem 1.3, on the clustering function for growing k. The remaining sections are devoted to the proof of Theorem 1.3, which turns out to be technically involved. The main reason for this is that when we push k_n close to the maximum possible value, a great deal of work is needed to properly control the arising error terms.
The Appendix includes some auxiliary results on Meijer's G-function, Chernoff bounds for Poisson and Binomial random variables and the code used for simulations.

Preliminaries
In this section we recall some definitions and tools that we will need in our proofs.

The infinite limit model G ∞
We start by recalling the definition of the infinite limit model from [17]. Let P = P_{α,ν} be a Poisson point process on R² with intensity function f = f_{α,ν} given by

f(x, y) = (αν/π) e^{−αy} for y > 0, and f(x, y) = 0 otherwise.

The infinite limit model G∞ = G∞(α, ν) has vertex set P, and two points p = (x, y), p′ = (x′, y′) ∈ P are adjacent if and only if |x − x′| ≤ e^{(y+y′)/2}. For any point p ∈ R × (0, ∞), we write B∞(p) to denote the ball around p, i.e.

B∞(p) := { (x′, y′) ∈ R × (0, ∞) : |x − x′| ≤ e^{(y+y′)/2} }.
With this notation we then have that B∞(p) ∩ P denotes the set of neighbours of a vertex p ∈ G∞. We will denote the intensity measure of the Poisson process P by µ = µ_{α,ν}, i.e. for every Borel-measurable subset S ⊆ R² we have µ(S) = ∫_S f(x, y) dx dy. Using the notation p = (x, y) for a point in R × R_+, we shall write ∫_S h(p) dµ(p) for the integral of h over S with respect to the intensity measure µ, i.e. ∫_S h(p) dµ(p) = ∫_S h(x, y) f(x, y) dx dy.
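A sketch of sampling G∞ restricted to a finite window is given below. It assumes, as in [17], the intensity f(x, y) = (αν/π) e^{−αy} on R × (0, ∞) and the edge rule |x − x′| ≤ e^{(y+y′)/2}; under this intensity, points in a strip [−w, w] × (0, ∞) have uniform x-coordinates and Exp(α) heights.

```python
import math
import random

def sample_G_inf_window(alpha, nu, w, seed=1):
    """Sample G_inf restricted to the vertical strip [-w, w] x (0, inf).

    Assumes intensity f(x, y) = (alpha*nu/pi) * exp(-alpha*y) and the edge rule
    |x - x'| <= exp((y + y')/2); edges to points outside the strip are ignored.
    """
    rng = random.Random(seed)
    mass = 2 * w * nu / math.pi  # integral of f over the strip
    # Knuth's method for a Poisson(mass) number of points
    n, p, L = 0, 1.0, math.exp(-mass)
    while True:
        p *= rng.random()
        if p <= L:
            break
        n += 1
    # conditionally on n, the points are i.i.d.: x uniform, y exponential(alpha)
    pts = [(rng.uniform(-w, w), rng.expovariate(alpha)) for _ in range(n)]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)
             if abs(pts[i][0] - pts[j][0]) <= math.exp((pts[i][1] + pts[j][1]) / 2)]
    return pts, edges
```

Ignoring edges that cross the strip boundary introduces a truncation effect, so this window sampler only approximates local statistics of G∞.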

The finite box model G box
Recall that in the definition of the KPKVB model we set R = 2 log(n/ν). We consider the box R = (−(π/2)e^{R/2}, (π/2)e^{R/2}] × (0, R] in R². Then the finite box model G_box = G_box(n; α, ν) has vertex set V_box := P ∩ R, and two points p = (x, y), p′ = (x′, y′) ∈ V_box are adjacent if and only if |x − x′|_{πe^{R/2}} ≤ e^{(y+y′)/2}, where |x|_r = min(|x|, r − |x|) for −r ≤ x ≤ r. Using |·|_{πe^{R/2}} instead of |·| results in the left and right boundaries of the box R getting identified, which in particular makes the model invariant under horizontal shifts and reflections in vertical lines. The graph G_box can thus be seen as a subgraph of G∞ induced on V_box, with some additional edges caused by the identification of the boundaries. Similarly to the infinite graph, for a point p ∈ R we define the ball B_box(p) as

B_box(p) := { p′ ∈ R : |x − x′|_{πe^{R/2}} ≤ e^{(y+y′)/2} }.
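The boundary identification via |·|_r amounts to measuring horizontal distance on a circle of circumference r; a hypothetical helper (with r = πe^{R/2} in the box model) is:

```python
def circ_dist(x1, x2, r):
    """|x1 - x2|_r = min(d, r - d) with d = |x1 - x2| reduced mod r:
    the horizontal distance after identifying the left and right boundaries."""
    d = abs(x1 - x2) % r
    return min(d, r - d)
```

For example, points at x = −0.4 and x = 0.4 in a box of circumference 1 are at distance 0.2, via the identified boundary, rather than 0.8.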

The Poissonized KPKVB model G Po
Imagine that we have an infinite supply of i.i.d. points u_1, u_2, … in the hyperbolic plane H chosen according to the (α, R)-quasi uniform distribution. In the standard KPKVB random graph G(n; α, ν) we take u_1, …, u_n as our vertex set and add edges between points at hyperbolic distance at most R = 2 log(n/ν). In the Poissonized KPKVB random graph G_Po := G_Po(n; α, ν), we instead take N, a Poisson random variable with mean n, independent of our i.i.d. sequence of points, let the vertex set be u_1, …, u_N and add edges according to the same rule as before. Equivalently, we could say that the vertex set consists of the points of a Poisson point process with intensity function n·g, where g denotes the probability density of the (α, R)-quasi uniform distribution. Working with the Poissonized model has the advantage that when we take two disjoint regions A and B, then the number of points in A and the number of points in B are independent Poisson-distributed random variables. As we will see, and as is to be expected, switching to the Poissonized model does not significantly alter the limiting behaviour of the clustering coefficient and function.
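The independence of counts in disjoint regions is easy to check empirically. The snippet below uses uniform positions on [0, 1] as a stand-in for the quasi-uniform points, since only the Poissonization of the total number of points matters for this property:

```python
import math
import random

rng = random.Random(7)

def poisson_rv(lam):
    # Knuth's method; fine for moderate lam
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

# Poissonized sampling: take N ~ Po(n) i.i.d. points instead of exactly n points.
# Counts of points in disjoint regions are then independent Poisson variables.
n, reps = 50, 4000
left, right = [], []
for _ in range(reps):
    N = poisson_rv(n)
    xs = [rng.random() for _ in range(N)]  # stand-in for the quasi-uniform positions
    a = sum(1 for x in xs if x < 0.5)
    left.append(a)
    right.append(N - a)

m_l = sum(left) / reps
m_r = sum(right) / reps
cov = sum((a - m_l) * (b - m_r) for a, b in zip(left, right)) / reps
print(round(m_l, 2), round(m_r, 2), round(cov, 2))  # both means ~ n/2, covariance ~ 0
```

With exactly n points the two counts would sum to n and have covariance −n/4; Poissonization removes this dependence.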

Coupling G Po and G box
The following lemmas from [17] establish a useful coupling between the Poissonized KPKVB random graph and the finite box model, and relate the edge sets of the two graphs.

Lemma 2.1 ([17, Lemma 27]). Let V_Po denote the vertex set of G_Po(n; α, ν) and V_box the vertex set of G_box(n; α, ν). Define the map Ψ : [0, R] × (−π, π] → R by ... Then there exists a coupling such that, a.a.s., ...

In the remainder of this paper we will write B(p) to denote the image under Ψ of the ball of hyperbolic radius R around the point Ψ^{−1}(p), for p ∈ R, i.e. ...
Under the map Ψ, a point p = (x, y) ∈ R corresponds to u := Ψ^{−1}(p) = (2e^{−R/2}x, R − y). By the hyperbolic law of cosines, for two points p = (x, y) = Ψ((r, θ)), p′ = (x′, y′) = Ψ((r′, θ′)) ∈ R we have that p′ ∈ B(p) if and only if either r + r′ ≤ R, or r + r′ > R and

cosh(r) cosh(r′) − sinh(r) sinh(r′) cos(|θ − θ′|_{2π}) ≤ cosh(R).

This can be rephrased as p′ ∈ B(p) if and only if either y + y′ ≥ R, or y + y′ < R and ...

The following lemma provides useful bounds on the function Φ(r, r′). Note that in [17] the function Φ is written in terms of r := R − y, r′ := R − y′.

Lemma 2.2 ([17, Lemma 28]). There exists a constant K > 0 such that, for every ε > 0 and for R sufficiently large, the following holds. For every r, r′ ∈ [εR, R] with y + y′ < R we have that ... Moreover: ...

A key consequence of Lemma 2.2 is that the coupling from Lemma 2.1 preserves edges between points whose heights are not too large.

Lemma 2.3 ([17, Lemma 30]). On the coupling space of Lemma 2.1 the following holds a.a.s.: 1. for any two points p, p′ ∈ V_box with y, y′ ≤ R/2, we have ... 2. for any two points p, p′ ∈ V_box with y, y′ ≤ R/4, we have that ...

Remark 2.1 (Notational convention for points). We will often be working with the finite box graph G_box or the infinite graph G∞, whose nodes are points in R × R_+. For any point p ∈ R × R_+ we will always write p = (x, y). When considering different points p, p′ ∈ R × R_+, we will use primed coordinates to refer to p′, i.e. p′ = (x′, y′), and similarly with subscripts, i.e. p_i = (x_i, y_i).

The Campbell-Mecke formula
A useful tool for analyzing subgraph counts, and their generalizations, in the setting of Poissonized random geometric graphs, and in particular the Poissonized KPKVB model and the box model, is the Campbell-Mecke formula. We use a specific incarnation, which follows from the Palm theory of Poisson point processes on metric spaces; see [26]. For this, consider a Poisson point process P on some metric space M with intensity measure µ and let N denote the set of all possible point configurations in M, equipped with the σ-algebra of the process P. Then, for any natural number k and measurable function h : M^k × N → [0, ∞):
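The displayed identity is missing from this excerpt; for reference, in the case k = 1 the Campbell-Mecke (Mecke) formula takes the following standard form, the multivariate version being obtained by summing over k-tuples of distinct points of P and integrating against µ^{⊗k}:

```latex
\mathbb{E}\left[\sum_{p \in \mathcal{P}} h(p, \mathcal{P})\right]
  = \int_{\mathcal{M}} \mathbb{E}\left[h\left(p, \mathcal{P} \cup \{p\}\right)\right] \,\mathrm{d}\mu(p).
```

In words: to compute the expected sum of a score h over the points of P, one may instead integrate, against the intensity measure, the expected score of a deterministic point added to the process.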

Concentration of heights
When analyzing degrees and clustering in the Poissonized KPKVB and related models we often encounter expressions of the form

∫_0^R P(Po(μ̄(y)) = k_n) h(y) e^{−αy} dy,

where h(y) is some function and μ̄(y) is µ(B(y)), µ(B_box(y)) or µ(B∞(y)). We will often have to either bound the behavior of such integrals as k_n → ∞ or establish their asymptotic behavior. For this we will utilize the fact that Poisson random variables are well concentrated around their mean.
In particular, if λ = λ_n → ∞, then for any C > 0, ... For our application these Chernoff bounds imply that if y is such that μ̄(y) is far from k_n, then P(Po(μ̄(y)) = k_n) becomes very small. To be more specific, for any k ≥ 0 and C > 0 we define

y^±_{k,C} := 2 log((k ± C√(k log k))/ξ),

where we set y^−_{k,C} = 0 if k − C√(k log k) < ξ, and likewise we set y^+_{k,C} = 0 if k + C√(k log k) < ξ; note that, as we consider k → ∞, we can assume that the latter case does not occur. For convenience we write K_C(k_n) := [y^−_{k_n,C}, y^+_{k_n,C}]. Then we can show that for all y outside K_C(k_n) ... Since we can select C to be as big as we want, we can make this error as small as needed. This implies that the main contribution to the integral (12) comes from those "heights" y that are in the interval K_C(k_n). In other words, the main contribution is concentrated around the heights y for which μ̄(y) = k_n. We thus refer to this as the concentration of heights result. More precisely, we prove the following.
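A quick numerical illustration of why heights outside K_C(k) are negligible: the Poisson pmf at k is only polynomially small in k when the mean equals k, but becomes super-polynomially small once the mean leaves the window k ± C√(k log k).

```python
import math

def poisson_pmf(lam, k):
    # evaluated in log-space so that large k and lam do not overflow
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

k, C = 10_000, 3.0
w = C * math.sqrt(k * math.log(k))
inside = poisson_pmf(k, k)            # mean equal to k: about 1/sqrt(2*pi*k)
outside = poisson_pmf(k + 2 * w, k)   # mean well outside k +- C*sqrt(k log k)
print(inside, outside)
```

Here `inside` is of order 10^{-3} while `outside` is smaller than 10^{-40}, which is the mechanism behind restricting the integral to K_C(k_n).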
The key implication of Proposition 2.4 is that if the function h(y) does not increase too fast, then we can restrict integration to the interval K_C(k_n). The full details of this concentration of heights argument and the proof of Proposition 2.4 can be found in Section E of the Appendix.
Clustering and the degree of the typical point in G∞
As alluded to earlier, we plan to make use of the Campbell-Mecke formula for comparing the clustering coefficient and function of the (Poissonized) KPKVB random graph with certain quantities associated with G∞. We will be considering the Poisson process P to which we add one additional point (0, y) on the y-axis. In some computations the height y will be fixed, but eventually we shall take it exponentially distributed with parameter α, and independent of P. We refer to (0, y) as "the typical point".
To provide some intuition for this definition and name, note that we can alternatively view P as follows. We take a constant intensity Poisson process on R corresponding to the x-coordinates, and to each point we attach a random "mark", corresponding to the y-coordinate, where the marks are i.i.d. exponentially distributed with parameter α.
Since c(G) is defined as an average over all vertices of the graph, it is not immediately obvious how to meaningfully define a corresponding notion for infinite graphs, and similarly for the clustering function, the degree sequence, etc. We can however without any issues speak of the (expected) clustering coefficient of the typical point, or the expected clustering coefficient given that it has degree k, or the distribution of the degree of the typical point. (All considered in the graph obtained from G ∞ by adding the typical point to its vertex set.) If p = (x, y) ∈ R × [0, ∞) is a point, not necessarily part of the Poisson process, then we will write µ(y) = µ(p) := µ(B ∞ (p)).
Integrating the intensity function of P over B∞(p) gives µ(p) = ξ e^{y/2}.
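This identity is easy to verify numerically, under the assumptions used in this sketch: intensity (αν/π) e^{−αy′} and a ball of horizontal half-width e^{(y+y′)/2} at height y′.

```python
import math

def mu_ball(y, alpha, nu, ymax=200.0, steps=200000):
    # mu(B_inf(y)) = int_0^inf 2*exp((y + yp)/2) * (alpha*nu/pi) * exp(-alpha*yp) dyp:
    # the width of the ball at height yp times the intensity there (midpoint rule)
    h = ymax / steps
    total = 0.0
    for i in range(steps):
        yp = (i + 0.5) * h
        total += 2 * math.exp((y + yp) / 2) * (alpha * nu / math.pi) * math.exp(-alpha * yp) * h
    return total

alpha, nu, y = 0.9, 1.3, 2.0
xi = 4 * alpha * nu / (math.pi * (2 * alpha - 1))
print(mu_ball(y, alpha, nu), xi * math.exp(y / 2))  # the two values agree
```

Note that the y′-integral converges only because α > 1/2, which is exactly the factor 2α − 1 in the denominator of ξ.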

The degree of the typical point
Before considering clustering we briefly investigate the distribution of the degree of the typical point.
For a point p = (x, y) and a non-negative integer k, we define ρ(p, k) := P(Po(µ(p)) = k), where Po(λ) denotes a Poisson random variable with mean λ. We will often write ρ(y, k) instead of ρ(p, k).
Let the random variable D denote the degree of the typical point. Since the typical point has a height that is independent of the Poisson process and exponential(α)-distributed,

π(k) := P(D = k) = ∫_0^∞ ρ(y, k) α e^{−αy} dy.
(Note that here we define π(k) as the probability that the degree of the typical point equals k.) Using the change of variables z = ξ e^{y/2} (so that dy = (2/z) dz), we compute

π(k) = 2α ξ^{2α} Γ^+(k − 2α, ξ) / k!,

where we recall that Γ denotes the gamma function and Γ^+ the upper incomplete gamma function. Note that, unsurprisingly, this is identical to the expression Gugelmann et al. [21] gave for the limiting degree distribution of G(n; α, ν). Using Stirling's approximation to the gamma function, we find that

π(k) ∼ 2α ξ^{2α} k^{−(2α+1)}, as k → ∞.

By a similar computation we have the following result, which will be useful later on. For any β > 0, as k → ∞,

∫_0^∞ e^{−βy} ρ(y, k) α e^{−αy} dy ∼ 2α ξ^{2(β+α)} k^{−2(β+α)−1}.
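The change of variables can be checked numerically. The closed form below, π(k) = 2αξ^{2α} Γ^+(k − 2α, ξ)/k!, is our reading of the elided display; to keep the check self-contained, the incomplete gamma function is itself evaluated by quadrature.

```python
import math

alpha, nu, k = 0.8, 1.0, 5
xi = 4 * alpha * nu / (math.pi * (2 * alpha - 1))

def pois_pmf(lam, j):
    return math.exp(j * math.log(lam) - lam - math.lgamma(j + 1))

# direct form: pi(k) = int_0^inf P(Po(xi*e^{y/2}) = k) * alpha*e^{-alpha*y} dy
h = 0.001
direct = sum(pois_pmf(xi * math.exp((i + 0.5) * h / 2), k)
             * alpha * math.exp(-alpha * (i + 0.5) * h) * h
             for i in range(60000))

# after z = xi*e^{y/2}: pi(k) = (2*alpha*xi^{2a}/k!) * int_xi^inf z^{k-2a-1} e^{-z} dz
hz = 0.001
gamma_inc = sum((xi + (i + 0.5) * hz) ** (k - 2 * alpha - 1)
                * math.exp(-(xi + (i + 0.5) * hz)) * hz
                for i in range(100000))
closed = 2 * alpha * xi ** (2 * alpha) * gamma_inc / math.factorial(k)

print(round(direct, 6), round(closed, 6))  # the two evaluations coincide
```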

The expected clustering coefficient of the typical point
Let the random variable C denote the clustering coefficient of the typical point (0, y), in the graph obtained from G∞ by adding (0, y). We now define

γ := E[C] and γ(k) := E[C | D = k],

where we take the expectation over both the Poisson point process P and the height y, which is exponential(α)-distributed, independently of the Poisson process P. We shall show shortly that these take on the values stated in Theorems 1.1 and 1.2.
For any fixed value y_0 > 0, the set of points inside B∞(y_0) is a Poisson process with intensity f · 1_{B∞(y_0)}. As µ(B∞(y_0)) = µ(y_0) = ξ e^{y_0/2} < ∞, this can be described alternatively by first picking N, distributed as Po(µ(y_0)), and then taking N i.i.d. points in B∞(y_0) according to the probability density f · 1_{B∞(y_0)}/µ(y_0). (That is, the intensity function of the Poisson point process, but set to zero outside of B∞(y_0) and re-normalized in such a way that it integrates to one.) Hence, if we condition on the event that y takes on some fixed value y_0 and that there are exactly k points of P inside B∞(y_0), then those k points behave like k i.i.d. points in B∞(y_0) chosen according to the mentioned re-normalized probability density function. This shows that, for every k ≥ 2,

E[C | y = y_0, D = k] = P(u_1 ∈ B∞(u_2)),

where u_1, …, u_k are i.i.d. points in B∞(y_0) with the above-mentioned density. Note that this does not depend on the value of k. For notational convenience, we will write P(y_0) := P(u_1 ∈ B∞(u_2)), with u_1, u_2 as above. We now observe that

γ(k) = ∫_0^∞ P(y) g_k(y) dy,

where g_k denotes the density of y conditional on D = k. That is,

g_k(y) = ρ(y, k) α e^{−αy} / π(k),

where we recall that ρ(y, k) = P(Po(µ(y)) = k) denotes the probability that a Poisson random variable with mean µ(y) is k. Hence,

γ(k) = (1/π(k)) ∫_0^∞ P(y) ρ(y, k) α e^{−αy} dy.

This also gives

γ = ∫_0^∞ P(y) (1 − ρ(y, 0) − ρ(y, 1)) α e^{−αy} dy.

A key step is to derive the following explicit expression for P(y).
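Under the same reconstruction of B∞ used in this sketch (half-width e^{(y_0+y′)/2} at height y′, intensity proportional to e^{−αy′}), the height of an i.i.d. point in B∞(y_0) has density proportional to e^{(1/2−α)y′}, i.e. it is Exp(α − 1/2), with the x-coordinate uniform given the height. This yields a simple Monte Carlo estimator of P(y_0):

```python
import math
import random

def estimate_P(y0, alpha, samples=20000, seed=3):
    """Monte Carlo estimate of P(y0) = P(u1 ~ u2) for u1, u2 i.i.d. in B_inf(y0).

    Assumes the ball at height yp has half-width exp((y0 + yp)/2) and that the
    re-normalized density makes the height Exp(alpha - 1/2), x uniform given it."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        y1 = rng.expovariate(alpha - 0.5)
        y2 = rng.expovariate(alpha - 0.5)
        x1 = rng.uniform(-1, 1) * math.exp((y0 + y1) / 2)
        x2 = rng.uniform(-1, 1) * math.exp((y0 + y2) / 2)
        if abs(x1 - x2) <= math.exp((y1 + y2) / 2):
            hits += 1
    return hits / samples

print(estimate_P(1.0, 0.9))  # a value strictly between 0 and 1
```

Such estimates can serve as a sanity check on the closed form expression for P(y_0) derived below.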
We split the proof of this lemma into a couple of smaller pieces. We begin with the following lemma.
The previous lemma covers the case when y 0 < y 1 < y 2 . We now leverage it to take care of the other cases as well.
To complete the proof for the other cases we note that since P(y_0, y_1, y_2) is symmetric in y_1 and y_2, we can assume, without loss of generality, that y_1 < y_2. Then there are two more orderings of y_0, y_1, y_2, namely y_1 < y_0 < y_2 and y_1 < y_2 < y_0, which can be summarized as y_1 < min(y_0, y_2), or equivalently z_1 > max(z_0, z_2). For y_1 < y_0 < y_2 and y_1 < y_2 < y_0 we can apply Lemma 3.3 to obtain P(y_1, y_0, y_2) = P(y_1, y_2, y_0), which happen to agree due to the symmetry in the last two arguments of the expression found in Lemma 3.3. The expression for P(y_0, y_1, y_2) then follows from (25).
As the function x → x² is increasing in x for x > 0, the function x → e^{−(y_1+y_2)x} is decreasing in x, and P(y_0, y_1, y_2) ∈ [0, 1], it holds that ... An application of the dominated convergence theorem then yields that P_{α_n}(y_0) → P_α(y_0), which gives the claim, as the sequence (α_n)_n was arbitrary.
Due to this lemma we can first assume α ∉ {3/4, 1}, compute P(y_0), and then obtain the values of P(y_0) at the remaining two points by taking the corresponding limit in α. This strategy is executed below. It involves the computation of several integrals, which are rather involved and will take up a few pages. The proof is structured using headers, to aid the reader.
The integral I_{11}(y_0) is easily obtained: ... where we have used the substitution u := z_1/z_0 (giving z_0 du = dz_1) in the penultimate line, and B^− denotes the (lower) incomplete beta function. Note that since c ≥ −1, −a ∈ {0, −1, −2} and by our assumption α ∉ {3/4, 1}, the denominators that occur during the integration are all non-zero. Plugging this back into (28) gives ... For the last step we use the identities ... to obtain ...

Computing I_2(y_0). We will follow a similar strategy as for I_1(y_0). First, using the change of variables z_i := e^{−y_i/2}, i = 1, 2, we get ... and write ...

We now compute J_{a,b,c}(z_0): ... Here we used that for x ∈ R, y > −1 (note that as c ≥ −1, it holds that c + 2α − 1 > −1): ... As c ≥ −1, −a ∈ {0, −1, −2} and by our assumption α ≠ 3/4, the denominators that occur during the computations above are non-zero.
Plugging the expression for J_{a,b,c}(z_0) back into (33) we get ... Using some algebra and the identities (29) and (30) this can be reduced to ...

Combining the results for I_1(y_0) and I_2(y_0). Combining the results for I_{11}(y_0), I_{12}(y_0), I_{21}(y_0) and I_{22}(y_0) we get, after some algebra, an explicit expression for P(y_0) as a linear combination of terms of the form ... Observe that the above expression only contains factors of the form α − 1 in the denominators. The only expression involving α − 3/4 is the lower incomplete beta function B^−(1 − z_0; 2α, 3 − 4α), which appears twice in the expression for P(y_0).

The case of α = 3/4
Note that the factor $\alpha - \frac{3}{4}$ does not occur in any denominator of the previously obtained expression. For the lower incomplete beta function, the last argument $3 - 4\alpha$ is zero at $\alpha = \frac{3}{4}$; however, as $z_0 < 1$, the integration domain of the lower incomplete beta function does not touch the singularity at $t = 1$. Therefore, the previous expression holds in this case as well.
We will thus compute $J$ and $I^{(k)}$. It will be helpful to change coordinates to $z := e^{-y/2}$. This yields We shall assume $\alpha \neq 1$. We observe from Lemma 3.1 that for $\alpha \neq 1$, $P(y(z))$ is in fact a linear combination of terms of the form $z^u$, $(1-z)^u$ and $z^u B_-(1-z; v, w)$.
To compute $J$ we observe that, by integration by parts, This takes care of the two integrands involving the beta function in $P(y)$. The other integrals are easily computed and yield the following expression for $J$ (note that it depends only on $\alpha$, not on $\nu$): We proceed to work out $I^{(k)}$. For this we will compute the integrals of the terms in $P(y(z))$ of the form $z^u$, $(1-z)^u$ and $B_-(1-z; v, w)$ separately. We first point out that for any $0$ In particular where $\Gamma_+$ denotes the (upper) incomplete gamma function, and we have used the substitution $t = \xi/z$, which gives $dz = -\xi t^{-2}\,dt$. (And of course it is understood that $\xi/0 = \infty$.) This takes care of the integrals of all terms in $P(y(z))$ of the form $z^u$.
Next we consider the integrals of the terms in $P(y(z))$ of the form $(1-z)^u$. For this we need the hypergeometric U-function (also called Tricomi's confluent hypergeometric function), which has the integral representation which holds for $a, b, z \in \mathbb{C}$ with $b \notin \mathbb{Z}_{\leq 0}$ and $\mathrm{Re}(a), \mathrm{Re}(z) > 0$, see [15, p. 255]. Applying the change of variables $t = \frac{1-s}{s}$ (i.e. $dt = -s^{-2}\,ds$ and $s = \frac{1}{t+1}$) yields Finally we need to deal with the terms in $P(y(z))$ that involve the incomplete beta function. Let $a, c \in \mathbb{R}$ and let $\xi, b > 0$ be positive real numbers. Using the integral definition of the incomplete beta function, the change of variables $s = 1 - t$ gives: Then changing the order of integration, using the substitution $u = \xi/z$ and recognizing the upper incomplete gamma function yields To compute this last integral we make use of the fact that the incomplete gamma function has a representation in terms of Meijer's G-function (see Lemma A.1 in Appendix A), which holds for any $a \in \mathbb{R}$ and $s > 0$ (note that, for a fixed second argument, the upper incomplete gamma function is entire in the first argument; see [20, pp. 899, 1032ff.]). We can now evaluate the integral in (37) using several identities for Meijer's G-function. First, inserting the expression for the incomplete gamma function into (37) gives Next we apply the inversion identity for Meijer's G-function (see [15, p. 209]) This expression is actually the Euler transform of Meijer's G-function (see [15, p. 214, 5.5.2.(5)]), which applies here since $2 + 1 < 2(0 + 2)$ and $|\arg(\xi^{-1})| < \frac{\pi}{2}$ (as $\xi > 0$). Using again the inversion identity for Meijer's G-function we now get Using equations (35), (36) and (38) we get With the expressions for $J$ and $I^{(k)}$, and using $\Gamma_*(q, z) = \Gamma_+(q+1, z) + \Gamma_+(q, z)$, we now obtain, after some algebra, the expression for $\gamma$: $\gamma = J - I^{(0)} - I^{(1)}$.
which is the expression in Theorem 1.1. Similarly, we get which equals the expression in Theorem 1.2.
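The integral representation of Tricomi's U-function quoted above can be sanity-checked numerically (our own illustration, not part of the proof): in the special case $b = a + 1$ one has $U(a, a+1, z) = z^{-a}$ in closed form, and the defining integral $\frac{1}{\Gamma(a)} \int_0^\infty e^{-zt} t^{a-1} (1+t)^{b-a-1}\,dt$ should reproduce it. The parameter values below are illustrative.

```python
import math

def u_integral(a, b, z, t_max=100.0, steps=200000):
    # midpoint quadrature of the integral representation of U(a, b, z);
    # the integrand decays like e^{-z t}, so truncation at t_max is harmless
    h = t_max / steps
    total = sum(math.exp(-z * t) * t ** (a - 1) * (1 + t) ** (b - a - 1) * h
                for t in ((i + 0.5) * h for i in range(steps)))
    return total / math.gamma(a)

a, z = 1.5, 2.0
assert abs(u_integral(a, a + 1, z) - z ** (-a)) < 1e-5   # U(a, a+1, z) = z^{-a}
```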
We have already established that the values of $\gamma$ and $\gamma(k)$ at $\alpha = 1$ can be obtained by taking the $\alpha \to 1$ limit of the expressions derived for $\alpha \neq 1$. Here we derive an alternative explicit expression for completeness. Since the rest of our proofs does not depend on it, the reader may skip this section on a first reading. Recall that $\Gamma_*(q, z) = \Gamma_+(q+1, z) + \Gamma_+(q, z)$. We will prove the following.
with $\eta = 4\nu/\pi$ and $\mathrm{Li}_2(z) = \sum_{t=1}^\infty z^t/t^2$ the dilogarithm function. Naturally, the proof proceeds by proving the analogue of Lemma 3.1: where $\mathrm{Li}_2(z) = -\int_0^z \frac{\ln(1-t)}{t}\,dt$ is the dilogarithm function.
Proof. We want to compute the limit $\lim_{\alpha \to 1} P_\alpha(y_0(z_0))$. For $\alpha \neq 1$, we label the terms as follows: Now we consider the functions $s_i(\alpha) = s_i(\alpha, z_0)$ as functions of $\alpha$ only and compute their Taylor expansions at $\alpha = 1$, up to linear order for $i \in \{1, 2, 5, 6, 7\}$ and up to quadratic order for $i \in \{3, 4\}$, i.e. we write Using these expansions, we can rewrite In order to continue, we compute: Based on this we see that and Finally, it follows that as $\alpha \to 1$, Therefore, the desired value of $\lim_{\alpha \to 1} P(y_0(z_0))$ is given by where we used that and By expanding the squares and collecting terms, the last expression can be simplified to which finishes the computation.
Proof of Proposition 3.5. It suffices to find the values of $J$ and $I^{(k)}$ at $\alpha = 1$. We can do this by computing the integrals with the expression for $P(y)$ that we found for $\alpha = 1$, i.e. and where $\eta = \frac{4\nu}{\pi}$ and $\mathrm{Li}_2(z) = \sum_{t=1}^\infty z^t/t^2$ is the dilogarithm function. Plugging this into (23) and (22) yields the expressions in the statement of the proposition.
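The dilogarithm appearing in these expressions admits a quick stdlib sanity check (ours, for illustration only): the series $\sum_{t \geq 1} z^t/t^2$ agrees with the integral form $-\int_0^z \frac{\ln(1-u)}{u}\,du$ and with the classical closed value $\mathrm{Li}_2(\tfrac12) = \frac{\pi^2}{12} - \frac{(\ln 2)^2}{2}$.

```python
import math

def li2_series(z, terms=200):
    # defining series Li_2(z) = sum_{t>=1} z^t / t^2
    return sum(z ** t / t ** 2 for t in range(1, terms + 1))

def li2_integral(z, steps=100000):
    # midpoint rule for the integral form -int_0^z ln(1-u)/u du
    h = z / steps
    return -sum(math.log(1 - (i + 0.5) * h) / ((i + 0.5) * h) * h
                for i in range(steps))

assert abs(li2_series(0.5) - li2_integral(0.5)) < 1e-6
assert abs(li2_series(0.5) - (math.pi ** 2 / 12 - math.log(2) ** 2 / 2)) < 1e-9
```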

The proof of Proposition 1.4
Instead of extracting the scaling of $\gamma(k)$ from its explicit expression, it turns out to be more convenient to derive it using $P(y)$. Recall that The asymptotic behavior of the denominator is given by (19). Hence, the main term to consider is the numerator $\int_0^\infty P(y)\,\rho(y, k)\,\alpha e^{-\alpha y}\,dy$, and in particular the function $P(y)$. We therefore start by establishing the asymptotic behavior of the latter. First we combine (19) and (20) to obtain the following scaling result for the ratio $\int_0^\infty e^{-\beta y}\rho(y, k)\,\alpha e^{-\alpha y}\,dy \big/ \int_0^\infty \rho(y, k)\,\alpha e^{-\alpha y}\,dy$:

Proposition 3.7 (Asymptotic behavior of $P(y)$). Let $\alpha > \frac{1}{2}$, $\nu > 0$ and let $c_{\alpha,\nu}$ be as defined in Proposition 1.4. Then, as $y \to \infty$, and, for $\alpha > \frac{3}{4}$, Proof. We shall deal with each of the three cases for $\alpha$ separately.
$\frac{1}{2} < \alpha < \frac{3}{4}$

Consider again the variable $z = e^{-y/2}$ and note that $z \to 0$ as $y \to \infty$. Because for any $b < 1$, with $c_{\alpha,\nu}$ as defined in Proposition 1.4. The proof now follows since, for $\frac{1}{2} < \alpha < \frac{3}{4}$, the remaining three terms go to zero as $y \to \infty$. For the first of these terms this is true since as $y \to \infty$ and $\frac{1}{2} < \alpha < \frac{3}{4}$.

$\alpha = \frac{3}{4}$

Similarly to the previous case, we use Lemma 3.1 to obtain (evaluating the expressions First we note that, as $y \to \infty$, which implies that We can now conclude that all terms in $\frac{2}{y} e^{y/2} P(y)$ except the first one are $o(1)$ as $y \to \infty$. By writing $z = e^{-y/2}$ we can rewrite the first term as We therefore conclude that as $y \to \infty$.
$\alpha > \frac{3}{4}$

We first deal with the case $\alpha = 1$. Here it follows from Lemma 3.6 that $e^{y/2} P(y) =$ The last three terms are $o(1)$ as $y \to \infty$, while $2 = (\alpha - \frac{1}{2})/(\alpha - \frac{3}{4})$ at $\alpha = 1$. Next we deal with the case $\alpha > \frac{3}{4}$, $\alpha \neq 1$. For simplicity we write Then, by Lemma 3.1, we get The first term is constant, while the last two terms go to zero as $y \to \infty$. We therefore focus on the remaining two terms. For the first we have, see (40), Therefore it follows, see Lemma B.1, that .
We conclude that, as $y \to \infty$, which finishes the proof.
With the asymptotic behavior of $P(y)$ established, we are ready to prove Proposition 1.4. Recall that for any $C > 0$ we defined Since $P(y) \leq 1$, the concentration of heights results (Proposition 2.4) give that, as $k \to \infty$, Note that this implies that if as $y \to \infty$.
Proof of Proposition 1.4. We split the proof over the different cases for α.
$\alpha = \frac{3}{4}$

Similarly to the previous case, Proposition 3.7 and (42) imply that, as $k \to \infty$, However, the final step does not follow immediately from (39) because of the additional logarithmic term. To deal with this we observe the following upper bound on $\int_0^\infty e^{-y/2}\rho(y, k)\,\alpha e^{-\alpha y}\,dy$, and similarly a lower bound Now observe that, as $k \to \infty$, so it follows from (43) that, as $k \to \infty$,

$\alpha > \frac{3}{4}$

Again, by Proposition 3.7, equation (42) and (39) with $\beta = 1/2$, it follows that, as $k \to \infty$,

4 Proofs of Theorem 1.1 and Theorem 1.2

We will first derive Theorem 1.2. It will turn out that Theorem 1.1 has a quick derivation assuming Theorem 1.2.

Clustering function for fixed k, proving Theorem 1.2
We will now show that the clustering function $c(k; G_n)$ of the KPKVB model converges in probability to $\gamma(k)$. The key ideas are that the coupling of the Poissonized KPKVB model with the box model is guaranteed to be exact (in the sense that it also preserves edges) for all vertices up to height $R/4$; and that, when computing the expected value of the clustering function $c(k; G)$ in the subgraph of the box model induced by all vertices up to height $R/4$ using the Campbell-Mecke formula, we obtain integrals that are very similar to the expressions we found earlier for $\gamma(k)$.
We will repeatedly rely on the following observation.
Proof. We observe that (In the second line we use that $\deg$ In the third line we use that clustering coefficients lie in $[0, 1]$, that if $v$ has degree $K$ in whichever of $G, H$ it belongs to then at least one edge of $E(G) \Delta E(H)$ is incident with $v$, and that every edge in $E(G) \Delta E(H)$ only affects the status of its two incident vertices. For the fifth line we used that Proof. Let us fix some $\varepsilon > 0$ and write (We ignore rounding issues, i.e. the issue that $(1-\varepsilon)n$, $(1+\varepsilon)n$ may not be integers, to avoid notational burden; adapting the arguments below to deal with this is straightforward and left to the reader.) Observe that the vertices of $G_-$, $G_+$, $G_n$, $G_{\mathrm{Po}}$ all live on the same hyperbolic disk, of radius $R = 2\ln(n/\nu)$. We consider the standard coupling where we have an infinite supply of i.i.d. points $u_1, u_2, \dots$ chosen according to the $(\alpha, R)$-quasi uniform distribution, the vertices of $G_n = G(n; \alpha, \nu)$ are $u_1, \dots, u_n$, the vertices of $G_-$ are $u_1, \dots, u_{(1-\varepsilon)n}$, the vertices of $G_+$ are $u_1, \dots, u_{(1+\varepsilon)n}$ and the vertices of

It follows that
This holds for every fixed $\varepsilon > 0$. Sending $\varepsilon \downarrow 0$ concludes the proof of the lemma.
Next, let us recall that by the results of Gugelmann et al. on the degree sequence ([21], Theorem 2.2) we have that for every fixed $k$. In particular, $N_{G_n}(k) = \Omega(n)$ a.a.s. Combining this with Lemmas 4.1 and 4.2 we obtain: (For the second statement we use that In the remainder of this section, we denote by $G_{\mathrm{box}-}$ the subgraph of $G_{\mathrm{box}}$ induced by all vertices $(x, y) \in V_{\mathrm{box}} = \mathcal{P} \cap R$ of height at most $R/4$. Proof. We remind the reader that under the coupling of Lemma 2.1, we can view $G_{\mathrm{box}}$ and $G_{\mathrm{Po}}$ as having the same vertex set $V_{\mathrm{box}} = \mathcal{P} \cap R$; two points $p = (x, y), p' = (x', y') \in V_{\mathrm{box}}$ are joined by an edge in $G_{\mathrm{box}}$ if $|x - x'|_{\pi e^{R/2}} \leq e^{(y+y')/2}$, while $p, p'$ are joined by an edge in $G_{\mathrm{Po}}$ if either $y + y' \geq R$, or $y + y' < R$ and $|x - x'|_{\pi e^{R/2}} \leq \Phi(y, y')$ with $\Phi$ as provided by (8). It follows immediately from Lemma 2.3 that $G_{\mathrm{box}-}$ is an induced subgraph of $G_{\mathrm{Po}}$, a.a.s., as claimed.
Fix $\varepsilon > 0$, and let $X$ denote the number of points of $V_{\mathrm{box}}$ with $y$-coordinate at least $(1-\varepsilon)R$. Then $X$ is a Poisson random variable with mean the last equality holding provided $\varepsilon$ was chosen sufficiently small (using that $\alpha > 1/2$). We conclude that, a.a.s., there are no vertices of height $\geq (1-\varepsilon)R$.
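This estimate for the mean of $X$ can be illustrated with a small stdlib computation (ours, with illustrative parameter values): taking the box intensity $f(x, y) = \frac{\alpha\nu}{\pi} e^{-\alpha y}$ and $n = \nu e^{R/2}$ as in the text, the mean is $\mathbb{E}X = n\big(e^{-\alpha(1-\varepsilon)R} - e^{-\alpha R}\big) = O\big(n^{1 - 2\alpha(1-\varepsilon)}\big)$, which vanishes once $\varepsilon$ is small enough that $2\alpha(1-\varepsilon) > 1$.

```python
import math

alpha, nu, eps = 0.8, 1.0, 0.1   # illustrative values; 2*alpha*(1-eps) = 1.44 > 1
for n in (10 ** 4, 10 ** 6, 10 ** 8):
    R = 2 * math.log(n / nu)
    # mean number of points with height in [(1-eps)R, R]
    mean_X = n * (math.exp(-alpha * (1 - eps) * R) - math.exp(-alpha * R))
    assert mean_X < n ** (1 - 2 * alpha * (1 - eps))   # = n^{-0.44} here
```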
Let $Y$ denote the number of pairs of vertices $p = (x, y), p' = (x', y') \in V_{\mathrm{box}}$ with $y + y' \geq R$. Then, by the Campbell-Mecke formula, the last equality holding because $\alpha > 1/2$ and $n = \nu e^{R/2}$. In particular, by Markov's inequality, Hence also $Z = o(n)$ a.a.s. This concludes the proof, as we have now shown that under the stated coupling, a.a.s., $G_{\mathrm{box}-}$ and $G_{\mathrm{Po}}$ differ in only $o(n)$ edges.
Analogously to Corollary 4.3 we obtain: Lemma 4.6. For every fixed $k \geq 2$ we have By the Campbell-Mecke formula where $G^z_{\mathrm{box}-}$ denotes the graph we get by adding $z$ as an additional vertex to $G_{\mathrm{box}-}$, and adding edges between $z$ and the other vertices as per the connection rule (for $G_{\mathrm{box}}$). Spelling out the intensity measure $\mu$, plus symmetry considerations, gives Hence, for every fixed $y_0$ and $k$, we have that Next we remark that, analogously to the argument given in the beginning of Section 3.2, we have with $w_1 = (x_1, y_1), w_2 = (x_2, y_2)$ chosen independently from $B_\infty((0, y_0)) \cap R_-$ according to the probability measure we get by renormalizing $\mu$, i.e. with pdf $f_\mu \cdot 1_{B_\infty((0,y_0)) \cap R_-} / \mu(B_\infty((0, y_0)) \cap R_-)$. By considerations completely analogous to those following Lemma 3.1, the random variables $y_1, y_2$ both follow a truncated exponential distribution with parameter $\alpha - 1/2$, truncated at height $R/4$ (i.e. with density $1_{\{y_i \leq R/4\}} \cdot (\alpha - 1/2) e^{(1/2-\alpha)y_i} / (1 - e^{(1/2-\alpha)R/4})$), and, given the values of $y_0, y_1, y_2$, each $x_i$ is chosen uniformly on the interval $[-e^{(y_0+y_i)/2}, e^{(y_0+y_i)/2}]$. In particular with $P(y_0, y_1, y_2)$ as defined in the paragraph following Lemma 3.1. (That is, $P(y_0, y_1, y_2)$ is the probability that .) It follows that, for any fixed $y_0$, we have (applying monotone convergence to justify the convergence of the integral as $n \to \infty$). Since (expected) clustering coefficients and probabilities are between zero and one and $\alpha e^{-\alpha y_0}$ is integrable, we can now apply the dominated convergence theorem to obtain that (applying (22) for the last equality). Next, we turn attention to . Another application of Campbell-Mecke shows that with $G^{z,z'}_{\mathrm{box}-}$ denoting the graph we get by adding $z, z'$ as additional vertices to $G_{\mathrm{box}-}$. Now note that if $z = (x, y)$ and $z' = (x', y')$ satisfy $|x - x'|_{\pi e^{R/2}} > 2e^{R/4}$, then the neighbourhoods of $z, z'$ are determined by the points of the Poisson process $\mathcal{P}$ in disjoint areas of the plane.
This implies On the other hand, the LHS of (46) is always between zero and one, also when $|x - x'|_{\pi e^{R/2}} \leq 2e^{R/4}$. We may conclude that Combining this with (45), it follows that $\mathrm{Var}$ . By Chebyshev's inequality, we therefore have In combination with Corollary 4.5 (second limit) we can conclude that as desired.
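The truncated exponential density used above for $y_1, y_2$ integrates to one, as a quick stdlib computation confirms (ours, with illustrative values of $\alpha$ and $R$):

```python
import math

alpha, R = 0.8, 40.0   # illustrative values with alpha > 1/2
c = alpha - 0.5
norm = 1 - math.exp(-c * R / 4)

def density(y):
    # truncated exponential density (alpha - 1/2) e^{(1/2 - alpha) y} / norm on [0, R/4]
    return c * math.exp(-c * y) / norm

steps = 200000
h = (R / 4) / steps
total = sum(density((i + 0.5) * h) * h for i in range(steps))
assert abs(total - 1.0) < 1e-6
```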

Overall clustering coefficient, proving Theorem 1.1
Proof of Theorem 1.1. Recall that in Section 3 we defined $\pi(k) := \mathbb{P}(D = k)$, $\gamma := \mathbb{E}C$ and $\gamma(k) := \mathbb{E}(C \mid D = k)$, with $D$ the degree and $C$ the clustering coefficient of the "typical point" in the infinite limit model $G_\infty$. We can write For the KPKVB random graph, or indeed any graph, we have the similar relation By Theorem 1.2 and (44) we have, for any fixed $k \geq 2$: where Slutsky's theorem justifies the convergence in probability. On the other hand, we have where the convergence in probability can be justified using Slutsky's theorem together with the fact that $\sum_{k=0}^\infty \pi(k) = 1$ (a convenient way to convince oneself that this is true is to note that $D$, the degree of the typical point, is a.s. finite). In more detail, The result follows from (48) and (47) by sending $K \to \infty$.
5 Degrees when $k \to \infty$: proof of Theorem 1.5

Since the new contribution of Theorem 1.5 concerns the cases where the degree $k_n \to \infty$, we will assume that this holds throughout this section.

Proof overview
We start by using the Campbell-Mecke formula to compare the degree distribution in $G_{\mathrm{Po}}$ with that of the typical point in $G_\infty$. As we have already seen, the latter equals We will relate this to the Poissonized KPKVB model $G_{\mathrm{Po}}$. More precisely, let $N_{\mathrm{Po}}(k)$ denote the set of degree-$k$ vertices in $G_{\mathrm{Po}}$. We then show in Section 5.2 that $\mathbb{E}|N_{\mathrm{Po}}(k_n)| = (1 + o(1))\,n\,\pi(k_n)$ and, more generally, for any integer $r \geq 1$, in Section 5.4. The latter result requires us to analyze the joint degree distribution in $G_{\mathrm{Po}}$, which we do in Section 5.3. The above result in particular implies concentration of $N_{\mathrm{Po}}(k_n)$, from which the result on the degree distribution in $G_{\mathrm{Po}}$ follows for $k_n = o\big(n^{\frac{1}{2\alpha+1}}\big)$. When $k_n = (1 + o(1))\,c\,n^{\frac{1}{2\alpha+1}}$, we use the above result to show that the number of degree-$k_n$ nodes in $G_{\mathrm{Po}}$ converges to a Poisson distribution.
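The degree of the typical point is mixed Poisson; assuming, as in Section 3, that $\rho(y, k) = \mathbb{P}(\mathrm{Po}(\mu(y)) = k)$ with $\mu(y) = \xi e^{y/2}$ and $\xi = \frac{4\alpha\nu}{(2\alpha-1)\pi}$, one has $\pi(k) = \int_0^\infty \rho(y, k)\,\alpha e^{-\alpha y}\,dy$, whose tail follows the power law $k^{-(2\alpha+1)}$. A small stdlib sketch (ours; parameter values illustrative) checks this numerically by comparing $\pi(2k)/\pi(k)$ to $2^{-(2\alpha+1)}$:

```python
import math

alpha, nu = 0.8, 1.0
xi = 4 * alpha * nu / ((2 * alpha - 1) * math.pi)

def poisson_pmf(mu, k):
    # computed on the log scale to avoid overflow for large mu
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def pi_k(k, y_max=30.0, steps=30000):
    # midpoint quadrature of the mixed-Poisson integral for pi(k)
    h = y_max / steps
    return sum(poisson_pmf(xi * math.exp(y / 2), k) * alpha * math.exp(-alpha * y) * h
               for y in ((i + 0.5) * h for i in range(steps)))

ratio = pi_k(40) / pi_k(20)
assert abs(ratio / 2 ** (-(2 * alpha + 1)) - 1) < 0.1   # power-law exponent 2*alpha + 1
```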
To extend these results to G n we couple the construction of the KPKVB model to that of the Poissonized version G Po in Section 5.5 to show that a.a.s, N n (k n ) = (1 + o (1))N Po (k n ). With these results we then prove Theorem 1.5 in Section 5.6.
We will also establish all of the above-mentioned results for the finite box model $G_{\mathrm{box}}$, since the proofs require only small alterations and we will need these results later on when analyzing the clustering coefficient and clustering function.

Expected degrees in G box and G Po
We proceed with the expected degrees in the finite box and Poissonized KPKVB models. Recall the definition of the neighbourhood balls $B_{\mathrm{box}}(y)$ and $B(y)$ of a point $(0, y)$ in, respectively, $G_{\mathrm{box}}$ and $G_{\mathrm{Po}}$. We introduce the shorthand notation $\mu_{\mathrm{Po}}(y) := \mu(B(y))$ and $\mu_{\mathrm{box}}(y) := \mu(B_{\mathrm{box}}(y))$.
Our first results relate these measures to the measure $\mu(y)$ of the ball $B_\infty(y)$ in the infinite model $G_\infty$.
Proof. Recall that when $y_1 \geq R - y$ then $p \in B(y)$, while for $y_1 < R - y$ this is true when (8) holds. We thus split the integral for $\mu_{\mathrm{Po},n}(y)$ accordingly, into two integrals $I_1$ and $I_2$. Firstly, we will show that the second integral satisfies $I_2 = o(\mu(y))$, and then we will show that For the second integral $I_2$, we compute To see that $n^{1-2\alpha}\nu^{2\alpha}(e^{\alpha y} - 1) = o(\mu(y))$, recall that $\mu(y) = \xi e^{y/2}$. So we need to show that or, equivalently, that For this, note that As $y \leq (1-\varepsilon)R = (1-\varepsilon)\,2\ln\frac{n}{\nu}$ and $\alpha > \frac{1}{2}$, we have where the convergence is uniform in $y$, $0 \leq y \leq (1-\varepsilon)R$, as the last upper bound does not depend on $y$.
For the first integral $I_1$, we first recall from Lemma 2.2 that there is a positive constant $K$ such that for any $\varepsilon > 0$ and all $y_1$, We thus define the main term and the error term of $I_1$ as integrals against $\frac{\alpha\nu}{\pi} e^{-\alpha y_1}\,dy_1$: From the error bounds for $\Phi$ given in Lemma 2.2, it follows that We will first show that $I_{1,\mathrm{main}} = (1 + o(1))\mu(y)$ and then that $I_{1,\mathrm{error}} = o(\mu(y))$.
Proof. First note that since we have identified the boundaries of $[-\frac{\pi}{2}e^{R/2}, \frac{\pi}{2}e^{R/2}]$, we can assume, without loss of generality, that $p = (0, y)$. We then have that the boundaries of $B_{\mathrm{box}}(p)$ are given by the equations $x = \pm e^{(y+y')/2}$, which intersect the left and right boundaries of the box $R$. Therefore, if $y \leq 2\log(\pi/2)$ this intersection occurs above the height $R$ of the box $R$, while in the other case the full region of the box above $h(y)$ is connected to $p$. We first consider the case $y \leq 2\log(\pi/2)$. Here we have where the error term is $o(1)$, uniformly in $y$. Now let $y > 2\log(\pi/2)$ and recall that $\mu(y) = \xi e^{y/2}$, where $\xi = \frac{4\alpha\nu}{(2\alpha-1)\pi}$. Then, after some simple algebra, we have that Since $R - y \geq \varepsilon R$, we have that $|\varphi_n(y)|$ is uniformly bounded by $O\big(e^{-(\alpha-\frac{1}{2})\varepsilon R}\big)$, which is $o(1)$ for $\alpha > \frac{1}{2}$.
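The identity $\mu(y) = \xi e^{y/2}$ recalled above can be checked directly (our own stdlib illustration, assuming the infinite-model intensity $\frac{\alpha\nu}{\pi} e^{-\alpha y'}$ used throughout the paper): integrating this intensity over the ball $B_\infty((0, y)) = \{(x', y') : |x'| \leq e^{(y+y')/2},\, y' \geq 0\}$ reproduces $\xi e^{y/2}$ with $\xi = \frac{4\alpha\nu}{(2\alpha-1)\pi}$.

```python
import math

alpha, nu, y = 0.8, 1.0, 3.0   # illustrative values with alpha > 1/2
xi = 4 * alpha * nu / ((2 * alpha - 1) * math.pi)

# mu(y) = int_0^infty (width of the ball at height y') * intensity dy'
#       = int_0^infty 2 e^{(y + y')/2} * (alpha * nu / pi) e^{-alpha y'} dy'
steps, Y = 200000, 100.0
h = Y / steps
integral = sum(2 * math.exp((y + yp) / 2) * (alpha * nu / math.pi) * math.exp(-alpha * yp) * h
               for yp in ((i + 0.5) * h for i in range(steps)))
assert abs(integral - xi * math.exp(y / 2)) < 1e-4
```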
We can now use a concentration of heights argument to show that the integral of the Poisson probabilities $\mathbb{P}(\mathrm{Po}(\mu_{\mathrm{Po}}(y)) = k_n)$ over $0 \leq y \leq (1-\varepsilon)R$ is asymptotically equivalent to $\pi(k_n)$, and the same holds if we instead consider $\mu_{\mathrm{box}}(y)$. The proof contains some technical elements, which are deferred to the Appendix so as not to hinder the flow of the argument.
Next we need a similar result regarding the derivative of µ Po (y), i.e.
This result is given by Lemma F.1 in the Appendix; the lemma is placed there since its proof is a straightforward though cumbersome piece of calculus, and we do not want to break the flow of the argument. We now have that Applying integration by substitution to the integral, i.e. using the new variable $z = z(y)$, we obtain Note that since the function $y \mapsto 2\ln(y/\xi)$ is monotone increasing, it follows that for large enough $n$, . Therefore, by a concentration of heights argument (Proposition 2.4), it follows that which finishes the proof for $\mu_{\mathrm{Po}}(y)$.
The proof for $\mu_{\mathrm{box}}(y)$ follows similar arguments. First, we define $z(y) = 2\log(\mu_{\mathrm{box}}(y)/\xi)$ and use Lemma 5.2 instead of Lemma 5.1 to establish that $e^{-\alpha y} = (1 + o(1)) e^{-\alpha z(y)}$. For the derivative $z'(y)$, we recall from the proof of Lemma 5.2 that $\mu_{\mathrm{box}}(y) = (1 + \varphi_n(y))\mu(y)$ with $\varphi_n(y)$ given by (50). The derivative of $\varphi_n(y)$ can be uniformly bounded by We can now apply the same change of variables and a concentration of heights argument to arrive at the required statement.
The main result of this section now follows almost immediately. Then, as $n \to \infty$, Proof. We shall consider $G_{\mathrm{Po}}$; the proof of the statements for $G_{\mathrm{box}}$ follows by the same arguments, and we omit it here. Let $D_{\mathrm{Po}}(p)$ denote the degree of a node $p \in G_{\mathrm{Po}}$. Then, since where $0 < \varepsilon < 1$ is a constant to be chosen later. Note that . The first statement of the lemma now follows from Lemma 5.3.
When $k_n \gg n^{\frac{1}{2\alpha+1}}$, Lemma 5.3 implies that On the other hand, which is $o(1)$ since, by our choice of $\varepsilon$, $2\alpha(1-\varepsilon) > 1$. Thus the second claim of the lemma follows.

Joint degrees in G box and G Po
To prove the factorization of the higher moments of $N_{\mathrm{Po}}(k_n)$ and $N_{\mathrm{box}}(k_n)$, as in (49), we first have to understand the joint degree distribution in $G_{\mathrm{Po}}$ and $G_{\mathrm{box}}$, respectively. This in turn requires us to analyze the joint neighbourhoods of two points $p, p'$ in these models. To explain the proof strategy we will use the finite box model, since the formulas there are slightly easier; the results for the Poissonized KPKVB model $G_{\mathrm{Po}}$ follow the same idea. For $r \in \mathbb{N}$ and $p_1, \dots, p_r \in R$, we write $G_{\mathrm{box}} \cup \{p_1, \dots, p_r\}$ for the finite box model obtained by adding $p_1, \dots, p_r$ to the vertex set of the graph and adding all corresponding edges according to the connection rule. Then we define, for any positive integer $s$ and $V \subset \{p_1, \dots, p_s\}$, In particular, for two points $p, p' \in R$ and $V = \{p, p'\}$, $\varphi_{\mathrm{box}}(V, k; p, p')$ is the joint degree distribution of $p, p'$ in $G_{\mathrm{box}}$. We will use similar notation for the Poissonized KPKVB model: $G_{\mathrm{Po}} \cup \{p_1, \dots, p_r\}$ denotes the Poissonized KPKVB model obtained by adding $p_1, \dots, p_r$ to the vertex set of the graph and adding all corresponding edges, and $\varphi_{\mathrm{Po}}(V, k; p_1, \dots, p_s)$ denotes the corresponding joint degree function.
If we define then each of these is an independent Poisson random variable, while Recall the definition of $y^\pm_{k_n,C}$ from equation (15). We will show (see Lemma 5.7) that for any two points $p, p'$ whose $y$-coordinates are in $K_C(k_n)$ and whose $x$-coordinates are sufficiently separated, it holds that $\mu(B_{\mathrm{box}}(p) \cap B_{\mathrm{box}}(p')) = O\big(k_n^{1-\varepsilon}\big)$. Since the means of $X_1$ and $X_2$ for two such points are of order $k_n$, the contribution of the Poisson random variable $Y(p, p')$ to their degrees becomes negligible as $k_n \to \infty$, and hence the joint degree distribution factorizes on this set. The main idea is that if $p$ and $p'$ are sufficiently separated in the $x$-direction, then the overlap of their neighbourhoods $B_{\mathrm{box}}(p) \cap B_{\mathrm{box}}(p')$ is of smaller order than $\mu(B_{\mathrm{box}}(p)) + \mu(B_{\mathrm{box}}(p'))$. We now proceed with analyzing these joint neighbourhoods.
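This factorization heuristic can be illustrated with a toy stdlib computation (ours, with illustrative parameters mirroring the decomposition in the text): if $D_1 = X_1 + Y$ and $D_2 = X_2 + Y$ with $X_1, X_2, Y$ independent Poisson and $\mathbb{E}Y$ much smaller than $\mathbb{E}X_i$ (a negligible shared neighbourhood), then the joint degree distribution nearly factorizes.

```python
import math

def pois(lam, k):
    # Poisson pmf on the log scale
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1)) if lam > 0 else float(k == 0)

lam_x, lam_y, k = 50.0, 0.1, 50   # shared part Y has mean 0.1 << 50

# exact joint probability P(D_1 = k, D_2 = k), conditioning on the shared Y
joint = sum(pois(lam_y, y) * pois(lam_x, k - y) ** 2 for y in range(k + 1))
# factorized approximation: D_i ~ Po(lam_x + lam_y) independently
product = pois(lam_x + lam_y, k) ** 2
assert abs(joint / product - 1) < 0.05
```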
Let p, p ∈ R and denote by N box (p, p ) the number of common neighbours of p and p in G box ∪ {p, p }. We shall establish an upper bound on the expected number of joint neighbours when p and p are sufficiently separated.
We start by analyzing the shape of the joint neighbourhood. Due to symmetry and the fact that we have identified the left and right boundaries of the box $R$, we can, without loss of generality, assume that $p = (0, y)$ and $p' = (x', y')$ with $x' > 0$. To understand the computation it is helpful to have a picture; Figure 5 shows such an example. There are several different quantities that are important. The first are the heights at which the left and right boundaries of the ball $B_{\mathrm{box}}(p)$ hit the boundaries of the box $R$. Since $x = 0$ these heights are the same, and we denote their common value by $h(y)$. We also need to know the coordinates $\hat{y}_{\mathrm{right}}(p, p')$ and $\hat{x}_{\mathrm{right}}(p, p')$ of the intersection of the right boundary of the neighbourhood of $p$ with the left boundary of the neighbourhood of $p'$, and those of the intersection of the left boundary of the neighbourhood of $p$ with the right boundary of the neighbourhood of $p'$, which we denote by $\hat{y}_{\mathrm{left}}(p, p')$ and $\hat{x}_{\mathrm{left}}(p, p')$. Finally, we denote by $d(p, p')$ the distance between the lower right boundary of $B_{\mathrm{box}}(p)$ and the lower left boundary of $B_{\mathrm{box}}(p')$, which is positive only when the bottom parts of the two neighbourhoods do not intersect, as is the case in Figure 5. The condition $d(p, p') > 0$ is exactly the right notion of $p$ and $p'$ being sufficiently separated.
We start by deriving expressions for these important coordinates, for which we introduce some notation. For any $p = (x, y) \in R$ we define the left and right boundary functions as, respectively, Note that these functions describe the boundaries of the ball $B_{\mathrm{box}}(p)$. In particular, Plugging this into either the left- or right-hand side of the above equation yields the $y$-coordinate $\hat{y}_{\mathrm{left}}(p, p') = 2\log\big(\frac{\pi e^{R/2} - x'}{e^{y/2} + e^{y'/2}}\big)$. The expressions for $\hat{x}_{\mathrm{right}}(p, p')$ and $\hat{y}_{\mathrm{right}}(p, p')$ are derived in a similar way. The expression for $d(p, p')$ follows as the difference . The full expressions of all coordinates are given below for further reference.
The following result shows that if d(p, p ) > 0, then the expected number of common neighbours is o (µ (B box (p)) + µ (B box (p ))).
Proof. Again, without loss of generality we assume that $p = p_0 = (0, y)$ and $p' = (x', y')$ with $0 \leq x' \leq \frac{\pi}{2}e^{R/2}$. Note that since $0 < x' \leq \frac{\pi}{2}e^{R/2}$, we have $\hat{y}_{\mathrm{right}}(p, p') \leq \hat{y}_{\mathrm{left}}(p, p')$. We write $\hat{y}$ for $\hat{y}_{\mathrm{right}}(p, p')$ and observe that below $\hat{y}$ the balls $B_{\mathrm{box}}(p)$ and $B_{\mathrm{box}}(p')$ are disjoint. Therefore, if we define $A :=$ We proceed by computing the right-hand side The result follows by plugging in and noting that $x'$ is the same as $|x - x'|$, by our reduction at the start of the proof.
We can also prove a similar result for the Poissonized KPKVB model G Po , denoting by N Po (p, p ) the number of joint neighbours in G Po ∪ {p, p }.
Proof. We proceed in a similar fashion as for Lemma 5.5. That is, we bound the expected number of common neighbours by the number of neighbours of $p$ whose $y$-coordinate is above the intersection of the right boundary of $B(p)$ and the left boundary of $B(p')$. Denote by $\hat{y}$ the height of this intersection point. Then The second integral is bounded by $\frac{\nu}{\xi}\mu(B_{\mathrm{box}}(y))\,e^{-(\alpha-\frac{1}{2})(R-y)}$. We bound the first integral using Lemma 2.2 as where we used that $\frac{3y_1}{2} \leq R - y + \frac{y_1}{2}$ for all $y_1 \leq R - y$ in the second line. It remains to compute $\hat{y}$, for which we will establish the following bound To show (61), we note that for any point $y_1 \geq \hat{y}$, the corresponding $x$-coordinate of the left boundary of $B(p')$ must be to the left of that of the ball $B(p)$, i.e. $x' - \Phi(y', y_1) \leq \Phi(y, y_1)$. Therefore it is enough to show that for all it holds that $\Phi(y, y_1) \leq x' - \Phi(y', y_1)$, with $\lambda$ as defined in the statement of the lemma. Note that by the assumption on $x' := |x - x'|_n$ (since we can take $x = 0$), the above upper bound is non-negative. Using Lemma 2.2, it suffices to prove that for all such $y_1$, Plugging the upper bound for $y_1$ into the left-hand side and using that $(e^{y/2} + e^{y'/2})^3 \geq e^{3y/2} + e^{3y'/2}$, we obtain where we also used that $x' \leq \frac{\pi}{2}e^{R/2}$. Let us now define the strip $S_{k_n,C} = R \cap (\mathbb{R}_+ \times K_C(k_n))$.
and in addition define, for any $0 < \varepsilon < 1$, the following set where $|x|_n = \min\{|x|, \pi e^{R/2} - |x|\}$ denotes the norm on the finite box $R$ in which the left and right boundaries are identified. Then, for any two points $p, p' \in E_\varepsilon(k_n)$, the expected number of joint neighbours is $o(k_n)$.
It is clear that, using Lemma 5.6 instead of Lemma 5.5, the above proof applies to the Poissonized KPKVB model, yielding the following result. Proof. Let $H = G_{\mathrm{Po}} \cup \{p_1, \dots, p_s\}$ or $H = G_{\mathrm{box}} \cup \{p_1, \dots, p_s\}$, and let $1 \leq r \leq s$. For $1 \leq j \leq r$, let $Y_j$ be the number of vertices of $H$ which are adjacent to both $p_j$ and $p_{r+1}$, and let $X_j$ be the number of vertices of $H$ which are adjacent to $p_j$ but not to $p_{r+1}$. Then $X_j + Y_j = D_H(p_j)$ is the degree of $p_j$ in $H$.
Now let $X_{r+1}$ be the number of vertices of $H$ which are adjacent to $p_{r+1}$ but to none of $p_1, \dots, p_r$, and let $Y_{r+1}$ be the number of vertices of $H$ which are adjacent to $p_{r+1}$ and to at least one of $p_1, \dots, p_r$. Then $X_{r+1} + Y_{r+1} = D_H(p_{r+1})$ is the degree of $p_{r+1}$ in $H$. By definition, we therefore have and the claim of the lemma is that To prove (64), let us replace $\varepsilon$ by $\min(\varepsilon, \varepsilon(2\alpha - 1)) \in (0, 1)$. Since for $1 \leq i \leq r$ it is given that The rest of the proof is independent of which of the two models we consider and only uses that By equation (14), we have As by definition $c_1$ satisfies Beginning with the left-hand side of the claim of the lemma, the law of total probability applied to the events $\{Y_{r+1} = y_{r+1}\}$, for all $y_{r+1} \in A_n$, and $S_n^c$ implies that As $X_{r+1}$ is independent of $X_1, \dots, X_r, Y_1, \dots, Y_r$ by the properties of a Poisson process ($X_{r+1}$ counts the number of points in a set disjoint from those counted by $X_1, \dots, X_r, Y_1, \dots, Y_r$), it follows that We will now show that, uniformly for all $y_{r+1}, s \in A_n$, it holds that To see this, observe that for all $y_{r+1}, s \in A_n$ we have $|y_{r+1} - s| \leq 2c\,k_n^{1-\varepsilon}\ln k_n^{1-\varepsilon}$. Denote the expectation of $X_{r+1}$ by $\lambda$, write $\delta_n = k_n - y_{r+1} - \lambda$ and note that We will now use that $(a+b)!$
, applied to $a = k_n - y_{r+1}$ and $b = y_{r+1} - s$. To see this auxiliary fact, note that by Stirling's approximation to the factorial (see e.g. [14], [31]) it follows that . This finishes the proof of the auxiliary fact, and we can continue with where the last line follows since $\delta_n, |y_{r+1} - s| \leq 2c_0\,k_n^{1-\varepsilon}\ln k_n^{1-\varepsilon}$ and $\lambda = \Theta(k_n)$, and therefore, with convergence uniform in $y_{r+1}$ and $s$, From (65) it then follows that Note that $\mathbb{P}(X_{r+1} + Y_{r+1} = k_n)$ no longer depends on $y_{r+1}$, and neither does the $O(k_n^{-C})$ error term. Therefore we have For the last summation we have Finally, plugging this into the previous step gives which establishes (64) and thus the claim of the lemma.
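Stirling's approximation invoked above admits a quick stdlib check (ours, with illustrative values of $n$): $\ln n! = n\ln n - n + \frac{1}{2}\ln(2\pi n) + O(1/n)$, where the first neglected term is $\frac{1}{12n}$.

```python
import math

for n in (10, 100, 1000):
    # Stirling's approximation to ln n!
    stirling = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
    # the leading correction term is 1/(12 n), so the gap is below 1/(10 n)
    assert abs(math.lgamma(n + 1) - stirling) < 1 / (10 * n)
```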

Factorial moments of degrees
Now that we have analyzed the joint neighbourhoods and degree distributions in both the Poissonized KPKVB and the finite box model, we can show convergence of the factorial moments of the number of nodes of degree $k_n$ in both models. Suppose that $k_n \to \infty$. Then, for any positive integer $r$, it holds that The proof of this result requires the following technical lemma, which states that the integration of the joint degree distribution can be factorized. Suppose that $k_n \to \infty$ and let $\varphi$ be either $\varphi_{\mathrm{box}}$ or $\varphi_{\mathrm{Po}}$. Then we have that $\int_R \cdots \int_R \varphi(\{p_1, \dots, p_r\}, k_n; p_1, \dots, p_r)\,d\mu(p_1)\cdots d\mu(p_r)$ Proof. Let $C > r(2\alpha + 1)$ and define the set $A = (R \times \cdots \times R) \setminus (S_{k_n,C} \times \cdots \times S_{k_n,C})$. We will first show that the contribution of the integration of $\varphi$ over this range is negligible.
By applying the base case of the induction to the first factor and the induction hypothesis to the second one, we have derived that Finally, for $E$ we observe that Recall that, again by (66), which implies that For $C > r(2\alpha + 1)$ we can conclude that indeed We now prove the result for the factorial moments.
Proof of Lemma 5.10. We give the proof for the Poissonized KPKVB model. The proof for the finite box model G box follows using similar arguments.
First of all, we observe that This can be seen by induction on r. For r = 1, the claim is clear. Assuming it holds for r ≥ 1, by the induction hypothesis,

Now, we can write
The first sum leads to the right-hand side of the claim for $r + 1$, whereas the second sum, over $p_{r+1} \in \{p_1, \dots, p_r\}$, cancels with the $-r$.
By the Campbell-Mecke formula where we integrate over r additional points which we can think of as being added independently and with the same distribution as the vertices of the Poissonized KPKVB model G Po in the upper half-plane coordinates.
With $r = 1$, it follows that which yields that the right-hand side of the claim of the lemma can be rewritten as Using Lemma 5.11, we conclude that $\int_R \cdots \int_R \varphi(\{p_1, \dots, p_r\}, k_n; p_1, \dots, p_r)\,d\mu(p_1)\cdots d\mu(p_r)$

Coupling G n to G Po
In the previous sections we established results for the degrees and the factorial moments of the number of degree-$k_n$ nodes in the Poissonized KPKVB and finite box models. Our intended result, however, concerns the degree distribution in the original KPKVB model. In order to extend the result for the Poissonized KPKVB model to the original model, we use a coupling argument to show that the expected difference between the numbers of degree-$k_n$ nodes is negligible.
Lemma 5.12. As $n \to \infty$, it holds that for $0 \leq k_n \leq n - 1$, Proof. We couple both models by taking an infinite supply of i.i.d. points $u_1, u_2, \dots$ chosen according to the $(\alpha, R)$-quasi uniform distribution, and letting the vertices of $G(n; \alpha, \nu)$ be $u_1, \dots, u_n$ and the vertices of $G_{\mathrm{Po}}(n; \alpha, \nu)$ be $u_1, \dots, u_N$, with $N \stackrel{d}{=} \mathrm{Po}(n)$ independent of $u_1, u_2, \dots$. Thus, under this coupling, the only difference between $G_n = G(n; \alpha, \nu)$ and $G_{\mathrm{Po}} = G_{\mathrm{Po}}(n; \alpha, \nu)$ is the number of points. Since $N$ is Poisson with mean $n$, it follows from the Chernoff bound (see also equation (135)) that we may assume $n - C\sqrt{n\log n} \leq N \leq n + C\sqrt{n\log n}$. To keep notation simple we suppress this conditioning in the derivations below.
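The concentration step can be illustrated numerically (our own stdlib sketch; the values of $n$ and $C$ are illustrative): using the standard Chernoff bound for the Poisson upper tail, $\mathbb{P}(N \geq n + t) \leq \exp\big(-\frac{t^2}{2(n+t)}\big)$, the deviation probability at $t = C\sqrt{n\log n}$ is polynomially small in $n$.

```python
import math

n, C = 10 ** 6, 3.0   # illustrative values
t = C * math.sqrt(n * math.log(n))
# Chernoff-type bound for the upper tail of N ~ Po(n)
upper_tail_bound = math.exp(-t ** 2 / (2 * (n + t)))
# the bound is much smaller than any fixed power such as n^{-4}
assert upper_tail_bound < n ** (-4)
```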
Clearly, if $N = n$ the graphs are the same, so we consider the two cases $n - C\sqrt{n \log n} \le N < n$ and $n < N \le n + C\sqrt{n \log n}$. We will prove the latter case; the former uses similar arguments, and we omit the details.
If $n < N \le n + C\sqrt{n \log n}$, then $G_n$ has fewer vertices than $G_{\mathrm{Po}}$. Write $V_n(k_n)$ and $V_{\mathrm{Po}}(k_n)$ to denote the sets of vertices that have degree $k_n$ in $G_n$ and $G_{\mathrm{Po}}$, respectively. Then, since the vertices $u_{n+1}, \ldots, u_N$ are not present in $G_n$, it remains to consider the first summation.
Let $D_n(u)$ and $D_{\mathrm{Po}}(u)$ denote the degree of a point $u$ in $G_n$ and $G_{\mathrm{Po}}$, respectively. Then there are two scenarios to consider: 1) $D_n(u_i) = k_n$ and $D_{\mathrm{Po}}(u_i) \neq k_n$, or 2) $D_n(u_i) \neq k_n$ and $D_{\mathrm{Po}}(u_i) = k_n$. In the first case, since $u_i$ is present in both graphs, it follows that $D_{\mathrm{Po}}(u_i) > k_n$. Similarly, in the second case it must hold that $D_n(u_i) < k_n$. Hence we have

Let us first consider the second summation, i.e. the case where the node has degree smaller than $k_n$ in $G_n$. Taking the expectation gives $n \mathbb{P}(D_n < k_n, D_{\mathrm{Po}} = k_n)$, where $D_n$ denotes the degree in the KPKVB model of a point $u$ placed according to the $(\alpha, R)$-quasi uniform distribution. We now observe that, because the points $u_1, \ldots, u_N$ used to couple the graphs are independent, we can view the graph $G_n$ as being obtained from $G_{\mathrm{Po}}$ by removing $N - n$ points uniformly at random. Therefore, if a point has degree $k_n$ in $G_{\mathrm{Po}}$ but smaller degree in $G_n$, then at least one of its neighbors was removed. Denote by $Z(n)$ a random variable with a hypergeometric distribution, for taking $N - n$ draws from a population of size $N$ containing $k_n$ good objects. That is, $Z(n)$ denotes the number of removed neighbors of a node $u$ with degree $k_n$ in $G_{\mathrm{Po}}$. We then have

We now proceed with the other summation, for the case where a vertex has degree $k_n$ in $G_n$ but larger degree in $G_{\mathrm{Po}}$. Since the degree of $u$ in $G_{\mathrm{Po}}$ can be at most $N - n$ larger, we have

Using again that the graph $G_n$ can be seen as being obtained from $G_{\mathrm{Po}}$ by removing $N - n$ points uniformly at random, a point with degree $k_n + t$ in $G_{\mathrm{Po}}$ can only have degree $k_n$ in $G_n$ if exactly $t$ of its neighbors were removed. Let us therefore denote by $Z(n, t)$ a random variable with a hypergeometric distribution, for taking $N - n$ draws from a population of size $N$ containing $k_n + t$ good objects. Then $\mathbb{P}(D_n(u) = k_n, D_{\mathrm{Po}}(u) = k_n + t) = \mathbb{P}(Z(n, t) = t)\, \mathbb{P}(D_{\mathrm{Po}} = k_n + t)$.

In addition we have that

We thus obtain

We will show that both summations are $o(\pi(k_n))$. For the first summation, we recall that the probability factor is $o(\pi(k_n))$; hence, since $\pi(k_n + t) \le \pi(k_n)$, the first summation is $o(\pi(k_n))$ as well. For the other summation we use the above, together with a corresponding bound that holds for $\varepsilon$ small enough.

It now follows that
which finishes the proof for the case where N > n.

Proof of Theorem 1.5
We now have all necessary ingredients to prove the main result on the degrees, Theorem 1.5.
Proof of Theorem 1.5.
Recall that we shall only give the proof for the case where $k_n \to \infty$, since result (i) for fixed $k = O(1)$ follows from [21].
(i) First we recall that the statements regarding $\pi(k)$ and its asymptotic behavior follow from equations (18) and (19).

(ii) Recall that $\mathbb{E}[N_{\mathrm{Po}}(k_n)] = (1 + o(1))\, n\pi(k_n)$.
Using Lemma 5.10 with $r = 2$ we have that $\mathbb{E}\big[(N_{\mathrm{Po}}(k_n))_2\big] = (1 + o(1))\,\mathbb{E}[N_{\mathrm{Po}}(k_n)]^2$. Hence, by Chebyshev's inequality, for any $\varepsilon > 0$ the concentration bound holds and, by Lemma 5.12, it also holds in the original KPKVB model that $N_n(k_n) \xrightarrow{d} \mathrm{Po}(\zeta)$.
(iii) We will show that in this case $\mathbb{E}[N_n(k_n)] = o(1)$; this then implies the claim by Markov's inequality. First we observe that, as the Poissonized KPKVB model $G_{\mathrm{Po}}$ has the same intensity measure as the original KPKVB model with a fixed number $n$ of points, the expected degree of a vertex of the KPKVB model with radial coordinate $r = R - y$ is given by $\mu_{\mathrm{Po}}(y)$, and hence
$$\mathbb{E}[N_n(k_n)] = n \int_0^R \mathbb{P}\big(\mathrm{Bin}(n - 1, \mu_{\mathrm{Po}}(y)/n) = k_n\big)\, \frac{\alpha \sinh(\alpha(R - y))}{\cosh(\alpha R) - 1}\, dy.$$
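This representation can be evaluated numerically as a sanity check. The sketch below is illustrative only: it uses a placeholder profile $\mu(y) = \xi e^{y/2}$ in place of $\mu_{\mathrm{Po}}(y)$ (the constant $\xi$ and the exact form are assumptions, not taken from the paper) and the trapezoidal rule.

```python
import math

def expected_count_degree_k(n, alpha, nu, k, xi=1.0, steps=20000):
    """Numerically evaluate
        E[N_n(k)] = n * int_0^R P(Bin(n-1, mu(y)/n) = k)
                        * alpha*sinh(alpha*(R-y)) / (cosh(alpha*R)-1) dy
    by the trapezoidal rule.  mu(y) = xi*e^{y/2} is a stand-in for mu_Po(y)."""
    R = 2.0 * math.log(n / nu)
    norm = math.cosh(alpha * R) - 1.0

    def integrand(y):
        p = min(xi * math.exp(y / 2.0) / n, 1.0 - 1e-12)  # cap so p is a probability
        # log of the Bin(n-1, p) pmf at k, via lgamma for numerical stability
        log_pmf = (math.lgamma(n) - math.lgamma(k + 1) - math.lgamma(n - k)
                   + k * math.log(p) + (n - 1 - k) * math.log1p(-p))
        return math.exp(log_pmf) * alpha * math.sinh(alpha * (R - y)) / norm

    h = R / steps
    acc = 0.5 * (integrand(0.0) + integrand(R))
    acc += sum(integrand(i * h) for i in range(1, steps))
    return n * h * acc

print(expected_count_degree_k(500, 0.8, 1.0, 3))
```

Since the radial density integrates to $1$ and the binomial probability is at most $1$, the output is always between $0$ and $n$.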
Fix $0 < \varepsilon < \frac{4\alpha - 1}{4\alpha + 2} \wedge \frac{2\alpha - 1}{2\alpha}$. We first show that we only need to consider the integration up to $y \le (1 - \varepsilon)R$. By our choice of $\varepsilon$, $2\alpha(1 - \varepsilon) > 1$, so that the tail of the integral is negligible, and thus it is enough to bound the integral over $[0, (1 - \varepsilon)R]$. Hence, by bounding the binomial probability (see Lemma D.3), we shall now consider two cases: $n^{\frac{1}{2\alpha + 1}} \le k_n < n^{1 - \varepsilon}$ and $n^{1 - \varepsilon} \le k_n \le n - 1$. For $k_n \ge n^{1 - \varepsilon}$ we have, by our choice of $\varepsilon$, that $\frac{3}{2} - (2\alpha + 1)(1 - \varepsilon) < 0$, and the claim follows.

6 Clustering when $k \to \infty$: overview of the proof strategy

The proof of Theorem 1.3 follows the same strategy as outlined in Section 2 and executed in Section 4. However, the fact that $k = k_n \to \infty$ as $n \to \infty$ introduces significant technical challenges, especially for $k_n$ close to the maximum scale $n^{\frac{1}{2\alpha + 1}}$. For example, the coupling between $G_{\mathrm{Po}}$ and $G_{\mathrm{box}}$ we use becomes less exact, so that we can no longer use Lemma 2.3 to conclude that triangle counts in $G_{\mathrm{Po}}$ and $G_{\mathrm{box}}$ are asymptotically equivalent. Moreover, since we are ultimately interested in recovering the scaling of $c(k_n; G_n)$, which Theorem 1.3 claims is $\gamma(k_n)$, we need to show that each step in the strategy outlined in Section 2 only introduces error terms of smaller order, i.e. that are $o(\gamma(k_n))$. This will turn out to require a great deal of care in bounding all error terms we encounter.
In this section we explain the challenges associated with each step and give a detailed overview of the structure of the proof of Theorem 1.3, using intermediate results for each of the steps. We first define the scaling function $s(k)$ so that $\gamma(k) = \Theta(s(k))$ as $k \to \infty$. We end this section with the proof of Theorem 1.3, based on the intermediate results.
Remark 6.1 (Diverging $k_n$). Throughout the remainder of this paper, unless stated otherwise, $(k_n)_{n \ge 1}$ will always denote a sequence of non-negative integers satisfying $k_n \to \infty$ and $k_n = o\big(n^{\frac{1}{2\alpha + 1}}\big)$ as $n \to \infty$.
We start by introducing a slightly modified version of the local clustering function, which will be convenient for computations later. Notice that the only difference between $c(k; G)$ and $c^*(k; G)$ is that we replace $N(k)$ by its expectation $\mathbb{E}[N(k)]$. The advantage is that now the only randomness is in the formation of triangles. In addition, note that since $\mathbb{E}[N(k)] > 0$, a case distinction for $N(k)$ is no longer needed for $c^*(k; G)$; it is, however, still relevant since we are eventually interested in $c(k; G)$. Following the notational convention, throughout the remainder of this paper we write $c^*(k; G_{\mathrm{Po}})$ and $c^*(k; G_{\mathrm{box}})$ to denote the modified local clustering function in $G_{\mathrm{Po}}$ and $G_{\mathrm{box}}(n; \alpha, \nu)$, respectively. Figure 6 shows a schematic overview of the proof of Theorem 1.3, based on the different propositions described below, together with the sections in which these propositions are proved. Observe that the order in which the intermediate results are proved is reversed with respect to the natural order of reasoning. This does not create any circular logic, since each intermediate result is independent of the others. We choose this order because results proved in the later stages help to deal with error terms coming up in proofs at earlier stages, and hence help streamline those proofs. Below we briefly describe each of the intermediate steps leading up to the proof of Theorem 1.3.
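The difference between $c(k; G)$ and $c^*(k; G)$ can be made concrete on a toy graph. In this minimal sketch the graph is a plain dict of adjacency sets, and $\mathbb{E}[N(k)]$ is passed in as a number, since in the paper it is a model quantity rather than something computable from a single sample.

```python
from itertools import combinations

def local_clustering(adj, v):
    """Fraction of pairs of neighbours of v that are themselves adjacent."""
    nb = adj[v]
    if len(nb) < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
    return links / (len(nb) * (len(nb) - 1) / 2)

def c_k(adj, k):
    """c(k; G): average local clustering over the vertices of degree exactly k."""
    vs = [v for v in adj if len(adj[v]) == k]
    if not vs:
        return 0.0
    return sum(local_clustering(adj, v) for v in vs) / len(vs)

def c_star_k(adj, k, expected_Nk):
    """c*(k; G): the same sum, but normalised by E[N(k)] instead of N(k)."""
    vs = [v for v in adj if len(adj[v]) == k]
    return sum(local_clustering(adj, v) for v in vs) / expected_Nk

# Toy graph: a triangle 0-1-2 plus a pendant vertex 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(c_k(adj, 2))  # vertices 1 and 2 both have local clustering 1.0 -> 1.0
```

With $\mathbb{E}[N(2)] = 4$, say, the modified value $c^*(2; G) = 2/4 = 0.5$ differs from $c(2; G) = 1$: only the triangle counts are random in $c^*$, not the normalisation.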

Adjusted clustering and the Poissonized KPKVB model
Recall that the first step in the fixed-$k$ case was to show that the transition from the KPKVB graph $G_n = G(n; \alpha, \nu)$ to the Poissonized version $G_{\mathrm{Po}}$ did not influence clustering. Here we first make a transition from the local clustering function $c(k_n; G_n)$ to the adjusted version $c^*(k_n; G_n)$. The following lemma justifies working with this modified version; its proof uses a concentration result for $N_n(k_n)$, and full details can be found in Section 9.3.

Lemma 6.1. As $n \to \infty$,

We then establish that the modified local clustering function for KPKVB graphs $G_n$ behaves similarly to that in $G_{\mathrm{Po}}$. The proof, found in Section 9.3, is based on a standard coupling between a binomial point process and a Poisson point process.

Proposition 6.2. As $n \to \infty$, $\mathbb{E}\big[|c^*(k_n; G_n) - c^*(k_n; G_{\mathrm{Po}})|\big] = o(s(k_n))$.

Figure 6: Overview of the proof strategy for Theorem 1.3, as a chain of comparisons: the local clustering $c(k_n; G_n)$ in the KPKVB graph $G_n = G(n; \alpha, \nu)$ is compared to the adjusted local clustering $c^*(k_n; G_n)$ (Lemma 6.1, Section 9.3); this is compared to the adjusted clustering function $c^*(k_n; G_{\mathrm{Po}})$ in the Poissonized KPKVB graph $G_{\mathrm{Po}} = G_{\mathrm{Po}}(n; \alpha, \nu)$ (Proposition 6.2, Section 9.3), then to $c^*(k_n; G_{\mathrm{box}})$ in the finite box graph $G_{\mathrm{box}} = G_{\mathrm{box}}(n; \alpha, \nu)$ (Proposition 6.3, Section 9.2), then to the expectation $\mathbb{E}[c^*(k_n; G_{\mathrm{box}})]$ (Proposition 6.4, Section 8), and finally to the clustering limit $\gamma(k_n)$ (Proposition 6.5, Section 7). The left column of the figure contains the models in which the true hyperbolic balls are used, while the right column contains the models that use an approximation of these. The most important part is the transition between these two settings, which is accomplished by Proposition 6.3.

Coupling of local clustering between G Po and G box
The next step is to show that the modified clustering is preserved under the coupling described in Section 2.4. The proof can be found in Section 9.2. This step is one of the key technical challenges we face in proving Theorem 1.3.
To understand why, recall that the degree $k$ of a node is related to its height $y$, roughly speaking, by $k \approx \xi e^{y/2}$. Therefore, when $k$ is fixed, the heights of nodes with that degree are also essentially fixed; in particular $y < R/4$ for large enough $n$. In addition, the main contribution of triangles also comes from nodes with heights $y < R/4$. This allowed us to use Lemma 2.3 and conclude that the triangles present in the graph $G_{\mathrm{Po}}$ were exactly those present in $G_{\mathrm{box}}$, and therefore that the local clustering function was the same in both models. When $k_n \to \infty$ this is no longer true in general. For instance, suppose $k_n = n^{\frac{1 - \varepsilon}{2\alpha + 1}}$, for some small $0 < \varepsilon < 1$. Then the relation $k_n \approx \xi e^{y_n/2}$ implies that $y_n \approx \frac{2(1 - \varepsilon)}{2\alpha + 1}\log(n) - 2\log(\xi)$. Since $R/4 = \frac{1}{2}\log(n) - \frac{1}{2}\log(\nu)$, we get that $y_n > R/4$ for large enough $n$ whenever $\alpha < (3 - 4\varepsilon)/2$, violating the conditions of Lemma 2.3. However, by carefully analyzing the difference between the adjusted local clustering function in both models we can still draw the same conclusion. This is summarized in the following proposition, whose proof is found in Section 9.2.
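The height comparison in this paragraph can be checked numerically. The sketch below is illustrative; the constants $\xi$ and $\nu$ are placeholders set to $1$, and the heuristic $k \approx \xi e^{y/2}$ is taken at face value.

```python
import math

def heights(n, alpha, eps, xi=1.0, nu=1.0):
    """Compare the typical height y_n of a degree-k_n vertex,
    k_n = n^{(1-eps)/(2*alpha+1)} with y_n ~ 2*log(k_n/xi),
    against R/4 where R = 2*log(n/nu)."""
    k_n = n ** ((1 - eps) / (2 * alpha + 1))
    y_n = 2 * math.log(k_n / xi)
    R = 2 * math.log(n / nu)
    return y_n, R / 4

# For alpha below (3 - 4*eps)/2 the typical height exceeds R/4 for large n:
y, q = heights(10**12, 0.6, 0.1)
print(y > q)  # True
```

For $\alpha = 0.6$ and $\varepsilon = 0.1$ the coefficient of $\log n$ in $y_n$ is $2 \cdot 0.9/2.2 \approx 0.82 > 1/2$, so the condition of Lemma 2.3 fails; for large $\alpha$ (e.g. $\alpha = 2$) the comparison reverses.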

Proposition 6.3 (Coupling result for adjusted clustering function). As n → ∞,
Together, the three results described so far imply that the difference between the clustering function for the KPKVB graph and the adjusted clustering function for the finite box graph $G_{\mathrm{box}}$ converges to zero faster than the proposed scaling $\gamma(k_n)$ in Theorem 1.3. Hence, it is enough to prove the result for $c^*(k_n; G_{\mathrm{box}})$.

From the finite box to the infinite model
To compute the limit of the adjusted clustering function $c^*(k_n; G_{\mathrm{box}})$, we first prove in Section 8 that it is concentrated around its mean $\mathbb{E}[c^*(k_n; G_{\mathrm{box}})]$.

Proposition 6.4 (Concentration for adjusted clustering function in $G_{\mathrm{box}}$). As $n \to \infty$,

This result represents another technical challenge we face when considering $k_n \to \infty$. For the proof, we first identify the specific range of heights that gives the main contribution to the triangle count, showing that the contribution of triangles coming from nodes with heights outside this range is of smaller order. Then we prove a concentration result for the main term, using that the neighbourhoods of two nodes whose $x$-coordinates are sufficiently separated can be considered to be disjoint (see Section 5.3). The full details are found in Section 8.
Assuming this concentration result, we are left to compute the expectation $\mathbb{E}[c^*(k_n; G_{\mathrm{box}})]$ and show that it is asymptotically equivalent to $\gamma(k_n)$ as $n \to \infty$. To accomplish this we move to the infinite limit model $G_\infty$ and show that the difference between the expected value of $c^*(k_n; G_{\mathrm{box}})$ and $\gamma(k_n)$ goes to zero faster than the proposed scaling in Theorem 1.2. Recall that for the finite box model the left and right boundaries of $\mathcal{R}_n$ were identified, so that the graph $G_{\mathrm{box}}$ contains some additional edges with respect to the induced subgraph of $G_\infty$ on $\mathcal{R}_n$. The proof of Proposition 6.5 therefore relies on analyzing the number of triangles coming from these additional edges and showing that their contribution to the clustering function is of negligible order; see Section 7.
Remark 6.2 (Notation for different graphs). We will use the subscripts $n$, $\mathrm{Po}$, $\mathrm{box}$ and $\infty$ to identify properties of, respectively, the KPKVB model $G_n$, the Poissonized version $G_{\mathrm{Po}}$, the finite box model $G_{\mathrm{box}}$ and the infinite model $G_\infty$. For example, $N_{\mathrm{Po}}(k)$ denotes the number of nodes with degree $k$ in $G_{\mathrm{Po}}$, and $\rho_{\mathrm{box}}(y, k) = \mathbb{P}(\mathrm{Po}(\mu(B_{\mathrm{box}}(y))) = k)$, i.e. the degree distribution in $G_{\mathrm{box}}$ for a point $p = (x, y)$.

Proof of the main results
We are now ready to prove Theorem 1.3, using the results stated in the previous sections.
Proof of Theorem 1.3. Note that the second statement of the theorem follows immediately from the first.
To prove the first statement, we rewrite $c(k_n; G_n) - \gamma(k_n)$ as
$$c(k_n; G_n) - \gamma(k_n) = \big(c(k_n; G_n) - c^*(k_n; G_n)\big) + \big(c^*(k_n; G_n) - c^*(k_n; G_{\mathrm{Po}})\big) + \big(c^*(k_n; G_{\mathrm{Po}}) - c^*(k_n; G_{\mathrm{box}})\big) + \big(c^*(k_n; G_{\mathrm{box}}) - \mathbb{E}[c^*(k_n; G_{\mathrm{box}})]\big) + \big(\mathbb{E}[c^*(k_n; G_{\mathrm{box}})] - \gamma(k_n)\big).$$
Then we take absolute values, apply the triangle inequality and, by monotonicity of expectation, obtain a corresponding bound in expectation. At this point, the lemmas and propositions presented above in this section can be applied to show that all summands are $o(\gamma(k_n))$: Lemma 6.1 for the transition to the modified clustering function in the first term, Proposition 6.2 for the Poissonization in the second term, Proposition 6.3 for the coupling between the Poissonized KPKVB and the finite box model in the third term, Proposition 6.4 for the concentration in the fourth term and, finally, Proposition 6.5 for the transition to the infinite limit model. All of this together yields the first statement of the theorem and finishes the proof.
7 From $G_{\mathrm{box}}$ to $G_\infty$ (Proving Proposition 6.5)

In this section we relate the clustering in the finite box model $G_{\mathrm{box}}$ to that of the infinite model. The main goal is to prove Proposition 6.5. Recall that $G_{\mathrm{box}}$ is obtained by restricting the Poisson point process $\mathcal{P}$ to the box $\mathcal{R} = (-I_n, I_n] \times (0, R]$, with $I_n = \frac{\pi}{2}e^{R/2}$, and connecting two points $p_1, p_2 \in \mathcal{R}$ if and only if $|x_1 - x_2|_{\pi e^{R/2}} \le e^{(y_1 + y_2)/2}$. We also recall that, by definition of the norm $|\cdot|_{\pi e^{R/2}}$, the left and right boundaries of $\mathcal{R}$ are identified. See Section 2.2 for more details. Due to this identification of the boundaries, some triples of nodes that form a triangle in the finite box model do not form a triangle in the infinite model. Therefore, to establish the required result, we need to compute the asymptotic difference between the triangle counts in both models. To keep notation concise we write $|\cdot|_n$ for the norm $|\cdot|_{\pi e^{R/2}}$. For any $p \in \mathbb{R} \times \mathbb{R}_+$ we define the triangle count $T_{\mathrm{box}}(p)$ for the finite box model, where the sum is over all distinct pairs in $\mathcal{P} \setminus \{p\}$, and similarly, for the infinite model, we define $T_\infty(p)$. Recall that, slightly abusing notation, we write $B_\infty(y)$ for $B_\infty((0, y))$ and that $N_{\mathrm{box}}(k)$ denotes the number of vertices with degree $k$ in $G_{\mathrm{box}}$.
We will first relate $\gamma(k_n)$ to an integral expression involving $T_\infty(y)$, and $\mathbb{E}[c^*(k_n; G_{\mathrm{box}})]$ to one involving $\mathbb{E}[T_{\mathrm{box}}(y)]$. Recall the definition of $y^{\pm}_{k,C}$ from (15) and the interval $K_C(k_n) = [y^-_{k_n,C}, y^+_{k_n,C}]$. Note that for any $y \in K_C(k_n)$ it holds that $k_n - C\sqrt{k_n \log(k_n)} \le \mu(y) \le k_n + C\sqrt{k_n \log(k_n)}$.
Lemma 7.1. Let $\gamma(k_n)$ be defined as in (22). Then, as $n \to \infty$, (71) holds; moreover, (72) holds, where $u_1$ and $u_2$ are independent and distributed according to the measure $\mu$ restricted to $B_\infty(y)$ and normalized by $\mu(B_\infty(y))$.

Proof. It follows from the Campbell–Mecke formula that, uniformly in $y \in \mathbb{R}_+$, and using that $\mu(y) = (1 + o(1))k_n$ uniformly for $y \in K_C(k_n)$ by the concentration of heights (Proposition 2.4), the first claim (71) follows. For (72) we recall the expression for $c_{\mathrm{box}}(p)$.

By the Campbell-Mecke formula
where the last line follows from the concentration of heights, for which we used the corresponding upper bound. To analyze the conditional expectation we observe that, similarly to the analysis of $\gamma(k_n)$, conditioned on there being $k_n$ points in $B_{\mathrm{box}}(y)$, each point $u_i = (x_i, y_i)$ is independently distributed according to the measure $\mu$ restricted to $B_{\mathrm{box}}(y)$, normalized by $\mu(B_{\mathrm{box}}(y))$; thus we may apply a concentration-of-heights argument to $\mu(B_{\mathrm{box}}(y))^{-2}$. To finish the argument, we first note that $\mu\big(B_{\mathrm{box}}(2\log(k_n/\xi))\big)^{-2} = (1 + o(1))\,k_n^{-2}$, while the remaining factor is controlled by Lemma 5.2. We therefore conclude the claim.

As a result of this lemma, we only need to compare the difference in triangle counts between both models for heights in the interval $K_C(k_n)$. This will significantly simplify the analysis.

Comparing triangles between G ∞ and G box
To analyze the difference $|T_{\mathrm{box}}(y) - T_\infty(y)|$, we first reiterate that the difference between the indicator $\mathbb{1}_{\{p_1 \in B_{\mathrm{box}}(p)\}}$ in the finite box model and $\mathbb{1}_{\{p_1 \in B_\infty(p)\}}$ is that in $G_{\mathrm{box}}$ we identified the boundaries of the interval $[-\frac{\pi}{2}e^{R/2}, \frac{\pi}{2}e^{R/2}]$ and we stop at height $y = R$. This induces a difference in triangle counts between both models. To see this, note that for any $p = (x, y)$ with $0 \le y \le R$ we have that $B_{\mathrm{box}}(p) = B_\infty(p) \cap \mathcal{R}$. This means that if $p', p_2 \in B_{\mathrm{box}}(p)$ and $p_2 \in B_\infty(p') \cap \mathcal{R}$, then $p_2 \in B_{\mathrm{box}}(p) \cap B_{\mathrm{box}}(p')$, and hence $(p, p', p_2)$ forms a triangle both in $G_{\mathrm{box}}$ and in $G_\infty$. However, it could happen that there are points in the intersection $B_{\mathrm{box}}(p) \cap B_{\mathrm{box}}(p')$ that are not in $B_\infty(p) \cap B_\infty(p')$. Let us denote this region by $T(p, p')$; see Figure 7 for an example. Then any $p_2 \in T(p, p')$ creates a triangle with $p$ and $p'$ in $G_{\mathrm{box}}$ that is not present in $G_\infty$. Finally, any point $p_2 \in B_\infty(p) \cap B_\infty(p')$ with height $y_2 > R$ creates a triangle with $p, p'$ in $G_\infty$ but not in $G_{\mathrm{box}}$.
Let us now define the following restricted triangle count function, which only counts those triangles attached to $p_0$ that exist in both $G_{\mathrm{box}}$ and $G_\infty$; by definition of the region $T(p_0, p_1)$, we then obtain the corresponding decomposition. The next result, which is crucial for the proof of Proposition 6.5, computes the expected measure of $T(p, p')$ with respect to $p'$.

Lemma 7.2. Let $p_0 = (0, y)$ with $y \in K_C(k_n)$. Then, as $n \to \infty$,

The proof of the lemma is not difficult but cumbersome, since it involves computing many different integrals. We postpone it to the end of this section and proceed with the main goal, proving Proposition 6.5. First we state a small lemma (Lemma 7.3) about the scaling of $s(k_n)$ that will be very useful.

Proof. First let $\frac{1}{2} < \alpha < \frac{3}{4}$. Then the claim follows directly. Similarly, for $\alpha \ge \frac{3}{4}$ we have that $4\alpha^2 > 2$, and hence the claim follows.

The first key implication of Lemma 7.2 is that the triangle count in the finite box model is equivalent to $k_n^2 P(y)$, where $P(y)$ is defined by (21).
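The three regimes of the scaling function $s(k)$, matching the asymptotics of $\gamma(k)$ stated in the abstract, and the comparisons used repeatedly below ($s(k_n)^2 = o(s(k_n))$, and $k^{-1}$ versus $s(k)$) can be illustrated numerically. The piecewise definition in this sketch is an assumption reconstructed from the abstract, not quoted from the body of the paper.

```python
import math

def s(k, alpha):
    """Scaling function with gamma(k) = Theta(s(k)):
    k^{2-4a} for 1/2 < a < 3/4, log(k)/k at a = 3/4, and 1/k for a > 3/4."""
    if alpha < 0.75:
        return k ** (2 - 4 * alpha)
    if alpha == 0.75:
        return math.log(k) / k
    return 1.0 / k

# s(k) -> 0, hence s(k)^2 = o(s(k)); and 1/k = o(s(k)) when alpha < 3/4:
for a in (0.6, 0.75, 0.9):
    print(a, s(10**6, a))
```

For $\alpha = 0.6$ and $k = 10^6$, for instance, $s(k) = k^{-0.4} \approx 4 \cdot 10^{-3}$, so both $s(k)^2$ and $k^{-1}$ are of strictly smaller order, which is exactly how these comparisons are used in the error bounds below.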
Define $\mathcal{R}^c := (\mathbb{R} \times \mathbb{R}_+) \setminus \mathcal{R}$, the complement of the box. Then we can split the count into two contributions and apply the Campbell–Mecke formula. The first part is taken care of by Lemma 7.2; for the other integral we obtain a bound, and thus we conclude, again using Lemma 7.2, that the claimed estimate holds on $K_C(k_n)$, where the last part follows from Lemma 7.3 and the fact that $s(k_n)^2 = o(s(k_n))$.
We can now prove the main result of this section.
Proof of Proposition 6.5. First, by Lemma 7.4 and (72), we have the corresponding decomposition. By Lemma 5.3 the integral in the second term is $(1 + o(1))\pi(k_n)$, and thus the second term is $o(s(k_n)) = o(\gamma(k_n))$. Hence it remains to prove that the first term is $(1 + o(1))\gamma(k_n)$. Using (71), it is enough to show the corresponding integral asymptotics. Next we recall Proposition 3.7; in particular, it implies that $P(z - 2\log(1 + o(1))) = (1 + o(1))P(z)$, uniformly on $K_C(k_n)$. We therefore conclude that $\int_{K_C(k_n)} P(y)\, \rho_{\mathrm{box}}(y, k_n)\, \alpha e^{-\alpha y}\, dy$ is $(1 + o(1))$ times the required quantity, which finishes the proof.
From the proof of Proposition 6.5 we immediately obtain the following useful corollary, which will be used in Section 8. Recall from (62) that $S_{k_n,C} = \mathcal{R} \cap (\mathbb{R}_+ \times [y^-_{k_n,C}, y^+_{k_n,C}])$.

Corollary 7.5. Let $p_0 = (0, y)$. Then, as $n \to \infty$,

In particular,

Counting missing triangles
We now come back to computing the expected number of triangles attached to a node at height $y$ in $G_{\mathrm{box}}$ that are not present in $G_\infty$.

Recall that $T(p, p')$ denotes the region of points which form triangles with $p$ and $p'$ in $G_{\mathrm{box}}$ but not in $G_\infty$. Figure 7 shows an example of a configuration where $T(p, p') \neq \emptyset$. We observe that $T(p, p') \neq \emptyset$ because the right boundary of the ball $B_{\mathrm{box}}(p')$ exits the right boundary of the box $\mathcal{R}$ and then, since we identified the boundaries, continues from the left, so that $B_{\mathrm{box}}(p')$ covers part of the ball $B_{\mathrm{box}}(p)$ which would not be covered in the infinite limit model.

Figure 8: Example, for a given $p$, of the boundary function $x' \mapsto b^*_p(x')$, given by the red curve, which determines whether $T(p, p') = \emptyset$. We see that when $y' = b^*_p(x')$, then $(\hat{x}(p, p'), \hat{y}(p, p')) = (x^*(p'), y^*(p'))$.
The point $(\hat{x}(p, p'), \hat{y}(p, p'))$ is the same as $(\hat{x}_{\mathrm{left}}, \hat{y}_{\mathrm{left}})$ from Section 5.3. Using the same approach as there, we can compute the other two coordinates, $x^*(p')$ and $y^*(p')$; in total we have four expressions. The crucial observation is that $T(p, p') \neq \emptyset$ as long as the point $(x^*(p'), y^*(p'))$ lies above the left boundary of $B_{\mathrm{box}}(p)$. This happens exactly when $y^*(p') > b^-_p(x^*(p'))$, where $b^-_p(z)$ is defined in (53). Therefore the boundary of this event is given by the equation $y^*(p') = b^-_p(x^*(p'))$, which reads
$$2\log\Big(\frac{\pi}{2}e^{R/2}\Big) - y = 2\log\Big(\frac{\pi}{2}e^{R/2} - x'\Big) - y'.$$

Solving this equation gives us the function
$$b^*_p(x') = y + 2\log\big(1 - x'/I_n\big),$$
which is displayed by the red curve in Figure 8. It holds that $y^*(p') > b^-_p(x^*(p'))$ if and only if $y' < b^*_p(x')$, and hence we have that $T(p, p') = \emptyset$ for all $p' \in \mathcal{R}$ for which $y' \ge b^*_p(x')$. We also note that when $y' = b^*_p(x')$, the two points $(x^*(p'), y^*(p'))$ and $(\hat{x}(p, p'), \hat{y}(p, p'))$ coincide. This analysis allows us to compute the expected difference in the number of triangles between the finite box model and the infinite model for a typical node with height $y$, i.e. to prove Lemma 7.2.
Proof of Lemma 7.2. Due to symmetry it is enough to prove the statement for $x_1 \ge 0$. The proof goes in two stages. First we compute $\mu(T(p, p_1))$ by splitting it over three disjoint regimes with respect to $p_1$, with $x_1 \ge 0$. Then we do the integration with respect to $p_1$.

Computing $\mu(T(p, p_1))$. Recall that $I_n = \frac{\pi}{2}e^{R/2}$ and define the sets $B^{(i)}_n$, for $i = 1, 2, 3$; see Figure 9. Here the heights of the two intersections are given by
$$h_*(y) = y + 2\log\Big(\frac{I_n}{I_n + e^y}\Big) \quad (76) \qquad \text{and} \qquad h^*(y) = y + 2\log\Big(\frac{I_n}{I_n - e^y}\Big).$$
With these definitions, the union $B_n := \bigcup_{i=1}^{3} B^{(i)}_n$ is the area under the red curve in Figure 8, and hence for all $p_1 \in \mathcal{R} \setminus B_n$ with $x_1 \ge 0$ we have that $T(p, p_1) = \emptyset$. So we only need to consider $p_1 \in B_n$. We shall establish the result (78). Depending on which set $p_1$ belongs to, the set $T(p, p_1)$ has a different shape. We display these shapes in Figure 10 as a visual aid to follow the computations below.
where we used that $x_1 \le e^{(y + y_1)/2} = o(I_n)$ for all $y_1 \le y$ and $y \in K_C(k_n)$, so that the stated expansion holds. For $I^{(2)}_n(p_1)$ we have the corresponding expression, and we conclude that for $p_1 \in B^{(1)}_n$ the first part of (78) holds.

Case $p_1 \in B^{(2)}_n$: Here we split the integration into two parts (see Figure 10). Recall that $x^*(p, p_1) = x_1 - I_n$. Then, for the first part, we have the stated bound, where we used that $y \le y_1 + 2\log(I_n/(I_n - x_1))$ for $p_1 \in B^{(2)}_n$ for the third line, and the analogous estimate for the last line.
Case $p_1 \in B^{(3)}_n$: Here $y + 2\log(1 + x_1/I_n) < y_1 \le y + 2\log(I_n/(I_n - x_1))$. For the second integral we use that $y \le y_1$ for $p_1 \in B^{(3)}_n$, and for the remaining integral we use the upper bound on $y_1$ together with the fact that $2I_n - x_1 = \Theta(I_n)$ for all $x_1 \in [-I_n, I_n]$. We conclude the corresponding part of (78).

Integration of $\mu(T(p, p_1))$ with respect to $p_1$. We now proceed with the second part of the computation leading to (75). Here we will integrate $\mu(T(p, p_1))$ over the region $B_n = B^{(1)}_n \cup B^{(2)}_n \cup B^{(3)}_n$; see Figure 9. Let us first identify the boundaries of these areas.
The area $B^{(1)}_n$ is bounded from above by the line given by the boundary equation. Solving this for $x_1$ yields $x_1 = I_n\big(1 - e^{(y_1 - y)/2}\big)$, which determines $B^{(1)}_n$. In a similar way, $B^{(2)}_n$ is bounded from above by a line which yields $x_1 = I_n\big(e^{(y_1 - y)/2} - 1\big)$. The lower red boundary is the upper boundary of $B^{(2)}_n$, and hence we obtain the characterization; we continue in the same way for $B^{(3)}_n$. With these characterizations of the areas, we now integrate $\mu(T(p, p_1))$ over $B_n$, splitting the computation over the three different areas.
Integration over $B^{(1)}_n$: For the first integral we use that $e^{(y + y_1)/2} - I_n\big(1 - e^{(y_1 - y)/2}\big) \le e^{y_1/2}\big(e^{y/2} + I_n e^{-y/2}\big)$ to obtain the first bound.

Integration over $B^{(2)}_n$: For this case we show the stated bound. Here the integral is split into three parts. Let us first focus on the first integral. Since $I_n\big(e^{(y_1 - y)/2} - 1\big) - I_n\big(1 - e^{(y - y_1)/2}\big) \le I_n e^{(y_1 - y)/2}$, we get, using similar arguments as above, the first estimate. Proceeding to the second integral, we first note that $e^{(y + y_1)/2} - I_n\big(1 - e^{(y - y_1)/2}\big) = O\big(I_n e^{(y_1 - y)/2}\big)$, so that similar calculations as before yield the second estimate.

8 Concentration for $c(k; G_{\mathrm{box}})$ (Proving Proposition 6.4)

In this section we establish a concentration result for the local clustering function $c^*(k; G_{\mathrm{box}})$ in the finite box model $G_{\mathrm{box}}$. Similarly to the previous section, we focus on typical points $p = (0, y)$ with $y \in K_C(k_n)$.

The main contribution of triangles
Recall that $N_{\mathrm{box}}(k_n)$ denotes the number of vertices in $G_{\mathrm{box}}$ with degree $k_n$. We first write the clustering function in terms of the triangle count; in particular, the variance of $c^*(k_n; G_{\mathrm{box}})$ is determined by the variance of $T_{\mathrm{box}}(k_n)$. Next, recall the adjusted triangle count function as well as the definition of $K_C(k_n)$, and write $\mathcal{R}(k_n, C) = [-I_n, I_n] \times K_C(k_n)$ for the part of the box $\mathcal{R}$ with heights in $K_C(k_n)$. Slightly abusing notation, we define the corresponding triangle degree function $T_{\mathrm{box}}(k_n, C)$, where the sum runs over $p \in \mathcal{P} \cap \mathcal{R}(k_n, C)$, and, with that, a different clustering function.
The idea is that the main contribution to the triangle count $T_{\mathrm{box}}(k_n)$ from degree-$k_n$ nodes is given by $T_{\mathrm{box}}(k_n, C)$. Therefore, in order to prove Proposition 6.4, it suffices to show that $T_{\mathrm{box}}(k_n, C)$ is sufficiently concentrated around its mean. This last part is done in the following proposition.
Proposition 8.1 (Concentration of $T_{\mathrm{box}}(k_n, C)$). Let $\alpha > \frac{1}{2}$, $\nu > 0$ and let $(k_n)_{n \ge 1}$ be any positive sequence satisfying $k_n = o\big(n^{\frac{1}{2\alpha + 1}}\big)$. Then for any $C > 0$, as $n \to \infty$,

We first use this result to prove Proposition 6.4. The remainder of this section is devoted to the proof of Proposition 8.1; the final proof can be found in Section 8.3.
Proof of Proposition 6.4. We bound the expectation as follows; we will show that both terms are $o(s(k_n))$.
and therefore the first term can be bounded accordingly. For the expectation of $T_{\mathrm{box}}(k_n, C)$ we use the estimate above to get the bound, where the last line is due to Corollary 7.5. In particular, since the last integral is of the required order and $\mathbb{E}[N_{\mathrm{box}}(k_n)] = (1 + o(1))\, n\pi(k_n)$, the stated bound follows. On the other hand, Proposition 6.5 implies that $\mathbb{E}[c^*(k_n; G_{\mathrm{box}})] = (1 + o(1))\gamma(k_n)$, and thus we conclude that $2\,\mathbb{E}\big[|c^*(k_n; G_{\mathrm{box}}) - c_{\mathrm{box}}(k_n)|\big] = o(\gamma(k_n)) = o(s(k_n))$.
For the remaining term we use Hölder's inequality and Proposition 8.1 to obtain the bound, which finishes the proof.
We note that the above proof establishes the following important result

Joint degrees in G box
To prove Proposition 8.1 we will use results from Section 5.3 regarding the joint degree distribution in $G_{\mathrm{box}}$. For any two points $p, p' \in \mathcal{R}$ we denote the joint degree distribution by
$$\rho_{\mathrm{box}}(p, p', k, k') := \mathbb{P}\big(\mathrm{Po}(\mu(B_{\mathrm{box}}(p))) = k,\ \mathrm{Po}(\mu(B_{\mathrm{box}}(p'))) = k'\big).$$

Recall the definition of $E_\varepsilon(k_n)$ from Section 5.3. Furthermore, we recall that by Lemma 5.9 the joint degree distribution of two points $p, p' \in E_\varepsilon(k_n)$ factorizes, i.e. on the set $E_\varepsilon(k_n)$ the joint degree distribution in $G_{\mathrm{box}}$ is asymptotically equivalent to the product of the degree distributions. We shall now prove a slightly stronger result (Lemma 8.4), which also takes care of bounded shifts in the joint degree distribution $\rho_{\mathrm{box}}(p, p', k_n - t, k_n - t')$, for uniformly bounded $t, t' \in \mathbb{Z}$. For this we first need the following simple result for Poisson distributions.
Lemma 8.3. Let $k_n \to \infty$ be a sequence of non-negative integers and let $X = \mathrm{Po}(\lambda_n)$ be a Poisson random variable with mean $\lambda_n$ satisfying $k_n - C\sqrt{k_n \log(k_n)} \le \lambda_n \le k_n + C\sqrt{k_n \log(k_n)}$ for some $C > 0$. Then, for any $t_n, s_n = O(1)$, as $n \to \infty$, $\mathbb{P}(X = k_n - t_n) = (1 + o(1))\,\mathbb{P}(X = k_n - s_n)$.

Proof. Note that $k_n > t_n, s_n$ for large enough $n$. Hence, using Stirling's formula, as $n \to \infty$, we can express the ratio in terms of $\ell_n := (k_n - s_n)/(k_n - t_n)$. Note that $\ell_n \to 1$ and hence $\sqrt{\ell_n} \to 1$. Moreover, since $(k_n - s_n)/\lambda_n \to 1$ and $|s_n - t_n| = O(1)$, we have that $\big(\frac{k_n - s_n}{\lambda_n}\big)^{t_n - s_n} \sim 1$. Therefore it remains to show that $\lim_{n \to \infty} e^{(k_n - t_n)\log(\ell_n) + s_n - t_n} = 1$.
For this we note that for any $x$ with $|x| \le 1/2$ we have $x - x^2 \le \log(1 + x) \le x$. Write $x_n = \ell_n - 1 = \frac{t_n - s_n}{k_n - t_n}$. Then, by the assumptions of the lemma, $x_n \to 0$, and thus, for $n$ large enough,
$$e^{-\frac{(t_n - s_n)^2}{k_n - t_n}} \le e^{(k_n - t_n)\log(\ell_n) + s_n - t_n} \le 1,$$
and the result follows since $\frac{(t_n - s_n)^2}{k_n - t_n} \to 0$. We can now prove the main result of this section.
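As a quick numerical illustration of Lemma 8.3 (a sketch, not part of the proof): the ratio of the two Poisson probabilities approaches $1$ as $k$ grows, for any fixed bounded shifts.

```python
import math

def poisson_log_pmf(lam, k):
    """log of the Po(lam) pmf at k, via lgamma for numerical stability."""
    return -lam + k * math.log(lam) - math.lgamma(k + 1)

def shifted_ratio(k, C=1.0, t=2, s=-1):
    """P(X = k - t) / P(X = k - s) for X ~ Po(lambda) with
    lambda = k + C*sqrt(k*log(k)), i.e. at the edge of the allowed window."""
    lam = k + C * math.sqrt(k * math.log(k))
    return math.exp(poisson_log_pmf(lam, k - t) - poisson_log_pmf(lam, k - s))

for k in (10**3, 10**5, 10**7):
    print(k, shifted_ratio(k))
```

The convergence is slow, at rate governed by $\sqrt{\log(k)/k}$, matching the width of the window for $\lambda_n$ in the lemma.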
Lemma 8.4. Let $0 < \varepsilon < 1$, $k_n \to \infty$, and let $t_n, t'_n, s_n, s'_n \in \mathbb{Z}$ be uniformly bounded. Then for any $(p, p') \in E_\varepsilon(k_n)$, as $n \to \infty$,

Proof. Define the corresponding random variables. Since by Lemma 5.7 $\mu(B_{\mathrm{box}}(p) \cap B_{\mathrm{box}}(p')) = O\big(k_n^{1 - \varepsilon}\big)$, the factorization follows from Lemma 5.9. The result then follows by applying Lemma 8.3 twice.

Concentration result for main triangle contribution
We now turn to Proposition 8.1. Before we dive into the proof, let us first give a high-level overview of the strategy and the flow of the arguments.

Recall (see (81)) that for any $C > 0$, $T_{\mathrm{box}}(k_n, C)$ is a sum over $p \in \mathcal{P} \cap \mathcal{R}(k_n, C)$. Its second moment can be written as the sum of several terms, depending on how $\{p, p_1, p_2\}$ and $\{p', p'_1, p'_2\}$ intersect. To this end we define, for $a \in \{0, 1\}$ and $b \in \{0, 1, 2\}$, the terms $I_{a,b}$, involving products $T_{\mathcal{P},n}(p, p_1, p_2)\, T_{\mathcal{P},n}(p', p'_1, p'_2)$, with the sum taken over all distinct pairs $(p_1, p_2)$ and $(p'_1, p'_2)$. To prove Proposition 8.1 we will deal with each of the $I_{a,b}$ separately, showing that (85) holds, and that (86) holds for all other combinations. Note that $I_{1,2} = T_{\mathrm{box}}(k_n, C)$, and since (83) implies that $\mathbb{E}[T_{\mathrm{box}}(k_n, C)] \to \infty$, it follows that (86) holds for $I_{1,2}$.
Recall that $\mathcal{R}(k_n, C) = [-I_n, I_n] \times K_C(k_n)$ and (63). Let $E_\varepsilon(k_n)^c$ be the same set but with $|x - x'|_n \le k_n^{1 + \varepsilon}$, and denote by $I^*_{a,b}$ the part of $I_{a,b}$ where $(p, p') \in E_\varepsilon(k_n)$. We will split the analysis between $I^*_{a,b}$ and $I_{a,b} - I^*_{a,b}$. The idea for these two cases is that, by Lemma 8.4, on the set $E_\varepsilon(k_n)$ and for any uniformly bounded $t, t' \in \mathbb{Z}$, the joint degree distribution factorizes:
$$\rho_{\mathrm{box}}(p, p', k_n + t, k_n + t') = (1 + o(1))\, \rho_{\mathrm{box}}(p, k_n)\, \rho_{\mathrm{box}}(p', k_n).$$
In particular, this allows us to compute $\mathbb{E}[I^*_{a,b}]$. On the other hand, the expected number of pairs of points in $E_\varepsilon(k_n)^c$ is of smaller order than the expected number of pairs in $\mathcal{R}(k_n, C) \times \mathcal{R}(k_n, C)$. Hence we expect the contributions coming from $E_\varepsilon(k_n)^c$ to be negligible.
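The factorization heuristic behind Lemma 8.4 can be illustrated with an exact computation. In this sketch the two degrees are modelled as $X = A + S$ and $Y = B + S$, with independent $A, B \sim \mathrm{Po}(\mu - \delta)$ and a shared component $S \sim \mathrm{Po}(\delta)$ playing the role of the overlap mass $\mu(B_{\mathrm{box}}(p) \cap B_{\mathrm{box}}(p'))$; the numbers are illustrative, not from the paper.

```python
import math

def po_pmf(lam, k):
    """Po(lam) pmf at k, computed in log-space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1)) if k >= 0 else 0.0

def joint_vs_product(mu, delta, k):
    """Degrees of two balls with overlap mass delta: X = A + S, Y = B + S with
    A, B ~ Po(mu - delta) independent and shared S ~ Po(delta).
    Returns (P(X=k, Y=k), P(X=k) * P(Y=k)); for delta << mu they nearly agree,
    mirroring the factorisation of rho_box on E_eps(k_n)."""
    joint = sum(po_pmf(delta, j) * po_pmf(mu - delta, k - j) ** 2
                for j in range(0, k + 1))
    marginal = po_pmf(mu, k)   # X = A + S ~ Po(mu)
    return joint, marginal ** 2

j, p = joint_vs_product(mu=100.0, delta=1.0, k=100)
print(j / p)  # close to 1
```

Increasing the overlap $\delta$ makes the joint probability visibly exceed the product, which is exactly why the analysis must restrict to well-separated pairs $(p, p') \in E_\varepsilon(k_n)$.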
Proof of Proposition 8.1. Throughout this proof we set $i = |\{p', p_1, p_2, p'_1, p'_2\} \cap B_{\mathrm{box}}(p)|$ and $j = |\{p'\} \cap B_{\mathrm{box}}(p)|$, and define $i', j'$ in a similar way by interchanging the primed and non-primed variables. In addition, we write $D_{\mathrm{box}}(p, p', k, \ell)$ to denote the indicator that $|B_{\mathrm{box}}(p) \cap (\mathcal{P} \setminus \{p, p', p_1, p_2, p'_1, p'_2\})| = k$ and $|B_{\mathrm{box}}(p') \cap (\mathcal{P} \setminus \{p, p', p_1, p_2, p'_1, p'_2\})| = \ell$. Note that this also depends on $\{p_1, p_2, p'_1, p'_2\}$, but we suppress this to keep notation concise. Similarly, we write $\overline{D}_{\mathrm{box}}(p, p', k, \ell)$ to denote the indicator that $|B_{\mathrm{box}}(p) \cap (\mathcal{P} \setminus \{p, p'\})| = k$ and $|B_{\mathrm{box}}(p') \cap (\mathcal{P} \setminus \{p, p'\})| = \ell$, which only depends on $p$ and $p'$. Then, by the Campbell–Mecke formula, where the sum is over all distinct pairs $(p_1, p_2)$ and $(p'_1, p'_2)$, we obtain the corresponding expression. We will now proceed to establish (85) and (86).
Computing $I_{0,0}$. We first show that the contribution outside $E_\varepsilon(k_n)$ is negligible, so that for the remainder of the proof we only need to consider $(p, p') \in E_\varepsilon(k_n)$, and can hence apply Lemma 8.4. For $J_0$ we have, using Lemma 8.4, the corresponding expression. Next we recall that, for all $y \in K_C(k_n)$ (see (70)), $\mathbb{E}[T_{\mathrm{box}}(p')] = (1 + o(1))\, k_n^2 P(y')$, where $p' = (x', y')$. Therefore, using that $\rho_{\mathrm{box}}(p, p', k_n, k_n) \le \rho_{\mathrm{box}}(p, k_n)$, we obtain the bound, which proves (87). Here we used that $k_n^{2 + \varepsilon} = o(n)$ for the last line.
We will now show (85). Recall the result from Lemma 8.4, that for $(p, p') \in E_\varepsilon(k_n)$ and any two uniformly bounded $t, t' \in \mathbb{Z}$, $\rho_{\mathrm{box}}(p, p', k_n + t, k_n + t') = (1 + o(1))\,\rho_{\mathrm{box}}(p, k_n)\,\rho_{\mathrm{box}}(p', k_n)$.
Therefore, by defining $h(y)$ appropriately, the only difference with E E$[T_{\mathrm{box}}(k_n, C)]^2$ is that the above integral is over $E_\varepsilon(k_n)$ instead of $R(k_n, C) \times R(k_n, C)$. Since the difference between the two sets is $E_\varepsilon(k_n)^c$ and $nk_n^{1+\varepsilon} = o(n^2)$, it follows that $\mathbb{E}[I^*_{0,0}] = (1 + o(1))\,$E E$[T_{\mathrm{box}}(k_n, C)]^2$, which finishes the proof of (85).
Computing $\mathbb{E}[I_{0,1}]$. We first write Then, using that $\rho_{\mathrm{box}}(p, p', k_n, k_n) \le \rho_{\mathrm{box}}(p, k_n)$, and recalling that $\mathbb{E}[T_{\mathrm{box}}(k_n, C)] = \Theta\big(nk_n^{-(2\alpha-1)} s(k_n)\big)$, it follows that the contribution from $E_\varepsilon(k_n)^c$ to $\mathbb{E}[I_{0,1}]$ is negligible for $\varepsilon$ small enough. For $(p, p') \in E_\varepsilon(k_n)$ we assume without loss of generality that $p_1 = p'_1 = (x_1, y_1)$, i.e.
$T_{\mathrm{box}}(p', p_1, p'_2)$. Now let $Z_{0,1}$ denote the part of $J_{0,1}$ where $y_1 \le 4\log(k_n)$ and $y_2, y'_2 \le \varepsilon\log(k_n)$. We first analyze $\mathbb{E}[Z_{0,1} \mid D^G_{\mathrm{box}}(p), D^G_{\mathrm{box}}(p') = k_n]$. When $y_1 \le 4\log(k_n)$ and both $y_2, y'_2 \le \varepsilon\log(k_n)$ we have that Next, by integrating only over $x_2$ and $y_2$ we get where the last line follows from the analysis done for $\mathbb{E}[I_{0,0} - I^*_{0,0}]$. It now remains to consider $J_{0,1} - Z_{0,1} =: Z^*_{0,1}$. We will show that Using that the joint degree distribution factorizes on $E_\varepsilon(k_n)$, this then implies that which finishes the proof of (86) for $a = 0$, $b = 1$. We first consider the part with $y_1 > 4\log(k_n)$. Performing the integration over $x_1$, $x_2$ and $x'_2$, we get that the contribution to Here the last step follows since $1/2 < \alpha < 3/4$. Next we consider the case where $y_1 \le 4\log(k_n)$ and at least one of $y_2, y'_2$ is larger than $\varepsilon\log(k_n)$. Due to symmetry it is enough to consider the case with $y_2 > \varepsilon\log(k_n)$. Here the contribution to The last line follows since $k_n^{-1} = o(s(k_n))$ for $1/2 < \alpha < 3/4$ and $k_n^{-1} = O(s(k_n))$ for $\alpha \ge 3/4$.
Computing $\mathbb{E}[I_{0,2}]$. In this case we have We then use that $\rho_{\mathrm{box}}(p, p', k_n, k_n) \le \rho_{\mathrm{box}}(p, k_n)$ to obtain where the last line follows since $\mathbb{E}[T_{\mathrm{box}}(k_n, C)] = \Theta\big(nk_n^{-(2\alpha-1)} s(k_n)\big)$ and $k_n^{\varepsilon} n^{-1} = o(s(k_n))$. For the other term we use the fact that the degree distribution factorizes, where we also used that $k_n^{-2} = o(s(k_n))$.
Computing $\mathbb{E}[I_{1,1}]$. Using (89) we get We conclude that $k_n = o\big(nk_n^{-(2\alpha-1)} s(k_n)\big)$, and hence this contribution is negligible as well.

9 Equivalence for local clustering in G Po and G box

In this section we establish the equivalence between c * (k; G n ) and c * (k; G box ) as expressed in Proposition 6.3, using the coupling procedure explained in Section 2.4. As in the previous section we write $|\cdot|_n$ for the norm $|\cdot|_{\pi e^{R/2}}$. Recall the map $\Psi$ from (7), $\Psi(r, \theta) = \big(\theta\, \frac{e^{R/2}}{2},\, R - r\big)$, and that $B(p)$ denotes the image under $\Psi$ of the ball of hyperbolic radius $R$ around the point $\Psi^{-1}(p)$. Under the coupling between the hyperbolic random graph and the finite box model, described in Section 2.4, two points $p = (x, y)$ and $p' = (x', y')$ are connected if and only if (8) holds. We will often use the result from Lemma 2.2 to approximate the function $\Phi$, for $y + y' < R$, by $e^{(y+y')/2}$, where $K$ is a constant determined by the lemma.
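The approximation of $\Phi$ can be checked directly against the hyperbolic law of cosines: after applying $\Psi$, the exact adjacency threshold in the $x$-coordinate is $e^{(y+y')/2}$ up to a factor $1 + O(e^{y+y'-R})$. A self-contained numerical sketch (the values $R = 30$, $y = 3$, $y' = 4$ are arbitrary illustrations, not taken from the paper):

```python
import math

def x_threshold_exact(y1: float, y2: float, R: float) -> float:
    """Exact half-width in the x-coordinate (after the map Psi) within which
    a point at height y2 is adjacent to one at height y1, i.e. within
    hyperbolic distance R.  Uses the hyperbolic law of cosines:
    cosh d = cosh r1 cosh r2 - sinh r1 sinh r2 cos(dtheta)."""
    r1, r2 = R - y1, R - y2
    cos_dtheta = (math.cosh(r1) * math.cosh(r2) - math.cosh(R)) / (
        math.sinh(r1) * math.sinh(r2))
    dtheta = math.acos(max(-1.0, min(1.0, cos_dtheta)))
    return dtheta * math.exp(R / 2) / 2   # Psi scales angles by e^{R/2}/2

R, y1, y2 = 30.0, 3.0, 4.0
exact = x_threshold_exact(y1, y2, R)
approx = math.exp((y1 + y2) / 2)   # the B_infty threshold from Lemma 2.2
ratio = exact / approx
print(ratio)   # 1 + O(e^{y1+y2-R}), so extremely close to 1 here
```

Since $e^{y_1+y_2-R} = e^{-23}$ for these values, the exact and approximate thresholds agree to many digits, which is the content of the approximation used throughout this section.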

Some results on the hyperbolic geometric graph
We start with some basic results for the hyperbolic random geometric graph. Recall that $B_\infty(p) = \{p' \in \mathbb{R} \times \mathbb{R}_+ : |x - x'| \le e^{(y+y')/2}\}$ and observe that (10) from Lemma 2.2 implies the following.
Corollary 9.1. For sufficiently large n and p ∈ R, where K is the constant from Lemma 2.2.
Furthermore, Lemma 2.2 enables us to determine the measure of a ball around a given point $p = (0, y)$; this will be fairly useful in our subsequent analysis.
Let $p \in \mathcal{R}$. Then we can see that the curve $x = e^{(y+y')/2}$ with $x \ge 0$ meets the right boundary of $\mathcal{R}$, that is, the line $x = \frac{\pi}{2}e^{R/2}$, at $y' = R - y + 2\ln\frac{\pi}{2}$. Hence, any point $p' \in \mathcal{R}([R - y + 2\ln\frac{\pi}{2}, R])$ is included in $B_\infty(p)$; in other words, $\mathcal{R}([R - y + 2\ln\frac{\pi}{2}, R]) \subseteq B_\infty(p)$. This, together with the corresponding fact for any $u = (r', \theta')$, where $A \,\triangle\, B$ denotes the symmetric difference of the sets $A$ and $B$, allows us to compute the expected number of points in $B(p) \,\triangle\, B_\infty(p)$, i.e. those that are a neighbour of $p$ in only one of the two models. Now, if $p \in \mathcal{R}([r_n, r_n + 2\ln\frac{\pi}{2}))$ and also $p \in B(p_n) \,\triangle\, B_\infty(p_n)$, then $|x_n - x|_n = \frac{\pi}{2}e^{R/2} - e^{(y_n+y)/2}$.
Finally, (90) implies that no point in $\mathcal{R}([r_n + 2\ln\frac{\pi}{2}, R])$ belongs to $B(p_n) \,\triangle\, B_\infty(p_n)$. We first compute the expected number of points $p \in B(p_n) \,\triangle\, B_\infty(p_n)$ that have $R - y \le r_n$. The result depends on the value of $\alpha$, yielding the following three cases. Next we compute the number of remaining points in $B(p_n) \,\triangle\, B_\infty(p_n)$. Now note that for any $\alpha > 3/2$ we have, by our assumption on $y_n$, a strict inequality between these two quantities; for $\alpha = 3/2$, the two quantities are equal. From these observations, we deduce that

Equivalence of clustering in G Po and G box
Here we prove Proposition 6.3. We first note that Lemma 5.1 and Lemma 5.2 imply the following bounds. Moreover, recall that Proposition 6.3 states $\lim_{n\to\infty} s(k_n)^{-1}\,\mathbb{E}\big[|c^*(k_n; G_{\mathrm{Po}}) - c^*(k_n; G_{\mathrm{box}})|\big] = 0$.
Next recall the definition of $K_C(k_n)$ and (82), where $T_{\mathrm{box}}(k_n, C)$ counts for all nodes $p = (x, y)$ with $y \in K_C(k_n)$ the pairs $(p_1, p_2)$ that form a triangle with $p$, with the exception that it considers $p_2 \in B_\infty(p_1) \cap \mathcal{R}$ instead of $B_{\mathrm{box}}(p_1)$. Then using Corollary 8.2 we get that it is enough to prove the following. The following lemma will be frequently used in the proof of Proposition 6.3. Proof. Note that on $K_C(k_n)$ we have that $e^{ty} = \Theta(k_n^{2t})$. Hence, by Lemma 5.3, $\int_{K_C} e^{ty}\,\tilde\rho_n(y, k_n - r)\,e^{-\alpha y}\,dy = \Theta\big(k_n^{2t} \int_{K_C} \tilde\rho_n(y, k_n - r)\,e^{-\alpha y}\,dy\big)$. Proof of Proposition 6.3.
To keep notation concise we abbreviate E [N Po (k n )] and E [N box (k n )] by n Po (k n ) and n box (k n ), respectively. We will also suppress the subscript n in most expressions regarding the graphs G Po and G box . Finally, we will write $T_{\mathrm{Po}}(p)$ to denote the triangle count function for $p$ in G Po . Then we have The last term can be rewritten as where we used Proposition 6.5 (see Section 7). The first term in this product converges to zero by (93) while the second term scales as s(k n ). Hence and therefore we are left to analyze the other term. By the Campbell-Mecke formula we have that and similarly for the other term, it follows that Therefore, by a concentration of heights argument (cf. Proposition 2.4), it is enough to consider the integral where we also used that f (x, y) is simply a constant multiple of the function $e^{-\alpha y}$. We shall proceed by expanding the integrand and analyzing the individual terms. With a slight abuse of notation we shall write y instead of (0, y) in an expression such as B (y). In addition we write D Po (y, k n ; P) for the indicator which is equal to 1 if and only if B (y) contains k n points from P \ {(0, y)}. We define D box (y, k n ; P) analogously for the ball B box (y). It is important to note that for any p ∈ R it holds that p ∈ B box (y) ⇐⇒ p ∈ B ∞ (y). We need to split the integrand over several terms and then analyze each of these separately. Applying the Campbell-Mecke formula yields where the sum ranges over all distinct pairs of points in P \ {(0, y)}. In what follows, we will set $B^{\mathrm{Po}}_\infty(p') = B(p') \,\triangle\, (B_\infty(p') \cap \mathcal{R})$ and $B^{\mathrm{Po}\cap\mathrm{box}}(p') = B(p') \cap B_{\mathrm{box}}(p')$, and observe that $B^{\mathrm{Po}\cap\mathrm{box}}(y) = B(y) \cap B_\infty(y)$. We will now bound the sum that is inside the expectation. We will split the sum into different parts, depending on combinations of p 1 , p 2 ∈ P \ {(0, y)} for which only one of the two terms of the difference is non-zero.
Clearly, for this we need that either p 1 ∈ B Po∩box (y) and p 2 ∈ B Po ∞ (p 1 ) or p 1 ∈ B Po ∞ (y) and p 2 ∈ B Po∩box (p 1 ). We will consider the following four cases: 2. p 1 ∈ B (y) \ B ∞ (y) with y 1 < K and p 2 ∈ B Po∩box (y).
3. p 1 ∈ B Po ∞ (y) with y 1 ≥ K and p 2 ∈ B Po∩box (y), where K in the last two cases is the constant from Lemma 2.2.
Observe that when y 1 < (1−ε)R∧(R−y) and y 2 ≥ (1−ε)R∧(R−y) it follows from Corollary 9.1 that p 2 ∈ B Po∩box (p 1 ) and thus we do not have to consider this case when p 1 ∈ B Po∩box (y) and p 2 ∈ B Po ∞ (p 1 ). Similarly, when y 1 ≥ K and p 1 ∈ B Po ∞ (y) Corollary 9.1 implies that p 1 ∈ B (y) \ B ∞ (y) which explains the setting of case 2.
We can now bound the sum by the following expression: In the following paragraphs we will give upper bounds on the expected values of each one of these partial sums.
Note that D ⊆ D 1 ∪ D 2 ∪ D 3 . Hence, we can write We bound each one of the above three summands as follows: and We will bound each term using the Campbell-Mecke formula and show for i = 1, 2, 3 that the corresponding contribution is negligible, since $y_\omega = y + \omega$ and $\omega \to \infty$. We then deduce that We now integrate this with respect to y and determine its contribution to (94). We will now bound the term in (104). Using similar observations as for the previous term we get a bound on the corresponding integral. Now, Lemma 2.2 implies that for $y_2 \le R - y_1$, if $(x_2, y_2) \in B^{\mathrm{Po}}_\infty((x_1, y_1))$, then $x_2$ lies in an interval of length $Ke^{3y_1/2 + 3y_2/2 - R}$, where $K > 0$ is again the constant in Lemma 2.2. Using these observations we obtain: The integrals satisfy Since $y_\omega := y + \omega(n) \le R = O(\log(n))$ we conclude that on $K_C(k_n)$, and hence Now for $1/2 < \alpha < 3/4$ it holds that $4\alpha^2 - \alpha + 1 > 0$. Hence, since $k_n = O(n^{\frac{1}{2\alpha+1}})$, we deduce that $\lim_{n\to\infty} k_n^{6\alpha-3} \int_{K_C(k_n)} I^{(2)}_n(y)\, e^{-\alpha y}\,dy = 0$.
For $\alpha \ge 3/4$ we have that both $n^{-\alpha}\log(n)\,k_n^{-1}$ and $n^{-2}\log(n)^2\,k_n^{-1}$ converge to zero as $n \to \infty$, and hence in this case $\lim_{n\to\infty} k_n^{2\alpha} \int_{K_C(k_n)} I^{(2)}_n(y)\, e^{-\alpha y}\,dy = 0$.
We will consider this intersection more closely. We use Lemma 2.2 to define a ball around $p_1$ that contains both $B(p_1)$ and $B_\infty(p_1)$: for $K > 0$, we define, for any point $p_1 = (x_1, y_1) \in \mathbb{R}\times\mathbb{R}_+$, the ball $\check B_{\mathrm{Po}}(p_1)$. It is an implication of Lemma 2.2 that any point $p_2 = (x_2, y_2) \in B^{\mathrm{Po}}_\infty(p_1) \cap B(y)$ with $y_2 \le R - y_1$ must belong to $\check B_{\mathrm{Po}}(p_1) \cap \check B_{\mathrm{Po}}(y)$. We will use this in order to derive a lower bound on $y_2$ as a function of $x_1, y_1$. Let us suppose without loss of generality that $x_1 < 0$. The left boundary of $\check B_{\mathrm{Po}}((0, y))$ is given by the corresponding equation, which we can solve for $\hat y$: $|x_1| = (1 + K)\,e^{\hat y/2}\big(e^{y_1/2} + e^{y/2}\big)$.
But $y_1 > R/2$ and since $y \in K_C(k_n)$, it follows that for sufficiently large $n$, $y \le (1 + \varepsilon)R/(2\alpha + 1)$. So if $\varepsilon$ is small enough depending on $\alpha$, we have the stated bound. Let $c_K^2$ denote the multiplicative term $1 + K + o(1)$ which appears in the above. The above yields that, in particular, $\hat y = 0$ if and only if $|x_1| \le c_K e^{y_1/2}$. Moreover, since $p_1 \in B(y)$ and $y_1 \le R - y$, we also have that $|x_1| \le e^{(y+y_1)/2}(1 + o(1))$. This upper bound on $|x_1|$, together with (113), implies that for $n$ sufficiently large we have $\hat y \le y$. This observation will be used below, where we integrate over $y_2$, thus ensuring that the integrals are non-zero. We conclude that $p_2 \in \check B_{\mathrm{Po}}(y) \cap \check B_{\mathrm{Po}}((x_1, y_1)) \Rightarrow y_2 \ge \hat y(x_1, y_1)$, which implies If we integrate this over $x_2, y_2$ we get Note also that uniformly over all $(p_1, p_2) \in D_3$. Hence the Campbell-Mecke formula yields that $I_n(y)$ equals: Due to the symmetry of $\check B_{\mathrm{Po}}(y)$, the integration over $x_1$ is: We will split this integral into two parts according to the value of $\hat y(x_1, y_1)$, splitting at $|x_1| = c_K e^{y_1/2}$. The second integral trivially gives: We conclude that The sums (97) and (98). Again, we will only consider (97), since the analysis for the other term is similar. Recall that in this case we consider pairs $(p_1, p_2)$, with $p_1 = (x_1, y_1)$ satisfying $y_1 \ge (R - y) \wedge (1 - \varepsilon)R$, and $p_1 \in B(y)$, $p_2 \in B^{\mathrm{Po}}_\infty(p_1) \cap B(y)$. We split this into three sub-domains: i) $y_2 \ge R - y$; ii) $R - y_1 \le y_2 \le R - y$; and iii) $y_2 < R - y_1$. Similarly to the analysis above we define the corresponding quantities. In the first case, note that for $y \in K_C(k_n)$ we have, for small enough $\varepsilon$ and sufficiently large $n$, $2y \le 2(1 + \varepsilon)\frac{R}{2\alpha+1} = o(R)$. Thus $y_1 + y_2 \ge 2(R - y) = \Omega(R)$, and thus $p_2 \in B(p_1)$ for large enough $n$. Furthermore, $y_2 > R - y_1 + 2\ln(\pi/2)$, which implies that $p_2 \in B_\infty(p_1)$ too. Hence, the contribution from these pairs is zero.
With these computations we obtain Thus, for 1/2 < α < 3/4, we have We now consider the second sub-domain D 2 . The Campbell-Mecke formula yields that: We bound the integral as follows:

Now, by Lemma 2.2,
We then integrate with respect to $y_1$: Therefore we get Similarly, for $\alpha > 3/4$ we have $2\alpha - 1 > 1/2$ and we get, provided that $\varepsilon$ is small enough, that $\lim_{n\to\infty} k_n^{2\alpha} \int_{K_C(k_n)} I^{(2)}_n(y)\, e^{-\alpha y}\,dy = 0$.
Due to symmetry, to bound the integral it is enough to integrate with respect to $x_1$ from $0$ to $I_n$. We will split this integral into two parts according to the value of $\hat y(x_1, y_1)$: $\int_0^{I_n} e^{\hat y(x_1,y_1)(1/2-\alpha)}\,dx_1 = \int_{c_K e^{y_1/2}}^{I_n} e^{\hat y(x_1,y_1)(1/2-\alpha)}\,dx_1 + \int_0^{c_K e^{y_1/2}} dx_1$.
To calculate the expectation of the above function we need to approximate the intersection of the two balls $\check B_{\mathrm{Po}}(y)$ and $\check B_{\mathrm{Po}}(p_1)$, where $p_1 = (x_1, y_1)$. Let us assume without loss of generality that $x_1 > 0$. The right boundary of $\check B_{\mathrm{Po}}(y)$ is given by the equation $x = x(y') = (1 + K)e^{(y+y')/2}$, whereas the left boundary of $\check B_{\mathrm{Po}}(p_1)$ is given by the curve $x = x(y') = x_1 - (1 + K)e^{(y_1+y')/2}$. The equation that determines the intersection point of the two curves is $(1 + K)e^{(y+\hat y)/2} = x_1 - (1 + K)e^{(y_1+\hat y)/2}$, where $\hat y$ is the $y$-coordinate of the intersection point. We can solve the above for $\hat y$. The above yields $\hat y > (y \wedge y_1) - 2\log(2(1 + K)) =: \hat y(y_1, y)$.
which, in turn, implies the following. We thus conclude that the bound in (120) holds. Recall that $B^{\mathrm{Po}}_\infty((0, y)) \cap \mathcal{R}([R - y + 2\log\frac{\pi}{2}, R]) = \emptyset$. We will first calculate the measures $\mu$ appearing in (120) and (121). The first one is: The second term is: Using these, we get Now, Lemma 2.2 implies that for any $y_1 \in [0, R - y + 2\ln\frac{\pi}{2}]$, we have Similarly, for (123) we have We thus conclude, using $2(2 - \alpha)y \le y$ for $\alpha > 3/2$, that where We proceed to calculate: Computing each of the integrals separately we obtain, using Lemma 9.3 and the fact that $n = \nu e^{R/2}$, the stated expressions. Now, we will consider the two cases according to the value of $\alpha$. First we note that $R = O(\log(n))$, and since $k_n = O(n^{\frac{1}{2\alpha+1}})$ and $\alpha > 1/2$, we have that $Rk_n^2 n^{-1} = o(1)$. Assume first that $1/2 < \alpha < 3/4$. In this case, we want to show that Using the above expression for $M_i$, we have We wish to show that each one of the above three terms is $o(1)$ for $k_n = O(n^{\frac{1}{2\alpha+1}})$. Finally, the third one yields: For $\alpha \ge 3/4$, we would like to show that Firstly, we note that each $M_i$ is as above if $3/4 < \alpha < 1$. Therefore, since for this range $2\alpha < 6\alpha - 3$, the result follows from the above analysis. Next we consider the case $1 \le \alpha < 3/2$. Here, only the value of $M_1$ changes, and we compute that so that (126) also holds for $1 \le \alpha < 3/2$.
Proceeding with the case $\alpha \ge 3/2$, it is only $M_2$ and $M_3$ that change values. In particular, for any $\alpha \ge 3/2$ we have $k_n^2 = o(n)$, since $k_n = o(n^{1/2})$, and hence (126) holds. This finishes the proof of (99).
The sum in (101). Using the Campbell-Mecke formula, we write Recall that $\mu(B(y)) = O(1)e^{y/2}$. We bound the integral using Lemma 2.2. In particular, (9) implies that if $p_1 = (x_1, y_1) \in B^{\mathrm{Po}}_\infty(y)$, then, because $y_1 < K$, Therefore, Now, we integrate this over $y$ to obtain that To finish the argument, assume first that $1/2 < \alpha \le 3/4$. In this case, For $3/4 < \alpha < 2$ we use that $2\alpha < 6\alpha - 3$, so that $k_n^{2\alpha} n^{-2} k_n^{4-2\alpha} = k_n^4 n^{-2} = o(1)$, which completes the proof for (101) and thus the proof of Proposition 6.3.

Coupling G n to G Po
Now that we have established the equivalence of the clustering function between the Poissonized KPKVB graph G Po and the finite box graph G box , the final step is to relate the clustering function in G Po to the KPKVB graph G n . As mentioned in Section 6.1, this is done by moving from c(k n ; G n ) to the adjusted clustering function c * (k n ; G n ) (Lemma 6.1) and then to c * (k n ; G Po ) (Proposition 6.2). For this we will use the coupling result (Lemma 5.12) from Section 5.5. We first give the proof of Proposition 6.2 and after that we prove Lemma 6.1. Recall that Proposition 6.2 states $\lim_{n\to\infty} s(k_n)^{-1}\,\mathbb{E}\big[|c^*(k_n; G_n) - c^*(k_n; G_{\mathrm{Po}})|\big] = 0$.
Proof of Proposition 6.2. First we note that Propositions 6.3, 6.4 and 6.5 together imply that Therefore it suffices to show that For this we observe that we are looking at the modified clustering coefficient, where we divide by the expected number of degree-$k_n$ vertices. As the expected numbers of degree-$k_n$ vertices in G Po and G n are asymptotically equivalent (see Lemma 5.12), it is therefore sufficient to consider the sum of the clustering coefficients of all vertices of degree $k_n$. Given again the standard coupling between the binomial and Poisson point process (as used in the proof of Lemma 5.12), we again denote by $V_n(k_n)$ the set of degree-$k_n$ vertices in G n and by $V_{\mathrm{Po}}(k_n)$ the set of degree-$k_n$ vertices in G Po . If a vertex is contained in both sets, it must have the same degree in both the Poisson and the KPKVB graph and, given the nature of the coupling, the neighbourhoods are therefore the same, and hence also their clustering coefficients agree. The difference of the sums of the clustering coefficients therefore comes from the clustering coefficients of the vertices in the symmetric difference $V_n(k_n) \,\triangle\, V_{\mathrm{Po}}(k_n)$. By Lemma 5.12 the expected number of vertices in this set is $\mathbb{E}[|N_n(k_n) - N_{\mathrm{Po}}(k_n)|] = o(\mathbb{E}[N_{\mathrm{Po}}(k_n)])$. Therefore we have that, which finishes the proof.
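The key quantitative input of the coupling is that the number of points of the Poisson process, $N \sim \mathrm{Po}(n)$, differs from $n$ by only $O_P(\sqrt{n}) = o(n)$ points, so the two coupled graphs share all but a vanishing fraction of their vertices. A quick stdlib-only sanity check of this order of magnitude (Po(n) sampled as a sum of n independent Po(1) variables; the seed is arbitrary):

```python
import math, random

def poisson1(rng: random.Random) -> int:
    """Sample a Po(1) random variable by Knuth's multiplication method."""
    L, k, p = math.exp(-1.0), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(42)
n = 10_000
N = sum(poisson1(rng) for _ in range(n))   # N ~ Po(n)
# The coupled graphs G_n (first n points) and G_Po (first N points) share
# min(n, N) vertices; they differ in |N - n| = O(sqrt(n)) vertices.
print(N, abs(N - n) / math.sqrt(n))
```

Only the vertices in the symmetric difference of the two point sets can contribute to the difference of the summed clustering coefficients, which is why the error term above is negligible after normalization.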

B Incomplete Beta function
Here we derive the asymptotic behavior of the function $B(1 - z; 2\alpha, 3 - 4\alpha)$ as $z \to 0$, which is used to analyze the asymptotic behavior of $P(y)$; see Section 3.3.

When $1/2 < \alpha < 3/4$, when $\alpha = 3/4$, and when $\alpha > 3/4$, the asymptotic behavior differs, as the following proof shows.
Proof. We use the hypergeometric representation of the incomplete Beta function, where $F$ denotes the hypergeometric function [37] (or see [32, Section 8.17(ii)]). The behavior of $F(a, b, c, 1 - z)$ as $z \to 0$ depends on the real part of $c - a - b$ and on whether $c = a + b$ [3] (or see [32, Section 15.4(ii)]). Since in our case $a, b, c$ will be real, it only depends on the sign of $c - a - b$. In our case we have $a := 2\alpha$, $b := 4\alpha - 2$ and $c := 2\alpha + 1$, so that $c - a - b = 3 - 4\alpha$. Therefore, where we used that $\Gamma(2\alpha + 1) = 2\alpha\,\Gamma(2\alpha)$. When $\alpha = 3/4$, then $c - a - b = 0$ and therefore (130), together with the fact that $(1 - z)^{3/2} \sim 1$ as $z \to 0$, implies the stated logarithmic behavior. Finally, when $\alpha > 3/4$, $c - a - b = 3 - 4\alpha < 0$ and using (131) we get the stated result.
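The regime $\alpha > 3/4$, where $B(1 - z; 2\alpha, 3 - 4\alpha) \sim C z^{3-4\alpha}$, can be checked numerically without any special-function library. For $\alpha = 1$ (an illustrative choice) the exponent is $3 - 4\alpha = -1$, so the ratio of the values at $z = 10^{-4}$ and $z = 10^{-3}$ should be close to $10$. The sketch below integrates after the substitution $1 - t = e^v$, which removes the boundary blow-up:

```python
import math

def inc_beta_upper(alpha: float, z: float, steps: int = 4000) -> float:
    """B(1-z; 2a, 3-4a) = int_0^{1-z} t^{2a-1} (1-t)^{2-4a} dt, computed
    after substituting 1 - t = e^v (v in [log z, 0]) and applying
    composite Simpson's rule.  'steps' must be even."""
    a, b = 2 * alpha, 3 - 4 * alpha
    # after substitution the integrand is (1 - e^v)^{2a-1} * e^{(3-4a) v}
    f = lambda v: (1 - math.exp(v)) ** (a - 1) * math.exp(b * v)
    lo, hi = math.log(z), 0.0
    h = (hi - lo) / steps
    s = f(lo) + f(hi)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * f(lo + i * h)
    return s * h / 3

alpha = 1.0
r = inc_beta_upper(alpha, 1e-4) / inc_beta_upper(alpha, 1e-3)
print(r)   # close to 10^(4*alpha - 3) = 10, confirming the z^{3-4a} scaling
```

For $1/2 < \alpha < 3/4$ the same routine instead converges to the complete Beta function $B(2\alpha, 3 - 4\alpha)$ as $z \to 0$, matching the first case of the lemma.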

D Some results for random variables
Here we summarize several known results for random variables and provide one technical lemma for Binomial random variables. We start with the following concentration result, which follows from [18, Theorem 4] together with the note directly after it.
Lemma D.1. Let $X_n$ be a sum of $n$, possibly dependent, indicator variables and let $c > 0$. Then the following holds. Next we recall two versions of the Chernoff bound for Poisson and Binomial random variables. They can be found in [34, Lemma 1.2]. Note that the Chernoff bound exists in many different versions; the original idea was developed by Chernoff in the context of the efficiency of statistical hypothesis testing [12].
It follows from the above lemma that, in particular, if $\lambda_n \to \infty$, then for any $C > 0$, $P\big(|\mathrm{Po}(\lambda_n) - \lambda_n| \ge C\sqrt{\lambda_n \log(\lambda_n)}\big) \le 2e^{-c(C)\log(\lambda_n)}$ for some constant $c(C) > 0$. Note that these are equations (13) and (14) from the main text. Let $\mathrm{Bin}(n, p)$ denote a Binomial random variable with $n$ trials and success probability $p$, and let $0 < \delta < 1$. Then we have the following well-known Chernoff bound: $P(|\mathrm{Bin}(n, p) - np| > \delta np) \le e^{-\frac{\delta^2 np}{3}}$. (132) The following lemma gives a standard comparison between the Binomial distribution with $p = \lambda/n$ and a Poisson distribution with mean $\lambda$. We provide a short proof for completeness. Lemma D.3. Let $n \ge 1$, $0 < \lambda < n$. Then, for any integer $0 \le k \le n - 1$, $P(\mathrm{Bin}(n, \lambda/n) = k) \le e\sqrt{2\pi}\,\frac{n}{n-k}\,P(\mathrm{Po}(\lambda) = k)$.
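Both the Chernoff bound (132) and the comparison of Lemma D.3 are easy to verify exactly (no simulation) for moderate parameter values; the constant $e\sqrt{2\pi}$ below is taken from the statement of Lemma D.3 as reconstructed above, and the parameter choices are illustrative:

```python
import math

def bin_pmf(n: int, p: float, k: int) -> float:
    """Binomial pmf via logs (safe for large n)."""
    return math.exp(math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1)
                    + k * math.log(p) + (n - k) * math.log(1 - p))

def pois_pmf(k: int, lam: float) -> float:
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

# Lemma D.3: P(Bin(n, lam/n) = k) <= e*sqrt(2*pi) * n/(n-k) * P(Po(lam) = k)
n, lam = 50, 10.0
ok_lemma = all(
    bin_pmf(n, lam / n, k)
    <= math.e * math.sqrt(2 * math.pi) * n / (n - k) * pois_pmf(k, lam)
    for k in range(n))

# Chernoff bound (132): P(|Bin(n,p) - np| > d*np) <= exp(-d^2 * n * p / 3)
n2, p2, d = 10_000, 0.1, 0.1
tail = sum(bin_pmf(n2, p2, k) for k in range(n2 + 1)
           if abs(k - n2 * p2) > d * n2 * p2)
bound = math.exp(-d * d * n2 * p2 / 3)
print(ok_lemma, tail, bound)   # exact tail is well below the Chernoff bound
```

The exact tail is roughly a factor 40 smaller than the Chernoff bound here, which reflects the slack in the constant $1/3$ in the exponent.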

E Concentration of heights for vertices with degree k
Here we will prove Proposition 2.4. We start by considering integration with respect to the function $\rho(y, k_n) = P(\mathrm{Po}(\mu(y)) = k_n)$ (the degree distribution of a typical point in $G_\infty$). Here we show that we may restrict integration with respect to the height $y$ to the interval $K_C(k_n) = [y^-_{k_n,C},\, y^+_{k_n,C}]$ on which $\mu(y) = \Theta(k_n)$. Next we show that if we consider any other measure $\tilde\mu_n(y)$ that is sufficiently equivalent to $\mu(y)$ on this interval (which will be made precise later), then we may replace $\tilde\rho_n(y, k_n) := P(\mathrm{Po}(\tilde\mu_n(y)) = k_n)$ in integrals with $\rho(y, k_n)$. This then implies that we can also restrict integration to the interval $K_C(k_n)$. We will refer to such results as concentration of heights results.
We start with a concentration of heights result for the infinite model $G_\infty$ (Lemma E.1). We then present a generalization of this result and use it to establish concentration of heights results for the Poissonized KPKVB model G Po and the finite box model G box .
Finally we provide a general result that allows us to substitute $\tilde\rho_n(y, k_n)$ in the integrand with $\rho(y, k_n)$, and show that this holds in particular for the degree distributions in G Po and G box , given by, respectively, $\rho_{\mathrm{Po}}(y, k_n) := P(\mathrm{Po}(\mu_{\mathrm{Po}}(y)) = k_n)$ and $\rho_{\mathrm{box}}(y, k_n) := P(\mathrm{Po}(\mu_{\mathrm{box}}(y)) = k_n)$.

E.1 Concentration of heights argument for the infinite model
The next lemma states that for a large class of functions $h(y)$ and $k_n \to \infty$, to compute the integral $\int_0^\infty \rho(y, k_n)\,h(y)\,e^{-\alpha y}\,dy$ it is enough to consider integration over a small interval on which $e^{y/2} \approx k_n$, instead of all of $\mathbb{R}_+$.
For any continuous function h : R + → R, such that h(y) = O e βy as y → ∞ for some β < α, as n → ∞.
Apply the Chernoff bound (14). Note that we can tune the error in (133) by selecting an appropriately large $C > 0$, i.e. by restricting the function $h(y)$ inside the integral to an appropriate interval around $2\log(k_n/\xi)$. This makes Lemma E.1 very powerful. As an example we give the following corollary, which allows us to bound integrals of functions $h_n(y)$ by considering their maximum on $K_C(k_n)$.
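As a numerical illustration of this concentration of heights phenomenon, take $\xi = 1$, $\alpha = 0.8$, $h \equiv 1$ and $k_n = 100$ (illustrative values only): virtually all of the mass of $\int_0^\infty \rho(y, k_n)\,e^{-\alpha y}\,dy$ already lies in the interval $[2\log(k_n/\xi) - C,\, 2\log(k_n/\xi) + C]$ for $C = 2$.

```python
import math

xi, alpha, k = 1.0, 0.8, 100

def integrand(y: float) -> float:
    """rho(y, k) * e^{-alpha y} with rho(y, k) = P(Po(xi * e^{y/2}) = k)."""
    lam = xi * math.exp(y / 2)
    log_pois = k * math.log(lam) - lam - math.lgamma(k + 1)
    return math.exp(log_pois - alpha * y)

def integrate(lo: float, hi: float, steps: int = 20000) -> float:
    """Plain composite trapezoid rule."""
    h = (hi - lo) / steps
    s = 0.5 * (integrand(lo) + integrand(hi))
    s += sum(integrand(lo + i * h) for i in range(1, steps))
    return s * h

y_star = 2 * math.log(k / xi)        # height where mu(y) = xi e^{y/2} = k
C = 2.0
total = integrate(0.0, 40.0)         # effectively the integral over [0, inf)
inside = integrate(y_star - C, y_star + C)
print(inside / total)                # essentially 1: the mass concentrates
```

The Poisson factor decays super-exponentially once $\mu(y)$ leaves the window $\Theta(k_n)$, which is exactly why restricting to $K_C(k_n)$ costs only a tunable error, as stated in (133).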

E.2 Concentration of heights for the KPKVB and finite box model
Although powerful, the above version of the concentration of heights argument is only valid for the function $\rho(y, k_n) := P(\mathrm{Po}(\mu(B_\infty(y))) = k_n)$. We want to extend it to the Poissonized KPKVB model G Po and the finite box model G box . To be more precise, recall that $\mu_{\mathrm{Po}}(y) = \mu(B(y))$ and $\mu_{\mathrm{box}}(y) = \mu(B_{\mathrm{box}}(y))$, and let us define $\rho_{\mathrm{Po}}(y, k) = P(\mathrm{Po}(\mu_{\mathrm{Po}}(y)) = k)$ and $\rho_{\mathrm{box}}(y, k) = P(\mathrm{Po}(\mu_{\mathrm{box}}(y)) = k)$.
We then want Lemma E.1 to remain true if we replace $\rho(y, k_n)$ with either the function $\rho_{\mathrm{Po}}(y, k_n)$ or $\rho_{\mathrm{box}}(y, k_n)$. To establish this result we first prove the following technical lemma.
The conclusion of Lemma E.3 is that as long as $\mu_{\mathrm{Po}}(y)$ and $\mu_{\mathrm{box}}(y)$ are $(1 + o(1))\mu(y)$, uniformly on $[0, (1 - \varepsilon)R]$, then indeed the concentration of heights result (Lemma E.1) also holds in both G Po and G box . This was proven in Lemma 5.1 and Lemma 5.2, respectively. For completeness we give the proof of Proposition 2.4.

F Derivative of µ P o (y)
Recall that $\mu_{\mathrm{Po}}(y) = \mu(B(y))$ denotes the measure of the ball at height $y$ in the KPKVB model and $\mu(y) = \xi e^{y/2}$ denotes the measure of a ball at height $y$ in the infinite model $G_\infty$. In this section we will show that $\mu_{\mathrm{Po}}'(y) = (1 + o(1))\mu'(y)$, uniformly on $[0, (1 - \varepsilon)R]$, for some $0 < \varepsilon < 1$. This is a technical result that is needed in the proof of Lemma 5.3 in Section 5.2.
We proceed by showing the following. For $I_3(y)$ we first use that $y' < R - y$ to bound $\varphi_n(y, y')$ as follows: $\varphi_n(y, y') \le \frac{e^{y'-y-R}}{1 - e^{y'-y-R}} \le \frac{e^{-2y}}{1 - e^{-2y}}$.

G Code for the simulations
The simulations of the clustering coefficient and clustering function in the KPKVB model were done using Wolfram Mathematica 11.1. The simulation dots for the clustering coefficient in Figure 2 were generated by a script that accumulates the mean local clustering of each sampled graph via sum=sum+MeanClustering (where, in the second line, the entire script was also run for the values nu=1 and nu=0.5). The simulation dots for the clustering function in Figure 3 were generated by a similar script (where, in the third line, the entire script was also run for the values nu=1 and nu=0.5):
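For readers who prefer an open-source alternative to Mathematica, the same experiment can be sketched in Python. This is an illustrative re-implementation from the model's definition, not the original script; the parameter values $n = 500$, $\alpha = 0.8$, $\nu = 1$ and the seed are arbitrary:

```python
import math, random

def kpkvb_graph(n: int, alpha: float, nu: float, rng: random.Random):
    """Sample the KPKVB model: n points in the hyperbolic disk of radius
    R = 2 log(n/nu), radial density alpha*sinh(alpha*r)/(cosh(alpha*R)-1)
    (sampled by inverting the CDF), uniform angles; two points are joined
    if their hyperbolic distance is at most R."""
    R = 2 * math.log(n / nu)
    pts = []
    for _ in range(n):
        u = rng.random()
        r = math.acosh(1 + u * (math.cosh(alpha * R) - 1)) / alpha
        pts.append((r, rng.uniform(0, 2 * math.pi)))
    adj = [set() for _ in range(n)]
    for i in range(n):
        r1, t1 = pts[i]
        for j in range(i + 1, n):
            r2, t2 = pts[j]
            cosh_d = (math.cosh(r1) * math.cosh(r2)
                      - math.sinh(r1) * math.sinh(r2) * math.cos(t1 - t2))
            if cosh_d <= math.cosh(R):
                adj[i].add(j); adj[j].add(i)
    return adj

def mean_clustering(adj) -> float:
    """Average local clustering coefficient over vertices of degree >= 2."""
    vals = []
    for i, nb in enumerate(adj):
        d = len(nb)
        if d < 2:
            continue
        links = sum(1 for u in nb for v in nb if u < v and v in adj[u])
        vals.append(2 * links / (d * (d - 1)))
    return sum(vals) / len(vals) if vals else 0.0

rng = random.Random(1)
c = mean_clustering(kpkvb_graph(500, 0.8, 1.0, rng))
print(c)
```

Averaging the printed value over many runs and increasing $n$ reproduces the qualitative behaviour of the simulation dots in Figure 2: a clustering coefficient bounded away from zero.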