The interchange process with reversals on the complete graph

We consider an extension of the interchange process on the complete graph, in which a fraction of the transpositions are replaced by `reversals'. The model is motivated by statistical physics, where it plays a role in stochastic representations of $XXZ$-models. We prove convergence to PD($\tfrac12$) of the rescaled cycle sizes, above the critical point for the appearance of macroscopic cycles. This extends a result of Schramm on convergence to PD(1) for the usual interchange process.


1. Introduction
Recent years have seen a growing interest in the cycle structure of large random permutations. A major example is the interchange process, or random-transposition random walk. One motivation for studying this process is that it plays a key role in a stochastic representation of one of the most important quantum spin systems, the ferromagnetic Heisenberg model. This representation was developed by Tóth in the early 1990's [23] (after an earlier observation by Powers [20]).
At about the same time, a closely related stochastic representation was discovered for the anti-ferromagnetic Heisenberg model, by Aizenman and Nachtergaele [2]. Very roughly speaking, in the ferromagnetic Heisenberg model the interaction between neighbouring electrons behaves like a transposition of the spins. In the antiferromagnetic model the interaction involves a 'reversal', which Aizenman and Nachtergaele depicted as on the right in Figure 1.1. In both cases, the stochastic representation of the spin system involves randomly placing these objects in the product of the graph with an interval. In the case of the ferromagnetic model, the relevant measure has transpositions appearing randomly at each edge, in the manner of independent Poisson processes. For the antiferromagnetic model, the structure is the same except that the transpositions are replaced by 'reversals' as on the right in Figure 1.1. Many quantities of interest for the spin systems, such as correlation functions, may be expressed using expected values of suitable random variables in these processes.
Recently, Ueltschi [25] explained that weighted combinations of the two processes described above also lead to representations of certain quantum spin systems (known as $XXZ$-models). The relevant measure has independent Poisson processes on the edges as before, but the objects are now randomly chosen to be either transpositions or 'reversals', independently over the points of the process and with some fixed probability (see Figure 1.2). In this paper we study such a process defined on the complete graph. Our main result is that the correlation structure in this model, above a critical point, is described by a probability distribution on random partitions called the Poisson-Dirichlet distribution with parameter $\tfrac12$. To state our results more precisely, let us give the relevant definitions.
1.1. Definitions. We consider the complete graph $K_n = (V_n, E_n)$ on $n \ge 2$ vertices. The vertex set is $V_n = \{1, 2, \ldots, n\}$ and the edge set consists of all pairs $\{i, j\}$ of vertices $i \neq j$. To each edge and vertex we attach a circle of circumference 1, which we denote by $S^1$. We will sometimes identify $S^1$ with the unit interval $[0, 1)$. A configuration $\omega$ is a finite subset of $E_n \times S^1 \times \{\text{cross}, \text{bar}\}$, where the cross and the bar are the two possible marks. The collection of configurations is denoted $\Omega$. An element $(e, \varphi, m) \in \omega$ of the configuration is called a link, and if $(e, \varphi, m)$ is a link then we say that $\omega$ has a link at $(e, \varphi) \in E_n \times S^1$.
We will primarily be interested in configurations obtained as samples of a (marked) Poisson point process defined in the following way. Fix $\nu \in [0, 1)$ and $\beta > 0$. For each edge $e \in E_n$ we consider a Poisson point process with intensity $\frac{\beta}{n-1}$ on $\{e\} \times S^1$, these Poisson processes being independent for different edges $e$. This defines a configuration of unmarked links. The configuration $\omega$ is then obtained by assigning to each link a mark, independently of all other links, which is either a cross, with probability $\nu$, or a bar, with probability $1 - \nu$. The probability measure corresponding to this point process will be denoted by $P_\beta$ (we consider $\nu$ to be fixed and it will be suppressed in the notation), and the corresponding expectation will be denoted by $E_\beta$. We will refer to this process as the interchange process with reversals (the usual interchange process would correspond to taking $\nu = 1$).
Such a configuration $\omega$ gives rise to a set of loops $\gamma \subseteq V_n \times S^1$. We first give an informal description and then a more precise definition. For a fixed point $(v, \varphi) \in V_n \times S^1$ the unique loop, $\gamma(v, \varphi)$, containing it is constructed by the following process. Starting from $(v, \varphi)$ we move on the associated circle in the positive direction, i.e. after time $dt$ we are at the point $(v, \varphi + dt)$. If we encounter a point $(v, \varphi')$ such that $\omega$ has a link at $(\{v, w\}, \varphi')$ for some $w \in V_n$, then we traverse this link to $(w, \varphi') \in V_n \times S^1$. We then continue moving in the positive direction if the link was a cross, or in the negative direction if the link was a bar. Each time we encounter a link we follow this rule of traversing the link, reversing our direction if the link was a bar. Continuing until we arrive back at $(v, \varphi)$ we have traced out a single loop, $\gamma(v, \varphi)$.
More formally, following [10,25] we may define the loops as follows. A loop of length $L$ is a function $\gamma : [0, L) \to V_n \times S^1$ such that, writing $\gamma(t) = (v(t), \varphi(t))$, the following properties hold: (1) $\gamma$ is injective and satisfies $\lim_{t \uparrow L} \gamma(t) = \gamma(0)$. (2) $\varphi$ moves at unit speed, i.e. $\frac{d}{dt}\varphi(t) = \pm 1$ wherever $v(\cdot)$ is continuous. (3) $v(\cdot)$ is piecewise constant, and is discontinuous at $t$ only if $\omega$ has a link at $(\{v(t-), y\}, \varphi(t))$ for some $y \neq v(t-)$, in which case $v(t+) = y$. (4) If $I_1 = (t_1, t_2)$ and $I_2 = (t_2, t_3)$ with $\gamma$ continuous on $I_1$ and $I_2$ but discontinuous at $t_2$, then for any $s_1 \in I_1$ and $s_2 \in I_2$ we have $\frac{d}{dt} \varphi(s_1) = \frac{d}{dt} \varphi(s_2)$ if the link at $(\{v(t_2-), v(t_2+)\}, \varphi(t_2))$ is a cross, and $\frac{d}{dt} \varphi(s_1) = -\frac{d}{dt} \varphi(s_2)$ if it is a bar. Loops with the same support but different parameterisations are identified: the functions $\gamma(t)$, $\gamma(-t)$ and, for $s \in \mathbb{R}$, $\gamma(s \pm t)$ are all identified. From this description we can give $\omega$ a natural pictorial representation, see Figure 1.2, in which there is one small loop which does not give rise to a cycle.
A cycle is a sequence of vertices $v \in V_n$ such that the points $(v, 0)$ are visited by a single loop. Namely, suppose we start at a point $(v_1, 0)$ and follow the loop $\gamma = \gamma(v_1, 0)$, in either direction, until we return to the starting point. If we enumerate the successive visits to $V_n \times \{0\}$ as $(v_1, 0), \ldots, (v_\ell, 0), (v_1, 0)$ then the corresponding cycle is $C = (v_1^{d_1}, v_2^{d_2}, \ldots, v_\ell^{d_\ell})$. Here the cycle $C$ has length $|C| = \ell$, and the $d_i \in \{\uparrow, \downarrow\}$ denote the direction in which we pass through the point $(v_i, 0) \in V_n \times S^1$, with $\uparrow$ corresponding to the positive direction ($\frac{d}{dt}\varphi(t) = +1$) and $\downarrow$ corresponding to the negative direction ($\frac{d}{dt}\varphi(t) = -1$). Note that the directions $d_i$ in a cycle are defined up to an overall reversal and that we made an arbitrary choice of the first vertex $v_1$. It is also worth noting that not every loop gives rise to a cycle, see Figure 1.2. A fixed configuration of links, $\omega$, has an associated set of cycles which we denote by $C_\omega$.

1.2. Main result.
Let $\omega$ be sampled from the measure $P_\beta$. Consider the random graph where an edge is present between vertices $u$ and $v$ if there is at least one link on $\{u, v\} \times S^1$ in $\omega$. By the Erdős-Rényi theorem, if $\beta > 1$ then the largest connected component of this graph, $V^\beta_G$, has size approximately $zn$, where $z$ is the positive solution to $1 - z = e^{-\beta z}$. (If $\beta < 1$ the largest component has size smaller than $(\log n)^2$, and the same holds for the largest cycle.) Let $X_\omega$ denote the list $(|C| / |V^\beta_G| : C \in C_\omega)$ of rescaled cycle sizes, ordered by decreasing size (we make it into an infinite list by appending infinitely many 0's). Our main result is the following.
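As a quick numerical illustration (ours, not part of the paper), the giant-component fraction $z$ above can be computed by fixed-point iteration; the function name is our own.

```python
import math

def giant_fraction(beta, iters=200):
    """Positive solution z of 1 - z = exp(-beta*z), beta > 1: the
    asymptotic fraction of vertices in the giant component."""
    # The map z -> 1 - exp(-beta*z) has derivative beta*exp(-beta*z),
    # which is < 1 at the positive root when beta > 1, so iterating
    # from z = 1/2 converges to that root (0 is the trivial root).
    z = 0.5
    for _ in range(iters):
        z = 1.0 - math.exp(-beta * z)
    return z
```

For example, `giant_fraction(2.0)` is approximately $0.797$, so for $\beta = 2$ roughly $80\%$ of the vertices lie in the giant component.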
Theorem 1.1. Let $\beta > 1$ and $\nu \in [0, 1)$. Let $\omega$ be sampled from the corresponding measure $P_\beta$. As $n \to \infty$ the law of $X_\omega$ converges weakly to the Poisson-Dirichlet distribution PD($\tfrac12$). More precisely, we will show that for given $\beta > 1$ and $\varepsilon > 0$ there exists $n(\beta, \varepsilon)$ such that for $n > n(\beta, \varepsilon)$ there is a coupling of the interchange process with reversals with a PD($\tfrac12$) sample $Y$ such that $X_\omega$ and $Y$ are within $\varepsilon$ of each other with probability at least $1 - \varepsilon$. Note that this result holds for any $\nu < 1$.
The Poisson-Dirichlet distribution with parameter $\theta > 0$, PD($\theta$), can be defined via the 'stick-breaking' construction as follows. Let $B_1, B_2, \ldots$ be independent Beta($1, \theta$) random variables, thus $P(B_i > s) = (1 - s)^\theta$ for $s \in [0, 1]$. We construct a random partition $\{P_i\}_{i \in \mathbb{N}}$ of $[0, 1)$ using the $B_i$ by letting $P_1 = B_1$ and $P_{k+1} = B_{k+1}(1 - P_1 - \cdots - P_k)$. We can think of constructing $\{P_i\}_{i \in \mathbb{N}}$ by progressively breaking off pieces of $[0, 1)$, with the $(k+1)$th removed piece being a fraction $B_{k+1}$ of what remained after $k$ pieces had been removed. The law of the partition $\{P_i\}_{i \in \mathbb{N}}$ is called the GEM($\theta$) distribution. The PD($\theta$) distribution is obtained by sorting the $P_i$ in order of decreasing size.
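The stick-breaking construction translates directly into code. Below is a minimal sketch (function names are ours), using the inverse-transform identity $B = 1 - U^{1/\theta}$ for a Beta($1, \theta$) variable, which follows from $P(B > s) = (1 - s)^\theta$.

```python
import random

def gem_sample(theta, k, rng=random):
    """First k pieces of a GEM(theta) stick-breaking partition of [0,1)."""
    pieces, remaining = [], 1.0
    for _ in range(k):
        # B ~ Beta(1, theta) via inverse transform: B = 1 - U^(1/theta),
        # since P(B > s) = P(U < (1-s)^theta) = (1-s)^theta.
        b = 1.0 - rng.random() ** (1.0 / theta)
        pieces.append(b * remaining)
        remaining *= 1.0 - b
    return pieces

def pd_sample(theta, k, rng=random):
    """Approximate PD(theta) sample: the GEM pieces sorted decreasingly."""
    return sorted(gem_sample(theta, k, rng), reverse=True)
```

Truncating at $k$ pieces leaves an unallocated remainder of mass $\prod_{i \le k}(1 - B_i)$, which is exponentially small in $k$.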
Returning to the context of Theorem 1.1, let us comment on the case $\nu = 1$ (only crosses allowed), which is excluded by our result. This is the (usual) interchange process, and was considered on the complete graph in a famous paper by Schramm [21]. To be precise, he considered the closely related process where the configuration $\omega$ is obtained by placing the crosses successively one after the other, uniformly and independently at each step. Viewing the crosses as transpositions, as above, the process is a random-transposition random walk on the set of permutations of $n$ objects. In this case Schramm proved that, when the number of transpositions exceeds $cn$ for $c > \tfrac12$, the rescaled cycle sizes of the resulting random permutation converge in distribution to PD(1).
The main tool in Schramm's argument was a coupling with a split-merge process which has PD(1) as an invariant distribution. Roughly speaking, the important feature is what happens to an existing cycle when a uniformly chosen transposition is applied. If the transposition transposes two points which belonged to different cycles then those cycles merge; if they belonged to the same cycle then the cycle is split. A similar principle applies to the loops, which on the addition of another cross to $\omega$ either merge, if the ends are in different loops, or split, if both ends are in the same loop (Figure 1.3). Now we may explain how the case $\nu < 1$ is different from $\nu = 1$, and why we get PD($\tfrac12$) rather than PD(1). The key point is that the presence of bars introduces changes of orientation within the loops. This means that on adding a link (cross or bar) with both endpoints in the same loop, this loop will not always split. Whether or not the loop splits depends on the orientation of the loop at the points where the new link is placed. Specifically, if the link is a cross then a split occurs if and only if the orientation is the same; if it is a bar then the opposite applies. The situation is depicted in Figure 1.3. Intuitively, when $\nu < 1$ one would expect large loops to encounter many bars. Hence a uniformly chosen pair of points on a large loop (with the same $S^1$-coordinate) should have probability close to $\tfrac12$ of having the same orientation, meaning that the probability of splitting is close to $\tfrac12$. The corresponding split-merge dynamics, where proposed splits occur with probability $\tfrac12$, has PD($\tfrac12$) as its invariant distribution.

1.3. Outline and related works. In order to prove Theorem 1.1 we need three ingredients. Firstly, we need that with high probability (converging to 1 as $n \to \infty$) there are cycles of size $\Theta(n)$, and that these large cycles occupy almost the entire giant component $V^\beta_G$.
This is proved in Section 2 by a straightforward adaptation of arguments in [21]. Secondly, we need that in large cycles roughly half of all vertices are passed through in the positive direction ($\uparrow$) and roughly half in the negative direction ($\downarrow$). In fact, we need the stronger statement that the large cycles are 'well-balanced', namely: one may partition them into much smaller segments such that each segment consists of roughly half $\uparrow$ and half $\downarrow$ (ruling out, for example, a situation in which a cycle of size $k$ consists of a block of $k/2$ vertices passed in direction $\uparrow$ followed by a block of $k/2$ vertices with direction $\downarrow$). This is the main novel contribution of the present paper, and is the content of Section 4. In proving this result we rely on a process which we call the exploration process, which we study in Section 3. Thirdly, we show that Schramm's coupling, when combined with the previous two ingredients, can be adapted to couple a PD($\tfrac12$) sample with $X_\omega$ such that the two samples are close. This appears in Section 5.
We now briefly summarise some other related works apart from Schramm's paper [21]. First note that interchange processes, with reversals (ν < 1) or without (ν = 1), can be defined on more general graphs, by placing independent Poisson processes of links on the edges of the graph. Most papers dealing with graphs other than the complete graph have studied the question of whether there can be large cycles. The case when the graph is a hypercube, and ν = 1, has been investigated by Kotecký, Miłoś and Ueltschi [15]. The case of Hamming graphs, also for ν = 1, has been investigated by Miłoś and Şengül [17] and by Adamczak, Kotowski and Miłoś [1]. In the case when the graph is an infinite tree one may ask about the occurrence of infinite cycles. For ν = 1 this question was investigated by Angel [3] and by Hammond [12,13]; and for ν < 1 by Björnberg and Ueltschi [8] as well as Hammond and Hegde [14].
As mentioned above, the original interest in the process was due to its connections with quantum spin systems. When the measure $P_\beta$ defining the process is given an additional weighting of $\vartheta^{\#\text{loops}}$ for $\vartheta \in \mathbb{N}$, the loop model is essentially equivalent to a spin system on the same graph. This was first proved by Tóth [23] in the case $\nu = 1$ (spin-$\tfrac12$ Heisenberg ferromagnet for $\vartheta = 2$) and by Aizenman and Nachtergaele [2] in the case $\nu = 0$ (Heisenberg antiferromagnet, provided the graph is bipartite). This connection was extended to the case $\nu \in [0, 1]$ by Ueltschi [25]. From a probabilistic point of view, any $\vartheta > 0$ makes sense. Such models have been considered on trees [9,5] and on the Hamming graph [1]. In very recent work there has been some limited progress in the direction of establishing Poisson-Dirichlet structure in these and related loop models [4,7]. For the Heisenberg model ($\nu = 1$ and $\vartheta = 2$) on the complete graph, the critical point for the appearance of cycles of diverging length was established already in the early 1990's by Tóth and by Penrose [18,22].

Notation
$V_n = \{1, 2, \ldots, n\}$ — vertex set of the complete graph; typical elements denoted $u, v, w, \ldots$
$E_n = \binom{V_n}{2}$ — edge set of the complete graph
$m \in \{\text{cross}, \text{bar}\}$ — the 'mark' of a link
$\varphi \in S^1$ — 'phase' or vertical coordinate
$\Omega$ — space of configurations, i.e. finite subsets of $E_n \times S^1 \times \{\text{cross}, \text{bar}\}$
$\omega, \omega_A$ — element of $\Omega$, its restriction to $A \subseteq E_n \times S^1$
$\bar\omega, \bar\omega_k$ — ordered sequence of links in $\omega$, its first $k$ elements
$\beta > 0$ — intensity parameter
$P_\beta$ — measure of the Poisson link process with intensity $\frac{\beta}{n-1}$ per edge
$\nu \in [0, 1)$ — probability of marking a link as a cross
$C_\omega$ — set of cycles defined by $\omega$
$X_\omega$ — list of rescaled cycle sizes
$V^\beta_G$ — the largest cluster in the random graph whose edges support links
$\{F_t\}_{t \ge 0}$ — natural filtration of the exploration process
$I$ — set of the next $k$ vertices in a cycle starting from $v_c$
PD($\theta$) — Poisson-Dirichlet distribution with parameter $\theta$
$Y, Z$ — samples from PD($\theta$)

2. Large cycles
In this section we show that, for β > 1, with high probability there are cycles of length of order n. The precise statement appears in Lemma 2.4 at the end of the section. The argument is a minor adaptation of Schramm's [21, Section 2].
As in [21], we will actually work not with configurations $\omega$ sampled from the Poisson measure $P_\beta$, but instead with configurations constructed sequentially, one link at a time. Given any configuration $\omega \in \Omega$, note that the set of cycles $C_\omega$ depends only on the relative order of the links of $\omega$, as well as their position relative to $0 \in S^1$, but not on their precise $S^1$-coordinates. Given $\omega \in \Omega$, let us order its elements (links) with respect to the $S^1$-coordinate, namely we write $\omega = \{(e_1, \varphi_1, m_1), \ldots, (e_{|\omega|}, \varphi_{|\omega|}, m_{|\omega|})\}$, with $0 < \varphi_1 < \ldots < \varphi_{|\omega|} < 1$ (we may assume that no two distinct links share an $S^1$-coordinate, since under $P_\beta$ this holds with probability 1). We denote by $\bar\omega = ((e_1, m_1), \ldots, (e_{|\omega|}, m_{|\omega|}))$ the ordered list of links with $S^1$-coordinates suppressed. With a slight abuse of terminology we will also refer to the entries $(e_i, m_i)$ of $\bar\omega$ as links. As noted above, $C_\omega$ is a function of $\bar\omega$ only, hence we may write $C_{\bar\omega}$.
In the rest of this section we will work with a random $\bar\omega$ obtained by sequentially laying down a fixed number $t$ of random links. More precisely, first let $e_1$ be chosen uniformly from the edge set $E_n$ and let $m_1 \in \{\text{cross}, \text{bar}\}$ be chosen independently of $e_1$, with probability $\nu$ for the cross. Next, given the first $s$ links $(e_1, m_1), \ldots, (e_s, m_s)$, we select $e_{s+1}$ uniformly from $E_n$ and the mark $m_{s+1} \in \{\text{cross}, \text{bar}\}$, with probability $\nu$ for the cross, independently of each other and of the previous choices. Write $\bar\omega_s = ((e_1, m_1), \ldots, (e_s, m_s))$ and let $C_s := C_{\bar\omega_s}$ denote the set of cycles after $s \le t$ steps. Note that, if $t$ is taken to be Poisson-distributed with mean $\frac{\beta}{n-1}\binom{n}{2} = \frac{\beta n}{2}$, then $C_t$ is equal in distribution to $C_\omega$ for $\omega$ sampled from $P_\beta$. Due to concentration properties of the Poisson distribution, there is very little difference between $C_t$ for $t = \frac{\beta n}{2}$ on the one hand, and $C_\omega$ for $\omega$ sampled from $P_\beta$ on the other. We will not make this statement more precise at this point, deferring this to later (see Section 5.3).
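As an illustrative sanity check (ours, not from the paper): for $\nu = 1$ every link is a cross, and the sequential construction is simply a product of $t$ uniform random transpositions, whose cycle structure is easy to simulate.

```python
import random

def random_transposition_cycles(n, t, rng):
    """Cycle lengths of a product of t uniform random transpositions
    on {0, ..., n-1} (the nu = 1 case, as in Schramm's setting)."""
    perm = list(range(n))
    for _ in range(t):
        i, j = rng.sample(range(n), 2)       # two distinct points
        # composing with the transposition (i j) swaps the images
        perm[i], perm[j] = perm[j], perm[i]
    # extract the cycle lengths of the resulting permutation
    seen, lengths = [False] * n, []
    for v in range(n):
        if not seen[v]:
            length, u = 0, v
            while not seen[u]:
                seen[u] = True
                u = perm[u]
                length += 1
            lengths.append(length)
    return sorted(lengths, reverse=True)
```

For $t = cn$ with $c > \tfrac12$ one typically observes macroscopic cycles, in line with Schramm's result. A built-in consistency check: each transposition changes the number of cycles by exactly one, so $n - \#\text{cycles}$ always has the same parity as $t$.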
We now describe in detail the effect that appending the next link $(e_{s+1}, m_{s+1})$ to $\bar\omega_s$ has on the cycles, that is, the transition $C_s \to C_{s+1}$. For $d \in \{\uparrow, \downarrow\}$ we write $-d$ for the reversed arrow. We have the following: • If the endpoints of $e_{s+1}$ are in different cycles of $C_s$ then those cycles merge.
• If the endpoints are in the same cycle $C$ then the result depends on the mark $m_{s+1}$ in the following way. Let us assume that $C = (v_1^{d_1}, v_2^{d_2}, \ldots, v_\ell^{d_\ell})$ and that $e_{s+1} = \{v_i, v_j\}$ where $i < j$. Without loss of generality (since directions are defined up to an overall reversal) we may assume that $d_i = \uparrow$. — If $m_{s+1}$ is a cross then $C$ splits if and only if $d_j = \uparrow$; in this case the two resulting cycles $C'$ and $C''$ are formed from the two segments of $C$ between $v_i$ and $v_j$. If $d_j = \downarrow$ then $C$ is not split, but is modified by reversing one of these segments (both its order and its directions). — If $m_{s+1}$ is a bar then $C$ splits if and only if $d_j = \downarrow$; in this case the two resulting cycles $C'$ and $C''$ are again formed from the two segments of $C$ between $v_i$ and $v_j$, one of them reversed. On the other hand, if $d_j = \uparrow$ then $C$ is not split but modified into a cycle $\tilde C$ with one segment reversed. Note that the edge $e_{s+1}$ may be selected by first choosing $v_i$ uniformly from $V_n$ and then $v_j$ uniformly from $V_n \setminus \{v_i\}$. In particular we see that, just as in [21, Lemma 2.1], we have: Lemma 2.1. In the step from $\bar\omega_s$ to $\bar\omega_{s+1}$, the probability that some cycle is split into two cycles, with at least one containing at most $k$ vertices, is at most $2k/(n-1)$.
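The split criterion just described is easy to test numerically. The sketch below (ours; it implements only the split/no-split rule, not the full cycle bookkeeping) also estimates the split frequency for a uniformly placed link on a well-balanced cycle, which is close to $\tfrac12$ as discussed in the introduction.

```python
import random

def splits(dirs, i, j, mark):
    """Whether a link with the given mark at positions i < j of a cycle
    splits it; dirs[k] in {+1, -1} is the direction at position k.
    A cross splits iff the two directions agree; a bar iff they differ."""
    same = dirs[i] == dirs[j]
    return same if mark == "cross" else not same

def split_frequency(dirs, mark, trials, rng):
    """Empirical split probability for a uniform pair of positions."""
    hits = 0
    for _ in range(trials):
        i, j = sorted(rng.sample(range(len(dirs)), 2))
        hits += splits(dirs, i, j, mark)
    return hits / trials
```

On a balanced cycle with $k$ ups and $k$ downs, the exact probability that two uniform positions carry the same direction is $(k-1)/(2k-1) \to \tfrac12$, so for either mark the split frequency tends to $\tfrac12$.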
Building on this, and replicating the arguments of Schramm [21], we obtain the following sequence of lemmas. Lemma 2.2 is proved exactly as in [21]. The only features of the dynamics shared with [21] that we use are, firstly, that if the endpoints of $e_{s+1}$ are in different cycles then those cycles always merge, and, secondly, that the cycle may or may not split if the endpoints are in the same cycle. This means that both the upper bounds on the probability of splitting, as well as the lower bounds on the probability of merging, are identical to [21]. This is all that we need.
In the following statements we consider a random graph $G_s$ with vertex set $V_n$, obtained by placing an edge between a pair $\{i, j\}$, $i \neq j$, if in $\bar\omega_s$ there is at least one link $(e_r, m_r)$, $r \le s$, such that $e_r = \{i, j\}$. We write $V^s_G(k)$ for the set of vertices in connected components of $G_s$ containing at least $k$ vertices. Similarly, we write $V^s_C(k)$ for the set of vertices belonging to cycles of $C_s$ which are of length at least $k$.
Assume that the following conditions hold in the transition from $\bar\omega_s$ to $\bar\omega_{s+1}$: (1) there exists $c_1 > 0$ such that for any $s, k \in \mathbb{N}$, the probability that some cycle is split into two cycles, with at least one containing at most $k$ vertices, is at most $c_1 k / n$; (2) there exists $c_2 > 0$ such that for any $s \in \mathbb{N}$ and any two cycles $C', C'' \in C_s$, the probability that $C'$ and $C''$ merge is at least $c_2 |C'| |C''| / n^2$. Then there exist $c_3, c_4 > 0$, depending only on $c_1, c_2$, such that the conclusion of [1, Lemma 5.1] holds. Note that in our case condition (1) is satisfied because of Lemma 2.1, and condition (2) is trivially satisfied. In the notation of [1, Lemma 5.1] this corresponds to taking the stopping time $\tau = +\infty$, i.e. conditions (1) and (2) hold for all times $s \in \mathbb{N}$. The lemma states that if, at some time $t_0$, enough vertices are in reasonably large cycles (size $\ge 2^j$) then at some carefully chosen later time most of these vertices will be in cycles of size of the order $n$. Here one should think of $2^j$ as approximately $n^{1/4}$ and of $t_0 \ge c_0 n$ for some $c_0 > \tfrac12$. Then $|V^{t_0}_G(2^j)| \approx zn$ by the Erdős-Rényi theorem, hence by Lemma 2.2 also $|V^{t_0}_C(2^j)| \approx zn$. Note that if $2^j = n^{1/4}$ then $\Delta t$ is of the order $n^{3/4} \log n \ll n$, thus for any $c > \tfrac12$ and $t_1 \ge cn$ we may select $c_0 > \tfrac12$ such that $t_0 = t_1 - \Delta t \ge c_0 n$. The final result of this section paraphrases [21, equation (2.4) in Lemma 2.4]. It tells us that most of the vertices in $V^t_G$ (the largest connected component in $G_t$) belong to large cycles. The proof is precisely as in [21], as the only appeal to the particular structure of the cycles is through invoking the previous lemmas, which all hold as in [21].
Lemma 2.4. There is some $C_2 > 0$ such that for any $\varepsilon \in (0, 1)$, if $n$ is large enough, then apart from at most $C_2 \varepsilon n$ vertices, the vertices of $V^t_G$ belong to cycles of length of order $n$.

3. Exploration processes
An important tool in proving Theorem 1.1 is the exploration process, which we will define in this section. The exploration process is sometimes also called the cyclic-time random walk, see e.g. [13,14]. It will allow us to uncover the loop containing some specified point (v 0 , ϕ 0 ) ∈ V n × S 1 at the same time as we uncover the configuration ω ∈ Ω itself. We will also define a process which we call the simple exploration process which is easier to analyse and which may be coupled with the exploration process. In this section we work with a random ω ∈ Ω sampled from the Poisson measure P β for some fixed β > 1 (the definitions will make sense for all β > 0). Recall that ν ∈ [0, 1) is fixed throughout.
The exploration process is a stochastic process $X = (X_t)_{t \ge 0}$ taking values $X_t = (v_t, \varphi_t, d_t) \in V_n \times S^1 \times \{-1, +1\}$, started from a given point $(v_0, \varphi_0)$ with initial direction $d_0$. The process starts by traversing $\{v_0\} \times S^1$ at unit speed in the direction specified by $d_0$, meaning that $v_t = v_0$, $\varphi_t = \varphi_0 + d_0 t$ and $d_t = d_0$. This continues until it either encounters a link of $\omega$, or it returns to its starting point, i.e. $(v_t, \varphi_t) = (v_0, \varphi_0)$. If a link is encountered first, say at time $t$ and with other endpoint in $\{w\} \times S^1$, then the process jumps to $\{w\} \times S^1$ and proceeds in a direction which depends on whether the link was a cross or a bar. That is, we set $v_t = w$ and $\varphi_{t+s} = \varphi_t + d_0 s$, $d_{t+s} = d_0$ if the link was a cross, or $\varphi_{t+s} = \varphi_t - d_0 s$, $d_{t+s} = -d_0$ if the link was a bar. We define the process to be right-continuous (càdlàg). The process proceeds in this way, traversing links and adjusting its direction accordingly, until it returns to the starting point $(v_0, \varphi_0, d_0)$. We let $\tau^X$ be the time when this happens. After this time the process is no longer useful to us, but to be definite we declare that the process continues by repeating itself periodically after time $\tau^X$. Note that at time $\tau^X$, the loop containing $(v_0, \varphi_0)$ has been fully discovered. Let us consider those links that, by time $\tau^X$, have been traversed by $X$ at least once. Some of them have been traversed only once, others twice (no link can be traversed more than twice before time $\tau^X$, as this would entail visiting a previously visited point $(v_t, \varphi_t, d_t)$). We say that a link is discovered at the time of its first traversal, and backtracked on its second traversal (if traversed twice). Let $J^X_t(v, \varphi)$ denote the number of times the exploration $X$, started at $(v, \varphi)$ and run for time $t$, has discovered a link ('jumped'). Let $I^X_t(v, \varphi)$ denote the number of times it has traversed some link, including backtracking; thus $J^X_t \le I^X_t$. We define the history of the process by $H^X_t := \{(v_s, \varphi_s) : 0 \le s \le t\}$ (3.2); this is the set of points in $V_n \times S^1$ visited by $X$ up to time $t$. Finally, let $\{F_t\}_{t \ge 0}$ denote the natural filtration of the exploration process, namely $F_t := \sigma((X_s)_{0 \le s \le t})$, and $\bar F_t := \bigcap_{s > t} F_s$.
When ω is randomly sampled from the Poisson measure P β we may, thanks to the memorylessness of Poisson processes, construct (part of) ω itself simultaneously with X. This fact is central to our approach. We formulate the construction as a proposition. In the following result we will be using a Poisson process N on [0, ∞) and we will say that N rings at time t if it has an arrival at that time.
Proposition 3.1 (Construction of the exploration process). Let $v_0 \in V_n$, $\varphi_0 \in S^1$ and $d_0 \in \{-1, 1\}$. Consider the following independent objects: a Poisson process $N$ on $[0, \infty)$ with rate $\frac{\beta n}{n-1}$; i.i.d. random variables $v_1, v_2, \ldots$, uniform on $V_n$; and i.i.d. random variables $\xi_1, \xi_2, \ldots$ taking values $\pm 1$ and satisfying $P(\xi_i = +1) = \nu$. When $\omega$ has law $P_\beta$, the law of the exploration $X$, started at $X_0 = (v_0, \varphi_0, d_0)$, may be constructed as follows: (1) The process starts at $X_0 := (v_0, \varphi_0, d_0)$ and initially only $\varphi_t$ changes, according to $\varphi_t = \varphi_0 + d_0 t$. (2) Whenever $N$ rings, say at time $t$, we inspect the vertex $w = v_{N_t}$. If $w = v_{t-}$ or $(w, \varphi_{t-}) \in H^X_{t-}$, then nothing happens and the process continues; otherwise $X$ discovers a new link, jumps to $(w, \varphi_{t-})$, and $\varphi_t$ evolves according to $\varphi_{t+s} = \varphi_{t-} + \xi_{N_t} d_{t-} s$ (the link being a cross if $\xi_{N_t} = +1$ and a bar if $\xi_{N_t} = -1$). (3) Between successive rings of $N$ the process may backtrack across previously discovered links. More precisely, a backtrack occurs at time $t$ if there is an earlier jump time $s < t$ (so that $v_s \neq v_{s-}$) such that the process reaches the position of that link again at time $t$, in which case it traverses the link a second time and adjusts its direction according to the link's mark. The construction of Proposition 3.1 is fairly standard and has been used previously in, for example, [1,13,14], hence we do not give a proof. Let us however draw attention to the condition that, when $(w, \varphi_{t-}) \in H^X_{t-}$ or $w = v_{t-}$, the jump proposed by $N$ is cancelled. This means that $X$ cannot jump to a previously visited point $(w, \varphi)$, which effectively amounts to a reduction of the intensity of jumps (see Lemma 3.5 below).
The main difficulty in analysing $X$ is that it may discover a new link which takes it to a previously visited copy of $S^1$, i.e. it may jump at time $t$ to a point $(w, \varphi)$ satisfying $(\{w\} \times S^1) \cap H^X_s \neq \emptyset$ for some $s < t$. We refer to this as jumping to the history. In this case $\{w\} \times S^1$ has already been partially explored, making $X$ quite difficult to analyse directly.
To get around this problem we introduce what we call the simple exploration process $Y = (Y_t : t \ge 0)$, which is easier to analyse and (on time intervals which are not too long) can be coupled with the exploration process $X$. Roughly speaking, the idea is that for $Y$ we replace the vertex set $V_n$ with an augmented vertex set $\mathbb{N} \times V_n$, where the $\mathbb{N}$-coordinate increases on discovering a new link. The interpretation is that each newly discovered link brings us to a 'fresh' copy of the circle $S^1$. Below we give a detailed definition. Notice that the wording is very similar to Proposition 3.1, the main difference being what happens at the jump times of $N$.
Definition 3.2 (Simple exploration process). The simple exploration process $Y = (Y_t)_{t \ge 0}$, with $Y_t = (k_t, v_t, \varphi_t, d_t) \in \mathbb{N} \times V_n \times S^1 \times \{-1, +1\}$, is defined as a càdlàg process, using the following independent objects: a Poisson process $N$ on $[0, \infty)$ with rate $\frac{\beta n}{n-1}$; i.i.d. random variables $v_1, v_2, \ldots$, uniform on $V_n$; and i.i.d. random variables $\xi_1, \xi_2, \ldots$ taking values $\pm 1$ and satisfying $P(\xi_i = 1) = \nu$. Using these sources of randomness the process is constructed as follows: (1) The process starts at $Y_0 := (0, v_0, \varphi_0, d_0)$ and initially only $\varphi_t$ changes, according to $\varphi_t = \varphi_0 + d_0 t$. (2) Whenever $N$ rings, say at time $t$, we inspect the vertex $w = v_{N_t}$; we have two cases: (a) If $w = v_{t-}$ then nothing happens and the process continues. (b) Otherwise $Y$ discovers a new link: we set $k_t = N_t$ and $v_t = w$, and then $\varphi_t$ evolves according to $\varphi_{t+s} = \varphi_{t-} + \xi_{N_t} d_{t-} s$. (3) Between successive rings of $N$ the process may backtrack across previously discovered links, exactly as in Proposition 3.1.
As we did for $X$, we let $\tau^Y$ be the first time at which the simple exploration process $Y$ arrives back at its starting point. Note that $\tau^Y$, in contrast to $\tau^X$, may take the value $+\infty$, see Proposition 3.4. If $\tau^Y < \infty$ we assume that the process $Y$ stops evolving after time $\tau^Y$. For $Y$ we define the history $H^Y_t$ analogously to (3.2), and we write $J^Y_t(v, \varphi)$ (respectively, $I^Y_t(v, \varphi)$) for the number of times the simple exploration process has discovered (respectively, traversed) a link when started at $(0, v, \varphi)$ and run for time $t$. The important point which makes $Y$ simpler to analyse than $X$ is that, each time $Y$ discovers a new link (i.e. $N$ rings and $v_{N_t} \neq v_{t-}$), we set $k_t$ to a previously unused value, namely $N_t$. This means that $Y$, by construction, can never jump to its history. Crucially, it does still backtrack across previously discovered links.

3.2. Coupling with the simple exploration process. The following lemma shows that when $\omega$ is randomly sampled from $P_\beta$, one can couple the exploration process $X$ and the simple exploration process $Y$ so that they evolve in the same way on a sufficiently short time scale. One should think of $T = o(n^{1/2})$.

Lemma 3.3 (Coupling with the simple exploration process). Fix $T > 0$. Let $X$ be the exploration process and let $\sigma$ be a stopping time with respect to the filtration $\{F_t\}_{t \ge 0}$ such that $X$ jumps to a previously unvisited vertex at time $\sigma$. Conditionally on $\bar F_\sigma$, there exists a coupling $P$ of a process $\tilde X$ with a process $Y$ such that: (1)-(2) $\tilde X$ has the law of the exploration process restarted from the point reached at time $\sigma$, and $Y$ is a simple exploration process; (3) with high probability the coupling does not fail before time $T \wedge \tau^Y$. If at some time $t \le \tau^Y$ we have $\tilde X_t \neq Y_t$ then we consider the coupling as failed at time $t$. The history $H^{\tilde X}_t$ of $\tilde X$ is defined as in (3.2).
Proof. In the following we write simply $P(\cdot)$ for $P(\cdot \mid \bar F_\sigma)$. We will construct $\tilde X = (\tilde X_t)_{t \ge 0}$ using the same sources of randomness as for $Y$, namely the same $N$, $\{v_i\}_{i \in \mathbb{N}}$ and $\{\xi_i\}_{i \in \mathbb{N}}$ as given in Definition 3.2. We write $\tilde X_t = (\tilde v_t, \tilde\varphi_t, \tilde d_t)$.
(3) Between successive rings of $N$ the process may backtrack as before, using links of both $X$ and $\tilde X$. It follows from Proposition 3.1 that $\tilde X$ and $X$ have the same distribution, giving statements (1) and (2) from the lemma.
For the proof of (3), let $V_\sigma$ and $\tilde V_t$ be the sets of vertices visited by $X$ up to time $\sigma$ and by $\tilde X$ up to time $t$, respectively. Define $\rho$ to be the first time at which the jump proposed by $N$ lands on a previously visited vertex. Until time $\rho \wedge \tau^Y$ the processes $\tilde X_t$ and $Y_t$ are equal, so the coupling can fail only if such a jump is proposed. At each ring of $N$, this has probability at most the number of previously visited vertices divided by $n$. Since the number of visited vertices cannot exceed the number of discovered links by more than 1, we obtain the desired bound, where in the last step we use that $N_T$ has Poisson distribution with mean $\frac{\beta n}{n-1} T$.

3.3. Properties of the exploration processes. Next we present some basic properties of the processes $X$ and $Y$, starting with the simple exploration $Y$.
First note that $J^Y_t$, the number of links discovered by time $t$, is a Poisson process with rate $\beta$, stopped at time $\tau^Y$ (at which time $Y$ itself terminates). It will be convenient to extend this process beyond time $\tau^Y$. For this purpose we let $N_t$ denote a Poisson process of rate $\beta$ which agrees with $J^Y_t$ up to time $\tau^Y$. Many relevant properties of $Y$ can be understood in terms of the process $Z$ given by $Z_t := N_t - t$. For example, if $Z_t$ hits $-1$ then this corresponds to $Y$ returning to its starting point, that is to say we have $\tau^Y = \inf\{t \ge 0 : Z_t = -1\}$. To see this, note that $N_t + 1$ counts the number of copies of $S^1$ that $Y$ has visited by time $t$. If $Z_t = -1$ then $N_t + 1 = t$, which means that the total time spent equals the number of $S^1$'s visited. Hence at this time $Y$ has explored the entirety of each copy of $S^1$ it has visited, meaning that it must have returned to its starting point. See Figure 3.2. Note that $\beta > 1$ implies that $Z_t \to +\infty$ almost surely. We define a sequence of random times which we call frontier times $\ell_k$, as well as processes $Z^{(k)} = (Z_{\ell_k + t} - Z_{\ell_k})_{t \ge 0}$, as follows. First, we let $\ell_0 := 0$ and $Z^{(0)} := Z$. Next, we let $\ell_1$ be the time when $Z_{t-}$ attains its global minimum (note that as $Z_t \to +\infty$ almost surely, this time is almost surely finite). This is necessarily a jump time of $Z$ (equivalently, of $N$) but it is not a stopping time. Inductively, $\ell_{k+1}$ is the time when $Z^{(k)}_{t-}$ attains its global minimum. We also write $\Delta_k = \ell_{k+1} - \ell_k$ for the time between successive frontier times. See Figure 3.3 for a sample trajectory of the process $Z_t$ with frontier times marked. In terms of the simple exploration $Y$, the frontier times $\ell_k$ play the following role. Recall that the jump times of $Z$ are exactly the times when $Y$ discovers a new link. The frontier times are the times when $Y$ discovers a new link which is never backtracked.
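To get a feel for the process $Z_t = N_t - t$, one can simulate it directly. The following sketch (ours, not from the paper) estimates $P(\tau^Y < \infty)$ and compares it with the smallest root $q$ of $q = e^{-\beta(1-q)}$, i.e. $q = 1 - z$ with $z$ the giant-component fraction solving $1 - z = e^{-\beta z}$; this identity is the standard branching-process heuristic (each visited circle spawns Poisson($\beta$) new links).

```python
import math, random

def hits_minus_one(beta, rng, max_jumps=10_000):
    """Simulate Z_t = N_t - t (N a rate-beta Poisson process) and report
    whether Z ever hits -1, i.e. whether the simple exploration closes."""
    level = 0.0
    for _ in range(max_jumps):
        gap = rng.expovariate(beta)      # time until the next jump of N
        if gap >= level + 1.0:
            return True                   # Z descends to -1 before the jump
        level += 1.0 - gap                # descend by gap, then jump up by 1
    return False                          # Z escaped upward (beta > 1)

def closing_probability(beta, trials, seed=0):
    """Monte Carlo estimate of P(tau^Y < infinity)."""
    rng = random.Random(seed)
    return sum(hits_minus_one(beta, rng) for _ in range(trials)) / trials
```

For $\beta = 2$ the fixed-point value is $q \approx 0.203$, and the simulation agrees to within sampling error; this is the quantitative content of the statement that $Y$ either closes quickly or survives forever.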
Proposition 3.4 collects the properties of $Y$ that we need; in particular, there exist constants $C, c > 0$ giving exponential tail bounds. The proof is based on well-known properties of Poisson processes; for completeness we provide details in Appendix A. The first two parts of the proposition tell us that the simple exploration either continues indefinitely, or it closes 'quickly'. Intuitively, the former scenario parallels the situation when the (true) exploration process $X$ explores a large cycle. The other two parts tell us that, conditionally on $Y$ 'surviving', the frontier times $\ell_k$ are renewal times, and the renewal intervals $\Delta_k$ are typically short.
We now turn to some properties of the exploration process $X$. Recall that $J^X_t := \#\{s \le t : X \text{ discovers a new link at time } s\}$. Let $N^X_t := \#\{v \in V_n : (v, \varphi) \in H^X_t \text{ for some } \varphi \in S^1\}$ denote the total number of vertices visited by $X$ up to time $t$, and let $A^{\mathrm{his}}_t := \{N^X_t \le n/2\}$ denote the event that no more than $n/2$ vertices have been visited up to time $t$. Note that $N^X_t \le J^X_t + 1$ and that $J^X_t \le N_t$, where $N$ is the Poisson process of rate $\beta\frac{n}{n-1}$ in Proposition 3.1. From this and a simple argument using the Laplace transform we see that

In particular, if $t = o(n)$ then $P(A^{\mathrm{his}}_t) \ge 1 - e^{-cn}$ for some $c > 0$. By $A^X_t$ we will denote the set of vertices available to the exploration $X$ at time $t$ by means of a new jump, i.e.
Recall that a counting process is a nondecreasing, integer-valued càdlàg stochastic process starting at zero and with jumps equal to one. Let $J$ be an $\mathcal F_t$-adapted counting process. We will say that a nonnegative process $\lambda$ is an intensity of $J$ if $\int_0^t \lambda_s\,\mathrm{d}s < \infty$ for all $t$, and the process $J_t - \int_0^t \lambda_s\,\mathrm{d}s$ is an $\mathcal F_t$-martingale.

Lemma 3.5 (Intensity of jumps). The processes $J^X$ and $N^X$ are counting processes with intensities $\lambda, \mu$ given respectively by

In particular, on the event $A^{\mathrm{his}}_t$ we have $\mu_t \ge \beta/2$.

A proof of this rather intuitive statement may be found (in a more general setting) in [1, Lemma 3.7]. The following lemma also appears in a more general form in [1, Lemma A.2]; we include its proof here for the sake of completeness.

Lemma 3.6. Suppose $M$ is a counting process with intensity $\lambda$ and let $\Lambda_t = \int_0^t \lambda_s\,\mathrm{d}s$. Let $\sigma, \tau$ be stopping times such that $\sigma \le \tau$. Let $\varepsilon > 0$. Then we have

Proof. Consider any $A \in \mathcal F_\sigma$ with positive probability and the process $\tilde M_t = M_{\sigma+t} - M_\sigma$, which is a counting process with intensity $\tilde\lambda_t = \lambda_{\sigma+t}$ with respect to the shifted filtration. Let $N$ be a Poisson process with intensity $1$ such that $\tilde M_t = N_{\tilde\Lambda_t}$ almost surely (see [1, Theorem A.1] and references there). We get

Using the form of the Laplace transform of $N$ and Chebyshev's inequality we obtain

where in the second step we have used the elementary inequality $e^{-a} - 1 + a \le \tfrac12 a^2$, valid for $a \ge 0$. Thus we get the claimed bound for arbitrary $A \in \mathcal F_\sigma$ of positive probability, which implies the lemma.
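The martingale characterisation of an intensity can be illustrated numerically. The sketch below is our own illustration (the intensity $\lambda(s) = \tfrac12 + s$ is an arbitrary choice, not from the paper): it simulates a counting process with a deterministic intensity by thinning a dominating Poisson process, and checks that $\mathbb{E}[M_t] \approx \Lambda_t = \int_0^t \lambda_s\,\mathrm{d}s$, as the martingale property of $M_t - \Lambda_t$ predicts.

```python
import random

def sample_counting_process(T=1.0, lam=lambda s: 0.5 + s, lam_max=1.5, rng=None):
    """One sample of M_T for a counting process with intensity lam on [0, T],
    simulated by thinning a dominating Poisson process of rate lam_max."""
    rng = rng or random.Random()
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(lam_max)
        if t > T:
            return count
        if rng.random() < lam(t) / lam_max:  # accept with prob lam(t)/lam_max
            count += 1

def mean_M_T(n_samples=20000, seed=7):
    """Monte Carlo estimate of E[M_1]; should be close to
    Lambda_1 = int_0^1 (1/2 + s) ds = 1."""
    rng = random.Random(seed)
    return sum(sample_counting_process(rng=rng) for _ in range(n_samples)) / n_samples
```

Here $\Lambda_1 = 1$, and the Monte Carlo mean of $M_1$ is close to this value, consistent with $M_t - \Lambda_t$ being a mean-zero martingale.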
Corollary 3.7 (Visits to previously unvisited vertices). Let $\sigma$ be a stopping time with respect to the filtration of the exploration process $X$ and let $\eta^X(\sigma)$ be the first time after $\sigma$ at which $X$ jumps to a previously unvisited vertex. For any $t > 0$, on the event $A^{\mathrm{his}}_\sigma$ we have

Proof. By the definition of $\eta^X(\sigma)$, between times $\sigma$ and $\eta^X(\sigma) \wedge \tau^X$ there are no jumps to previously unvisited vertices. In particular, $\eta^X(\sigma) \wedge \tau^X - \sigma \ge t$ implies that $N^X_{\sigma+t} - N^X_\sigma = 0$ and that $A^{\mathrm{his}}_{\sigma+t}$ holds. Thus Lemma 3.5 implies, with $\mu_t$ being the intensity of $N^X_t$ and $\Lambda_t = \int_0^t \mu_s\,\mathrm{d}s$, that then $\Lambda_{\sigma+t} - \Lambda_\sigma \ge t/2$. Applying Lemma 3.6 with $M_t = N^X_t$, $\tau = \sigma + t$ and $\varepsilon = t/2$ easily gives the desired estimate.

Balance
This section contains the main work of the paper. Its goal is to prove that large cycles are 'balanced', in the sense that they contain roughly equal numbers of vertices passed in the directions ↑ and ↓. In fact we show that, with high probability, in a cycle of length at least $n^{1/2}$ every segment of $n^{1/2}$ consecutive vertices is balanced in this sense. Throughout the section we work with a random $\omega \in \Omega$ sampled from the Poisson measure $P_\beta$ for some fixed $\beta > 1$ (recall that $\nu \in [0, 1)$ is fixed).
We start by introducing some notation. Given $\omega \in \Omega$, $v \in V_n$ and $k \in \mathbb N$, let us

Without loss of generality we may assume that $v = v_1$ and that $d_1 = \,\uparrow$. Under these assumptions, we let

which are passed in the same, respectively opposite, direction as $v$. Finally we define the balance $B_\omega(v, k)$ of the segment of length $k$ after $v$ in the loop, as

The main result of this section is the following proposition, which tells us that $B_\omega(v, n^{1/2})$ is typically of much smaller order than $n^{1/2}$.

Proposition 4.1 (Segments of cycles are balanced). Let $\beta_1 > \beta_0 > 1$. There exist $C, c > 0$ such that for all $\beta \in [\beta_0, \beta_1]$ and for any $v \in V_n$, we have

This says that cycles of length at least $n^{1/2}$ are very likely to have balance $|B| < n^{5/12}\log^3 n \ll n^{1/2}$. Cycles containing fewer than $n^{1/2}$ vertices may possibly be unbalanced, but this does not concern us. A key feature of this result is that the upper bound kills any polynomial in $n$, making it possible to use quite crude union bounds later in the proof of Theorem 1.1. Also note that the bound is claimed to be uniform in $\beta \in [\beta_0, \beta_1]$ for any $\beta_1 > \beta_0 > 1$. This will allow us to derive a version of the proposition in which we 'remove' a deterministic number of links from $\omega$, which will be important for the coupling with PD($\tfrac12$) in Section 5. To formulate the last claim precisely, recall the notation $\bar\omega$ for the ordered list of links of $\omega$, and note that the quantities above depend on $\omega$ only. Also recall that, for $s \le |\omega|$, we write $\omega_s = \{(e_1, m_1), \dots, (e_s, m_s)\}$ for the sequence of the first $s$ links. If $X_\omega$ is a random variable which depends only on the relative order of links in $\omega$, we write $X_{\bar\omega}$ for its value on any link configuration with the same relative order.
There exist $C, c > 0$ such that

The proofs are given at the end of the section, after several preparatory results.
Recall the notation for the exploration and the simple exploration, in particular that $d_t \in \{-1, +1\}$ indicates the direction of motion. We will use superscripts $X$ and $Y$ on $v_t, \varphi_t, d_t$ to distinguish between the two processes. Define the winding processes $(L^X_t)_{t\ge0}$ and $(L^Y_t)_{t\ge0}$ by

Thus $L^X$ increases at rate $1$ when the process $X$ travels in the positive direction, and otherwise it decreases at rate $1$; the same is true for $L^Y$.
To prove Proposition 4.1 we will first estimate $L^X$, and then transfer these estimates to $B$. In order to estimate $L^X$ we will use the coupling of $X$ and $Y$ introduced in Lemma 3.3, together with the following estimate on $L^Y$.

Proposition 4.3 (Winding of the simple exploration process). Let $\beta_1 > \beta_0 > 1$. There exist $C, c > 0$ such that for any $\beta \in [\beta_0, \beta_1]$, $T > 0$ and $s \in [1, T^{1/2}]$ we have

Proof. Fix $\beta \in [\beta_0, \beta_1]$. We use Proposition 3.4 and the notation therein; we also write $P(\cdot) = P_\beta(\cdot \mid S)$. For lighter notation, within this proof let $L_t = L^Y_t$. At each frontier time $\ell_k$ the process $Z$ jumps, meaning that $Y$ discovers a new link. We let $\ell^*_0 = 0$ and let $\{\ell^*_k\}_{k\ge1}$ be the subsequence consisting of the times $\ell_k$ at which the link is marked as a bar (i.e. $\xi_i = -1$ in the notation of Definition 3.2). As the choice of markings is independent of $Z$, using Proposition 3.4 we conclude that the $\Delta^*_k := \ell^*_{k+1} - \ell^*_k$ form an i.i.d. sequence under $P$, satisfying

for some $\tilde C, \tilde c > 0$. Also, the increments $L_{\ell^*_{k+1}} - L_{\ell^*_k}$ are independent under $P$. Now, the key observation is that we have the equality in distribution

because upon crossing a bar the winding process changes its orientation. Using these facts we infer that, for any $k \in \{0, 1, \dots\}$,

are symmetric random variables, with $\{Q_{2k}\}_{k\ge0}$ being independent. Moreover, by $|L_{\ell^*_{k+1}} - L_{\ell^*_k}| \le \Delta^*_k$ and (4.1), they have exponential tails. Let us set $K = c_1 T$, for some $c_1 > 0$ to be chosen later. We consider the

As $L_t$ is continuous in $t$, we can replace the supremum of $|L_t|$ by a maximum. The maximum $\max_{0 \le t \le T} L_t$ can then be bounded by the maximum at the endpoints $\ell^*_{2k}$ plus the maximum increment over all the intervals

The first of these two terms can be bounded by applying a union bound and Etemadi's inequality [6, Thm. 22.5], giving

where we have used that the $\Delta^*_k$ are i.i.d.
By the symmetry of $Q_{2i}$ and Markov's inequality we have, for any $\theta > 0$, that

For the second term on the right in (4.1) we recall that the $\Delta^*_k$'s have exponential tails. As $s \in [1, T^{1/2}]$, we thus obtain for some $C_4, c_4, C_5, c_5 > 0$

Finally, by standard large deviation considerations for i.i.d. variables we have

for some $C_6, c_6 > 0$, provided we choose $c_1$ large enough so that the mean of the sum above is larger than $T$. This proves the claim for any fixed $\beta \in [\beta_0, \beta_1]$. The uniformity over such $\beta$ follows since the upper bound can be chosen as a continuous function of $\beta$.
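The mechanism behind Proposition 4.3, that a walk with symmetric increments having exponential tails rarely exceeds a multiple of $\sqrt{T}$, can be checked in a toy model. The sketch below is our own illustration (Laplace-distributed increments as a stand-in, not the process $L^Y$ itself): it estimates the probability that the running maximum of $|S_k|$ exceeds a large multiple of $\sqrt{K}$.

```python
import random

def max_abs_walk(K, rng):
    """Running maximum of |S_k| over K steps, where S_k is a sum of symmetric
    increments with exponential tails (standard Laplace, i.e. the difference
    of two independent Exp(1) variables)."""
    s, best = 0.0, 0.0
    for _ in range(K):
        s += rng.expovariate(1.0) - rng.expovariate(1.0)  # Laplace increment
        best = max(best, abs(s))
    return best

def exceedance_fraction(K=400, threshold_mult=6.0, n_runs=500, seed=3):
    """Fraction of runs in which max_k |S_k| exceeds threshold_mult * sqrt(K)."""
    rng = random.Random(seed)
    thr = threshold_mult * K ** 0.5
    return sum(max_abs_walk(K, rng) > thr for _ in range(n_runs)) / n_runs
```

With increment variance $2$, the threshold $6\sqrt{K}$ is more than four standard deviations of the endpoint, so exceedances are rare, in line with the sub-Gaussian-type bound obtained via Etemadi's inequality.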
In order to compare $L$ with $B$ we need to keep track of how many times the exploration passes level $0 \in S^1$. To this end we make the following definitions. Denote by $K^X_t(v, \varphi)$ (respectively, $K^Y_t(v, \varphi)$) the number of times $X$ (respectively, $Y$) passes through $0 \in S^1$ when started at $(v, \varphi)$ moving in the positive direction ($d_0 = +1$) and run for time $t$. We will write $K^X_t$ (respectively, $K^Y_t$) when the starting point is unambiguous. Recall the definition of $I^X_t(v, \varphi)$ given right after (3.4).
Proof. The proofs of (4.2) for $X$ and $Y$ are the same. We write $K_t$ and $I_t$, omitting $X$, $Y$, $v$ and $\varphi$, in order to simplify the notation. Define a sequence of times $\tau_0 := 0$ and, for $i \ge 1$,

We first claim that for $i \ge 1$ we have

Indeed, at time $\tau_i$ the exploration passed $0 \in S^1$, and if it passes $0$ again without traversing a link this means that it has completed a full lap on one copy of $S^1$. From (4.4) we deduce that

Let $k$ be such that $t \in [\tau_k, \tau_{k+1})$. Then

as claimed. Now we turn to (4.3). Recall from Proposition 3.4, and the discussion preceding it, the notation $\Delta_k$ and $Z^{(k)}$ as well as the notion of frontier times. Let us use the term return times for the jump times of $Z$ which are not frontier times, and return links for the corresponding links traversed by $Y$. Observe that

where $R^Y_t$ denotes the number of return links which have been backtracked by $Y$ up to time $t$. This is because $Y$ does not visit its own history other than by backtracking; hence, between discovering a return link and backtracking it, $Y$ must complete at least one circle.
Let $R_k$ denote the total number of return times of $(Z_t)_{\ell_k \le t \le \ell_{k+1}}$. By Proposition 3.4, conditionally on $S$ the sequence $\{(\Delta_k, R_k)\}_{k\ge1}$ is a renewal-reward process, and by the basic renewal-reward theorem [11, Theorem 10.5] the following convergence holds as $t \to \infty$.
The result (4.3) follows from $E[R_1 \mid S] > 0$, which is easily checked. For example, it suffices to check that $P(R_1 = 1 \mid S) > 0$. Letting $\sigma_1, \sigma_2, \dots$ denote the jump times of $Z$ and using that $P(R_1 = 1 \mid S) = P(\ell_1 = \sigma_2 \mid S)$, we get

We now come to the key technical result of the paper, an upper bound on $L^X$ when $X$ explores part of a large cycle. At the same time we also provide a lower bound on $K^X$, since the proof follows a similar structure.
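The renewal-reward theorem invoked above says that the cumulative reward per unit time converges to $\mathbb{E}[R_1]/\mathbb{E}[\Delta_1]$. A quick numerical check, with i.i.d. pairs of our own choosing (exponential holding times, Bernoulli rewards; a toy stand-in, not the conditioned law of $(\Delta_k, R_k)$):

```python
import random

def renewal_reward_rate(n_renewals=100000, p=0.3, seed=5):
    """Simulate a renewal-reward process with Delta_k ~ Exp(1) and
    R_k ~ Bernoulli(p), and return total reward / total time. The
    renewal-reward theorem predicts this tends to E[R]/E[Delta] = p."""
    rng = random.Random(seed)
    total_time = total_reward = 0.0
    for _ in range(n_renewals):
        total_time += rng.expovariate(1.0)
        total_reward += 1.0 if rng.random() < p else 0.0
    return total_reward / total_time
```

With $\mathbb{E}[\Delta_1] = 1$ and $\mathbb{E}[R_1] = 0.3$, the empirical rate is close to $0.3$, matching the limit.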
Proposition 4.5 (Winding for the exploration process is small). Let $\beta_1 > \beta_0 > 1$, and consider the exploration $X$ started at an arbitrary point $(v, \varphi, d)$. There exist $C_1, c_1 > 0$ such that for any $\beta \in [\beta_0, \beta_1]$ we have

Before giving the proof we outline the main ideas. We want to use the coupling of the exploration process $X$ to the simple exploration process $Y$ from Lemma 3.3, as well as the concentration result for the latter process, Proposition 4.3. To get good concentration we will decompose $[0, n^{1/3}]$ into many shorter time intervals $[t_i, t_{i+1})$, of length approximately $n^{1/6}$ each. On each $[t_i, t_{i+1})$ we wait for a 'good' coupling with a simple exploration: first we wait until the exploration $X$ jumps to a new vertex, so that we can start a coupling; then we check whether the simple exploration survives indefinitely, which it does with probability $z > 0$. If so, we can apply Proposition 4.3 on this interval. If not, we repeat the procedure, waiting for a jump of $X$ to a new vertex and looking at the coupled simple exploration. Typically we only need to repeat this a small number of times until we obtain a coupling with a simple exploration which survives.
Let us make these ideas formal and introduce the setup that will be used in the proof. Set $a_n := n^{1/3}$. We define $t_i := i \cdot b_n$, where $i \in \{0, 1, \dots, n^{1/6}\}$ and $b_n := a_n / n^{1/6}$. Writing $m = n^{1/6} - 1$, we decompose

Fix $i \in \{0, 1, \dots, m\}$. Let us first analyse the change of the winding process on one interval $[t_i, t_{i+1})$. To this end we will define two sequences of times $(\sigma^i_k)_{k=0}^{+\infty}$ and $(\tau^i_k)_{k=0}^{+\infty}$, as well as a sequence of simple explorations $\{Y^i_k\}_{k=1}^{\infty}$. The $\sigma^i_k$ will form a nondecreasing sequence, taking values in $[t_i, t_{i+1}]$, defined so that, for $k \ge 1$ and as long as $\sigma^i_k < t_{i+1}$, the process $X$ jumps to a new vertex at time $\sigma^i_k$. For such $k$, the process $Y^i_k$ is defined to be an independent copy of a simple exploration, coupled with $X$ as in Lemma 3.3, starting at time $\sigma^i_k$. The possibility $\sigma^i_k = t_{i+1}$ signifies that we have finished with the interval $[t_i, t_{i+1})$ and must move on to the next one.
We now define the times $\sigma^i_k$ and $\tau^i_k$. First we set $\sigma^i_0 := t_i$, $\tau^i_0 := 0$. Next, for $k = 1, 2, \dots$ we set

In particular, this will occur if $\tau^i_{k-1} = \infty$; in this case we do not need to define $Y^i_k, Y^i_{k+1}, \dots$ Also note that, since the coupling of $Y^i_k$ with $X$ entails constructing both processes from the same sources of randomness, we may work with the $\tau^i_k$ as if they were adapted to the filtration of $X$, even though they are defined in terms of $Y$.
In words, these definitions mean that, firstly, $Y^i_1$ is a simple exploration coupled with $X$, started at time $\sigma^i_1$, the first time in $[t_i, t_{i+1})$ at which $X$ jumps to a new vertex. This coupling is then run either for the remaining time in $[t_i, t_{i+1})$, or until $Y^i_1$ returns to its starting point (after time $\tau^i_1$). For $k \ge 2$, if the simple exploration $Y^i_{k-1}$ has returned to its starting point, at time $\sigma^i_{k-1} + \tau^i_{k-1}$, then we wait until $X$ jumps to a new vertex again. We call the time when this occurs $\sigma^i_k$, and we begin a new coupling with a simple exploration, $Y^i_k$, from the location of $X$ at this time. Let

The first possibility, $\tau^i_{k_0} = +\infty$, means that at attempt number $k_0$ the coupled simple exploration $Y^i_{k_0}$ survives (and is the first one with this property). The other possibility, that $\tau^i_{k_0} < +\infty$ but $\sigma^i_{k_0+1} = t_{i+1}$, means that after time $\sigma^i_{k_0}$ the exploration $X$ never jumps to a new vertex before the end of the interval $[t_i, t_{i+1})$. Included in this possibility is the case when $X$ closes the loop before jumping again. Intuitively, $k_0$ is the number of attempts at coupling $X$ with a simple exploration which survives, until we either succeed or run out of time.
We now turn to the proof of the proposition.
Proof of Proposition 4.5. Fix $\beta \in [\beta_0, \beta_1]$. We first show (4.5) for this $\beta$. Recall the definition of the event $A^{\mathrm{his}}_t$ given above (3.7). First note that, due to (3.7), it suffices to show that $P_\beta\big(A^{\mathrm{his}}_{a_n} \cap \{|L^X_{a_n}|\,\mathbf 1_{\{\tau^X \ge a_n\}} \ge 3 n^{1/4}\log n\}\big)$ satisfies the claimed bound. Also note that $A^{\mathrm{his}}_{a_n} \subseteq A^{\mathrm{his}}_s$ for $s \le a_n$. Consider the interval $[t_i, t_{i+1})$, the stopping times $\sigma^i_k, \tau^i_k$ and the variable $k^i_0$ from the preceding discussion. Keeping $i$ fixed for now, we drop it from the superscript on $\sigma_k$, $\tau_k$ and $k_0$. We claim that, under $P(\cdot \mid \mathcal F_{t_i})$, the random variable $k_0$ is stochastically dominated by a geometric distribution with parameter $z$, that is to say,

Here $z > 0$ is the survival probability of a simple exploration, see Proposition 3.4. The claim is easily established by induction, using

and

We now show that there exist constants $C_2, c_2 > 0$, uniform in $n$ and in $i$, such that for all $t > 0$, on the event $A^{\mathrm{his}}_{a_n} \cap \{\tau^X \ge a_n\}$ we have

First we establish that there are $C_1, c_1 > 0$ such that for any $k \ge 0$ and any $t > 0$, on $A^{\mathrm{his}}_{a_n} \cap \{\tau^X \ge a_n\}$ we have

The second term is at most $Ce^{-ct}$, for some $C, c > 0$, by (3.5) from Proposition 3.4. To estimate the first term, note that $\tau_k \le t/2$ together with $\sigma_{k+1} - \sigma_k \ge t$ implies that $\sigma_{k+1} - (\sigma_k + \tau_k) \ge t/2$; in particular, $X$ does not visit previously unexplored vertices for time at least $t/2$ after $\sigma_k + \tau_k$. Thus

By Corollary 3.7 the probability is at most $e^{-c't}$ for some $c' > 0$, which together with the previous estimate proves (4.8). Now note that

where by (4.8) each summand, conditionally on all previous terms, has exponential tails. Since $k_0$, the number of summands, is by (4.6) itself dominated by a geometric random variable, one may conclude that the sum itself has exponential tails, as claimed in (4.7).
In more detail, we have for any $k > 0$ that

Now for any $\theta > 0$ we have

Here the inner factor may be written as

Using (4.8) we conclude that we may choose $\theta > 0$ (depending on the constants $c_1, C_1$ in (4.8)) such that, on $A^{\mathrm{his}}_{a_n} \cap \{\tau^X \ge a_n\}$,

say, for all $k \ge 0$. It follows by induction that

and hence

Setting $k = \frac{\theta}{2} t$ and using (4.6), this gives (4.7).

Recall the notion of a failed coupling from Lemma 3.3. The bound (4.7) tells us that typically we do not wait too long for a coupling with a simple exploration process that survives. If the coupling does not fail, we will be able to transfer estimates of the winding process from the simple exploration to the process $X$.
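The concluding step, that a geometric number of summands each with exponential tails again has exponential tails (and, by Wald's identity, mean $\mathbb{E}[k_0]\,\mathbb{E}[\text{summand}]$), can be sanity-checked numerically; the distributions below are our own stand-ins, not those of the proof.

```python
import random

def geometric_sum_sample(p=0.5, rng=None):
    """Sum of k0 i.i.d. Exp(1) summands, where k0 ~ Geometric(p) on {1,2,...}
    is independent of the summands (a toy stand-in for the waits
    sigma_{k+1} - sigma_k counted by k_0)."""
    rng = rng or random.Random()
    total = rng.expovariate(1.0)
    while rng.random() >= p:          # continue with prob 1-p, so k0 ~ Geom(p)
        total += rng.expovariate(1.0)
    return total

def mc_mean_and_tail(n=20000, t=10.0, seed=11):
    """Monte Carlo mean (Wald: E[k0]E[X] = 2) and tail probability P(S > t)."""
    rng = random.Random(seed)
    xs = [geometric_sum_sample(rng=rng) for _ in range(n)]
    return sum(xs) / n, sum(x > t for x in xs) / n
```

In this toy case the geometric sum of exponentials is itself exponential (rate $p$), so the mean is $2$ and the tail at $t = 10$ is $e^{-5} \approx 0.007$: exponential tails survive the geometric stopping, which is the qualitative point used in the proof.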
To this end we distinguish three possible scenarios of what can happen during a given time interval. We say that the interval $[t_i, t_{i+1})$ is good, denoting this event by $G_i$, if the following hold:
• $\tau^i_{k_0} = +\infty$, and
• none of the $k_0$ attempted couplings failed until time $T = t_{i+1} - t_i \le n^{1/6}$.
On the event $G_i$ the coupling started at time $\sigma^i_{k_0}$ survives and lasts until time $t_{i+1}$; in particular, $X$ cannot close its loop before time $t_{i+1}$ (as this would entail $X$ returning to some vertex visited before time $\sigma^i_{k_0}$, and hence $Y^i_{k_0}$ returning to its starting point, i.e. $\tau^{Y^i_{k_0}} < \infty$). Thus $G_i \subseteq \{\tau^X \ge t_{i+1}\}$. Next, we say that the interval $[t_i, t_{i+1})$ is terminal if $\tau^i_{k_0} < +\infty$ and $\sigma^i_{k_0+1} = t_{i+1}$, and we denote this event by $T_i$. Note that $\{\tau^X < t_{i+1}\} \subseteq T_i$. Finally we let $B_i = (G_i \cup T_i)^c$, and if this event occurs we say that the interval $[t_i, t_{i+1})$ is bad. On this event one of the attempted couplings failed.
Let us now estimate the winding process on each of the above events. We have

where for the second and third terms we used the trivial estimate that

We will now estimate each of the three terms in (4.10) separately. Let us start with the first one; we estimate $|L^X_{t_{i+1}} - L^X_{t_i}|$ on the event $G_i \cap A^{\mathrm{his}}_{a_n}$. Since $L^X$ increases at rate at most $1$, we have the estimate

By (4.7), on $A^{\mathrm{his}}_{a_n}$ the first term on the right-hand side of (4.11) is at most $\log^2 n$ with probability at least $1 - C_2 e^{-c_2 \log^2 n}$. In the second term we may, on $G_i$, replace $L^X$ by the winding of $Y^i_{k_0}$, the simple exploration started at time $\sigma^i_{k_0}$. Now we apply Proposition 4.3 with $s = \frac12 \log n$ and $T = b_n$. As

Thus $P\big(A^{\mathrm{his}}_{a_n} \cap \{|L^X_{t_{i+1}} - L^X_{t_i}|\,\mathbf 1_{G_i} \ge b_n^{1/2}\log n\}\big) \le C_3 e^{-c_3 \log^2 n}$, for some $C_3, c_3 > 0$. Furthermore, applying a union bound we obtain (recalling that $m = n^{1/6} - 1$)

We now move to the second term of (4.10). We need to estimate $P(B_i \mid \mathcal F_{t_i})$. To this end notice that by Lemma 3.3 the probability for any given coupling to fail is bounded above by $4\beta b_n (J^X_{a_n} + \beta b_n)/n \le 4\beta a_n (J^X_{a_n} + \beta a_n)/n$. Defining $D_0 := \{J^X_{a_n} \le 4\beta a_n\}$, we have, for any $k > 0$, that

Recalling that $a_n = n^{1/3}$ and choosing $k = n^{1/12}$, it follows that, for some $C > 0$,

We claim that, on the event $D_1 := \{J^X_{a_n} \le 2\beta a_n\}$, we have for large enough $n$ that $p_{n,i} \le 2Cn^{-1/4}$, with $C$ from (4.12). Indeed,

On the right-hand side, the indicator vanishes on $D_1$, and the probability is at most $C_1 e^{-c_1 a_n}$ for some $C_1, c_1 > 0$, since $J^X_t$ is a counting process with intensity bounded above by $\beta$; see Lemma 3.5 and the argument for (3.7). The claim follows.
Thus, employing (4.12), we obtain for large enough $n$ that

The sum inside the first probability is a martingale with increments bounded by $1$. Thus by the Azuma inequality (see e.g. [16, Theorem A.10]) we get

As before we have that $P(D_1^c) \le e^{-cn}$ for some $c > 0$. Taken together, these facts give (4.14) for some $C_4, c_4 > 0$.
Finally, let us consider $\sum_{i=0}^{m} \mathbf 1_{T_i} \mathbf 1_{\{\tau^X \ge a_n\}}$. Observe that for $i \le m - 1$ the event $T_i \cap \{\tau^X \ge a_n\}$ requires that the exploration neither jumps to an unvisited vertex nor closes the loop for a time period of at least $n^{1/6}$. By an application of Corollary 3.7 similar to that for (4.9), the latter event has probability smaller than $C_4 e^{-c_4 n^{1/6}}$, for some $C_4, c_4 > 0$. We thus have $P\big(T_i \cap \{\tau^X \ge a_n\}\big) \le C_5 e^{-c_5 n^{1/6}}$, for some $C_5, c_5 > 0$.
Since $n^{1/6}\, b_n^{1/2}\log n + b_n\, n^{1/12}\log n + b_n \le 3 n^{1/4}\log n$, this concludes the proof of (4.5) for a fixed $\beta \in [\beta_0, \beta_1]$. The uniformity over such $\beta$ follows since the upper bound can be chosen as a continuous function of $\beta$.
Now we turn to the second estimate of Proposition 4.5. We aim to perform a similar decomposition as above, and as before it suffices to work on the event $A^{\mathrm{his}}_{a_n}$. For $i \in \{0, 1, \dots, m\}$, let $Y_i$ be the coupled simple exploration started at time $\sigma^i_{k_0}$, and let $S_i$ be the event that $Y_i$ survives. On the event $G_i$ we have in particular that $S_i$ occurs, and we can use

Hence $P(A^{\mathrm{his}}_{a_n} \cap \{K^X_{a_n} < c_2 a_n\} \cap \{\tau^X \ge a_n\})$ can be bounded as follows. Using (4.7) we get, for some $C_8, c_8 > 0$,

To bound the first probability, we use that

The processes $Y_i$ are i.i.d., and by (4.3) from Proposition 4.4 we get

Hence by standard large deviations estimates we get, for some $C_7, c_7 > 0$, $P\big(\sum_{i=0}^{m} K^{Y_i}_{b_n - \log^2 n} \mathbf 1_{S_i} < c_2 a_n\big) \le C_7 e^{-c_7 \log^2 n}$, provided we pick $c_2$ small enough. It remains to bound the contributions involving $B_i$ and $T_i$. Recall from Proposition 3.4, and from the observations preceding (3.7), that the number of 'jumps' $J^{Y_i}_t$ is dominated by a Poisson process with rate $\beta\frac{n}{n-1}$. It follows that, with probability at least $1 - C_8 e^{-c_8 \log^2 n}$, we have $K^{Y_i}_{b_n - \log^2 n} \le 5\beta b_n$ for all $i \le m$. Consequently, using also (4.14), the probability that $\sum_{i=0}^{m} \mathbf 1_{B_i} \ge n^{1/12}\log n$ is at most $C_9 e^{-c_9 \log^2 n} + C_8 e^{-c_8 \log^2 n}$.
Finally, for the terms involving $T_i$ we again use that $K^{Y_i}_{b_n - \log^2 n} \le 5\beta b_n$ for all $i \le m$, with probability at least $1 - C_8 e^{-c_8 \log^2 n}$, combined with (4.1), to get

This establishes the second estimate, completing the proof of Proposition 4.5.

4.2. Proofs of Propositions 4.1 and 4.2. Now we turn to the proofs of the main results of Section 4, namely Propositions 4.1 and 4.2, which concern the balance $B(v, n^{1/2})$ in cycles of length at least $n^{1/2}$. We start with the following corollary of Proposition 4.5, which states that the bounds in that proposition hold uniformly over all possible starting points $(v, \varphi) \in V_n \times S^1$ for the exploration process $X$. We use the notation $L^X_t(v, \varphi)$, $K^X_t(v, \varphi)$ and $\tau^X(v, \varphi)$ when $X$ starts at $(v, \varphi)$.

Corollary 4.6. Let $\beta_1 > \beta_0 > 1$. There exist $C, c > 0$ such that for all $\beta \in [\beta_0, \beta_1]$ we have (4.15), and, for some $c_2 > 0$, (4.16).

In the proof we will use the following notation. For a measurable subset $A \subseteq E \times S^1$ and $\omega \in \Omega$ we denote the restriction $\omega_A := \{(e, \varphi, m) \in \omega : (e, \varphi) \in A\}$.
Also recall that we will often identify $S^1$ with the interval $[0, 1)$.
Proof. We give details for (4.15); the argument for (4.16) is very similar. Write $B(v, \varphi) = \big\{|L^X_{n^{1/3}}(v, \varphi)|\,\mathbf 1_{\{\tau^X(v,\varphi) \ge n^{1/3}\}} \ge 3 n^{1/4}\log^2 n\big\}$. We fix $\varepsilon > 0$ to be a small enough positive constant (to be specific, $\varepsilon$ needs to be smaller than the constant $c_1$ in the exponent on the right-hand side of (4.5)). Let $m = e^{\varepsilon \log^2 n}$ and define the growing sequence of sets $A_i := E \times [0, \beta_0/\beta_1 + i\delta]$, where $\delta = \delta_n := \frac{1}{m}\big(1 - \frac{\beta_0}{\beta_1}\big)$ and $i \in \{0, 1, \dots, m\}$. We will consider the sequence $\omega_{A_i}$, which we think of as revealing the configuration $\omega$ in increments of size $\delta$. Consider the event $D := \big\{\omega : |\omega_{A_i \setminus A_{i-1}}| \le 1 \text{ for each } i \in \{1, \dots, m\}\big\}$ that each step in the sequence reveals at most one more link. Since $|\omega_{A_i \setminus A_{i-1}}|$ is Poisson distributed with mean $\binom{n}{2}\frac{\beta}{n-1}\delta_n$, we have, for some $\tilde C, C_2, c_2 > 0$, the bound (4.17). Thus it suffices to show that $P\big(\bigcup_{(v,\varphi)} B(v, \varphi) \cap D\big)$ satisfies the bound (4.15). Now, on $D$, to determine whether there is some $(v, \varphi) \in V_n \times S^1$ for which $B(v, \varphi)$ holds, it suffices to consider $\varphi$ of the form $\varphi_i = \beta_0/\beta_1 + i\delta$ for $0 \le i \le m$. Indeed, if $\varphi$ is arbitrary, let $i$ be such that $\varphi_{i-1} \le \varphi \le \varphi_i$. Then (on $D$) the exploration started at $(v, \varphi)$ agrees either with that started at $(v, \varphi_{i-1})$ or with that started at $(v, \varphi_i)$ (up to a small time-shift of size at most $\delta$, which we ignore). Hence, using Proposition 4.5,

For $\varepsilon > 0$ small enough, this satisfies the claimed bound.
To proceed we will need some notation and observations which allow us to relate the winding process, $L$, to the balance of cycles, $B$. Let $X_s = X_s(v_0, \varphi_0, d_0)$ denote the exploration started at $(v_0, \varphi_0)$ in the direction $d_0 \in \{-1, +1\}$, viewed at time $s$. Let us write $X_s = (v_s, \varphi_s, d_s)$ and define

(Although formally the summation is over an uncountable set, almost surely only finitely many terms are nonzero.) In words, $B^X_s$ totals the number of visits of $X$ to level $\varphi = 0$, counted with the sign given by the direction of travel. Note that our previously defined balance quantity may be written as

where $\tau_k$ is the first time at which $X$ has made $k$ visits to level $\varphi = 0$.
It is easy to see the following: for any starting point $(v_0, \varphi_0, d_0)$ and any $t \ge 0$ we have that

Indeed, for $t = 0$ the two terms are either $1$ and $0$ (if $\varphi_0 = 0$) or $0$ and $0$ (if $\varphi_0 \ne 0$). As $t$ increases, $|B^X_t|$ stays constant until $X$ passes level $\varphi = 0$, at which time it changes by $1$. Until this time $|L^X_t|$ can change by at most $1$, since if it changed by more then $X$ would necessarily pass level $\varphi = 0$; hence the difference in (4.18) is certainly bounded by $2$. Between successive visits to $\varphi = 0$ it remains bounded by $2$ for the same reason. Finally, after the last visit to $\varphi = 0$ we may have that $L^X_t$ changes by up to $1$ while $|B^X_t|$ remains constant. Thus the difference is at most $3$.

Proof of Proposition 4.1. Let

where $c_2$ is as in (4.5). (To see that $A_n$ and $B_n$ are measurable, note that one gets the same events if $\varphi$ is restricted to rationals.) By Corollary 4.6 we have $P(A_n \cap B_n) \ge 1 - Ce^{-c \log^2 n}$ for some $C, c > 0$, so it suffices to consider $\omega \in A_n \cap B_n$.
Suppose $v$ is such that $|C_\omega(v)| \ge n^{1/2}$ (otherwise there is nothing to prove). For $i \ge 1$, let $t_i = i n^{1/3}$ and let $i_0 := \min\{i \ge 1 : K^X_{t_i}(v, 0) \ge n^{1/2}\}$. Thus by time $t_{i_0}$ the exploration (started at $(v, 0)$) has visited the first $n^{1/2}$ vertices in $C_\omega(v)$ following $v$. Since $\omega \in B_n$, the contributions to $K^X_t$ between successive $t_i$ are all at least $c_2 n^{1/3}$; using the additivity of $K^X_t$ we conclude that $i_0 \le c_2^{-1} n^{1/6}$. Let us write $X_{t_i} = (v_i, \varphi_i, d_i)$. Note that (using (4.18))

As $\omega \in A_n$ we get $|B(v, n^{1/2})| \le (i_0 - 1)(3n^{1/4}\log^2 n + 3) + n^{1/3} \le n^{5/12}\log^3 n$ for $n$ large enough, as required.
Proof of Proposition 4.2. This follows from Proposition 4.1 by an argument similar to that for Corollary 4.6. As in that argument, we fix some small enough $\varepsilon > 0$ and use the same notation $m$, $\delta$, $A_i$ and $D$.
Note that, on $D$, for each $s \le |\omega|$ there is some (random) $i$ such that $\omega_s = \omega_{A_i}$. Hence the probability in (4.2) is at most

We observe that under $P_\beta$ the distribution of $\omega_{A_i}$ is the same as the distribution of $\omega$ under $P_{\tilde\beta_i}$, with $\tilde\beta_i = \beta(\beta_0/\beta_1 + i\delta) \in [\beta_0, \beta_1]$. Using Proposition 4.1 and a straightforward bound on $P(D^c \cup \{|\omega| < n^\rho\})$, we deduce that the probability in (4.2) is at most $n e^{\varepsilon \log^2 n}\, C e^{-c \log^2 n} + C_2 e^{-c_2 \log^2 n}$. Choosing $\varepsilon > 0$ small enough concludes the proof.

Poisson-Dirichlet coupling
In this section we prove our main result, Theorem 1.1. From the previous sections, Lemma 2.4 tells us that there are cycles of size of order $n$, and Propositions 4.1 and 4.2 tell us that these cycles are 'balanced'. The former lemma is stated in terms of a sequentially constructed $\omega$ with a fixed number of links, whereas the latter are formulated in terms of $\omega$ sampled from the Poisson law $P_\beta$, so one of our tasks is to combine these two descriptions. Another task is to convert the balance property of Propositions 4.1 and 4.2 into a quantitative result about the probability of splitting cycles when a uniformly placed link is added; see Lemma 5.1. Following that, the main task will be to provide a coupling of a PD($\tfrac12$) sample with the rescaled cycle sizes. We begin by introducing some relevant notation and facts, as well as an outline of the proof. Throughout this section, $\beta > 1$ and $\nu \in [0, 1)$ are fixed.

5.1. Preparation and outline. The coupling with PD($\tfrac12$) will involve sequentially appending a small number of uniformly, independently placed links to a random configuration $\omega$. We will do this as follows. First, let $\omega$ have distribution $P_\beta$. Next, let $q \ge 0$ be an integer-valued random variable which is independent of $\omega$ and bounded (as $n \to \infty$). Recall that $\bar\omega$ denotes the ordered sequence of links in $\omega$ and that $\omega_s$ denotes the first $s$ links of $\bar\omega$.
To start the coupling, we consider $\omega_s$ for $s = |\omega| - q$. We construct a sequence $\omega^t$ for $t \in \{s, s+1, \dots, s+q\}$, where $\omega^s = \omega_s$ and the subsequent $\omega^t$ are obtained by sequentially and independently appending, one at a time, a total of $q$ uniformly placed links. Obviously the final configuration $\omega^{s+q}$ then has $s + q$ links, and it agrees in distribution with $\omega$. Letting $\mathcal C^t$ denote the cycle structure of $\omega^{s+t}$ for $0 \le t \le q$, it thus suffices to prove that Theorem 1.1 holds for $\mathcal C^q$.
Before proceeding, let us recall the key features of $\omega^s = \omega_s$ which follow from our work in the previous sections. Precise statements are deferred to Section 5.3. First, it is clear (since $\beta > 1$) that we can find a constant $c > \tfrac12$ such that the number of links $s$ satisfies $s \ge cn$ with high probability (converging to $1$ as $n \to \infty$). On this event Lemma 2.4 applies to $\omega_s$, meaning (roughly speaking) that there are cycles of size of order $n$ which together contain $\approx zn$ vertices. (Here $z$ is the same as in Proposition 3.4.) Second, since $q$ is bounded, Proposition 4.2 certainly applies to $\omega_s$. Thus (with high probability), in any of the large cycles of $\omega_s$, any segment of $n^{1/2}$ consecutive vertices in that cycle has balance $|B| < n^{5/12}\log^3 n$.
Next, let us describe the evolution of $\mathcal C^t$, $0 \le t \le q$, in a way which is suitable for the coupling with PD($\tfrac12$). Since PD($\tfrac12$) is a probability distribution on 'continuous' partitions of the interval $[0, 1)$, it is convenient to represent $\mathcal C^t$ also as a (labelled) partition of $[0, 1)$ (in the actual proof we will use a different interval, but the idea is the same). The mapping is fairly intuitive, so we do not give a completely detailed description. Each vertex $v \in V_n$ is represented as a subinterval $I(v)$ of the form $[\frac{i}{n}, \frac{i+1}{n})$ for $0 \le i = i(v) \le n-1$, and this mapping is chosen so that the cycles of $\mathcal C^t$ become disjoint intervals of the form $[\frac{i}{n}, \frac{j}{n})$ for $0 \le i < j \le n-1$, where if $u, v$ are consecutive in a cycle then $I(u)$ and $I(v)$ are consecutive subintervals of $[\frac{i}{n}, \frac{j}{n})$ (interpreted cyclically). The subintervals $I(v)$ are labelled using the labels ↑, ↓, consistently with the orientations of the vertices within the cycles. See Figure 5.1.
Naturally, this mapping is defined only up to (i) cyclic rotations within each cycle, (ii) overall reversal of all the labels (arrows) in cycles, and (iii) the relative placement of the intervals $[\frac{i}{n}, \frac{j}{n})$ representing the cycles within $[0, 1)$. Regarding the last item, the canonical way to order the intervals would be by decreasing length, but we wish to keep the flexibility of reordering them for the time being. In this setting, the dynamics of uniformly placing links may be constructed using two independent uniform random variables $U, U' \in [0, 1)$:
• We first sample the mark $m$ of the link (cross or bar), with probability $\nu$ for a bar.
• We then sample $U$ and set the first endpoint of the link to be $u$ if $U$ falls in the interval $I(u)$.
• Before selecting the other endpoint we (i) move the (interval $[\frac{i}{n}, \frac{j}{n})$ representing the) cycle containing $u$ to the front of $[0, 1)$, then (ii) cyclically reorder this cycle so that $I(u) = [0, \frac{1}{n})$.
• We then select the second endpoint by setting it to be $v$ if $U' \in I(v)$. It may happen that $I(v) = I(u)$; since this has probability $\frac{1}{n}$ we will in practice be able to disregard this possibility, but to be definite let us say that nothing happens to the cycles in this case.
Having selected the endpoints of the link as well as its mark, we apply the rules given in Section 2 for splitting, merging or twisting cycles.
Using this construction, the sequence $\mathcal C^1, \dots, \mathcal C^q$ may be obtained starting from $\mathcal C^0$, using a sequence $\{(U_t, U'_t, m_t)\}_{t=1}^{q}$ of independent random variables with the above distributions.
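For plain transpositions (crosses) the splitting and merging rules are the classical ones: multiplying by a transposition whose endpoints lie in the same cycle splits that cycle into two, while endpoints in different cycles merge them. (The 'twisting' of cycles by reversals depends additionally on the relative orientations and is not modelled here.) A minimal sketch of this mechanism:

```python
def cycle_structure(perm):
    """Cycles of the permutation i -> perm[i], as lists of vertices."""
    seen, out = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cyc, j = [], start
        while j not in seen:
            seen.add(j)
            cyc.append(j)
            j = perm[j]
        out.append(cyc)
    return out

def add_transposition(perm, a, b):
    """Multiply perm by the transposition (a b) by swapping the images of
    a and b. Splits the cycle through a and b if they share a cycle,
    merges the two cycles otherwise."""
    new = list(perm)
    new[a], new[b] = perm[b], perm[a]
    return new
```

Starting from the single 6-cycle $i \mapsto i+1 \bmod 6$, a link with both endpoints in that cycle splits it into two 3-cycles, and a further link with endpoints in the two different cycles merges them back into one 6-cycle.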
We now turn to the task of showing that the probability of splitting a large cycle is close to $\tfrac12$, in a sense which we will make precise. Let us assume that $\omega^s$ belongs to the event
$$\bigcap_{v \in V_n} \Big( \{|B(v, n^{1/2})| < n^{5/12} \log^3 n\} \cup \{|C(v)| < n^{1/2}\} \Big) \qquad (5.1)$$
that any cycle of size at least $n^{1/2}$ is 'balanced'. This event holds with high probability due to Proposition 4.2. Recalling that the cycles $\mathcal{C}_0$ form a partition of the vertex set $V_n$, we define a refinement $\mathcal{S}$ of this partition into 'segments' as follows. For each cycle $C \in \mathcal{C}_0$ satisfying $|C| \ge n^{1/2}$ we fix a division of $C$ into nonintersecting sets of consecutive vertices, each of size between $n^{1/2}$ and $2n^{1/2}$. If $|C| < n^{1/2}$ then we declare $C$ to be a segment on its own. On the event in (5.1) we see, using the triangle inequality, that each segment $S \in \mathcal{S}$ satisfying $|S| \ge n^{1/2}$ has balance $|B(S)| < 2n^{5/12}\log^3 n$, where $B(S)$ is the difference between the number of $\uparrow$ and the number of $\downarrow$ in $S$. As we proceed by adding links, and thereby modify the cycle structure, we keep the partition $\mathcal{S}$ into segments fixed. That is, at all later steps we will 'remember' for each vertex $v \in V_n$ which segment $S$ it belonged to at $t = 0$. After some steps a segment $S$ need no longer be a consecutive set of vertices within a cycle, for example if a cycle is split in the middle of $S$. We say that a segment $S$ is untouched at step $t \in \{1, \dots, q\}$ if none of the links placed in steps $1, 2, \dots, t-1$ had an endpoint in $S$; otherwise the segment is touched. If $S$ is untouched then it is also 'intact' in the sense that it still consists of consecutive vertices in some cycle, and $|B(S)|$ is unchanged from $t = 0$.
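The division into segments can be realized greedily; the following sketch (our own illustration, with a hypothetical function name) cuts a cycle into consecutive chunks of size $s$, merging a short remainder into the previous chunk so that every segment has size between $s$ and $2s$.

```python
def divide_into_segments(cycle, s):
    """Divide a cycle (list of vertices) into consecutive segments whose
    sizes all lie between s and 2s; a cycle with fewer than s vertices is
    declared a single segment on its own."""
    if len(cycle) < s:
        return [cycle]
    segs = [cycle[i:i + s] for i in range(0, len(cycle), s)]
    if len(segs[-1]) < s:            # remainder too short: merge it
        segs[-2].extend(segs.pop())  # into the previous chunk
    return segs
```

With $s = \lceil n^{1/2} \rceil$ this produces exactly the kind of refinement $\mathcal{S}$ described above.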
In the representation of $\mathcal{C}_0$ as a collection of marked subintervals of $[0,1)$, the segments $S$ become subintervals (of length $\le 2n^{-1/2}$) of the intervals representing the cycles (possibly we may have to interpret these subintervals cyclically). Recall that we used a uniform random variable $U' \in [0,1)$ to select the second endpoint of a uniformly placed link. We now modify this construction slightly, and will instead use two independent uniform variables $U', U'' \in [0,1)$. We begin by sampling $U'$, and we note which segment $S$ it falls in (more precisely, which subinterval representing such a segment). If this segment $S$ is touched then we let $v$ be the vertex selected by $U'$ as before and we do not use $U''$. However, if $S$ is untouched then we do not record the precise location of $U'$ within $S$; instead we use $U''$ to independently select a uniform location within $S$, and we select the second endpoint of the link to be $v$ if this location lies in $I(v)$.
The following result is now straightforward. Intuitively, it tells us that the probability of splitting a long cycle is very close to $\tfrac12$, and moreover that the choice of whether or not to split is almost independent of the location where we propose to split.
Lemma 5.1. Assume that the event in (5.1) holds at $t = 0$. At step $t \ge 1$ (i.e. in the transition $\mathcal{C}_{t-1} \to \mathcal{C}_t$), let $u$ be the vertex selected by $U_t$ and let $C(u) \in \mathcal{C}_{t-1}$ be the cycle containing $u$. Fix the orientation of $C(u)$ so that $u$ has label $\uparrow$. Suppose that $U'_t$ selects a segment $S$ which: (i) is untouched, (ii) is in the cycle $C(u)$, i.e. $S \subseteq C(u)$, and (iii) has size $|S| \ge n^{1/2}$. Let $\alpha \in \{\uparrow, \downarrow\}$ be the label of the vertex $v$ selected by $U''_t$. Then (on the event described) the conditional probability $p_t = P(\alpha = {\uparrow} \mid U'_t)$ that $v$ has the same orientation as $u$ satisfies $|p_t - \tfrac12| \le n^{-1/12}\log^3 n$.

Proof. We have that $p_t = \#(\uparrow \text{ in } S)/|S|$, so $|p_t - \tfrac12| = \tfrac12 |B(S)|/|S| \le n^{-1/12}\log^3 n$.

We now give a brief outline of the rest of this section. First, in Section 5.2, we describe a slight modification of a coupling due to Schramm [21]. The coupling evolves a pair of partitions of the interval $[0,1)$ such that, firstly, the marginal dynamics have PD($\theta$) as an invariant distribution, and, secondly, the two partitions become 'close'. Moreover, for $\theta = \tfrac12$ these dynamics are very similar to the dynamics of $\mathcal{C}_t$ above (Schramm defined the coupling for $\theta = 1$, but as we will see, and as has been noted before [10], the extension to $\theta \in (0,1]$ is completely straightforward). Then, in Section 5.3, we focus on the case $\theta = \tfrac12$ and show how an adaptation of Schramm's coupling allows us to couple a PD($\tfrac12$)-sample to the 'discrete' partition coming from the cycles $\mathcal{C}_t$. This will allow us to prove Theorem 1.1. Lemma 5.1 comes in here and, intuitively speaking, by using the pair $(U', U'')$ as described above we "trade accuracy for independence": $U'$ will tell us the exact location for splitting in the PD($\tfrac12$)-distributed partition, whereas in $\mathcal{C}_t$ this is decided by $U''$. As we will see, the locations in the segment $S$ defined by $U'$ and $U''$ are close enough to each other, and $p_t$ is close enough to $\tfrac12$, that the two partitions become more and more similar.
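The identity used in the proof of Lemma 5.1 can be checked numerically: with $p_t$ the fraction of $\uparrow$-labels in $S$, the deviation of $p_t$ from $\tfrac12$ is exactly $|B(S)|/(2|S|)$. A minimal sketch (labels encoded as $\pm1$, our own convention):

```python
def split_prob(labels):
    """labels: list of +1 (up-label) / -1 (down-label) for a segment S.
    Returns p_t, the fraction of up-labels; note the identity
    |p_t - 1/2| = |B(S)| / (2 |S|), where B(S) = sum(labels)."""
    up = sum(1 for x in labels if x == 1)
    return up / len(labels)
```

Plugging in the balance bound $|B(S)| < 2n^{5/12}\log^3 n$ and $|S| \ge n^{1/2}$ recovers the lemma's bound $n^{-1/12}\log^3 n$.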

5.2. Schramm's coupling. Fix any $\theta \in (0,1]$; later we will take $\theta = \tfrac12$. We will define a sequence $(Y^t, Z^t)$, $t = 0, 1, \dots$, of pairs of random partitions of $[0,1)$ into countably many intervals $[a,b)$, in such a way that (i) the marginal dynamics are stationary for PD($\theta$), and (ii) regardless of the starting configuration, $Y^t$ and $Z^t$ become 'closer' in a sense to be defined later.
The subintervals $[a,b)$ of $[0,1)$ constituting the partitions $Y^t$ and $Z^t$ will be called blocks. We will think of the blocks of $Y^t$ and $Z^t$ as distinguishable, and as before leave some flexibility about the relative placement of the blocks within $[0,1)$. By a slight abuse of notation we will identify a block $Y^t_i \subseteq [0,1)$ with its length $|Y^t_i|$. Some of the blocks of $Y^t$ will be matched with blocks of $Z^t$, and this relation is one-to-one. Other blocks are unmatched. Matched pairs of blocks have the same size, and such pairs will be created in some instances of the process we are about to describe. The total length of all unmatched blocks will be denoted by $R = R^t$ and the total length of matched blocks $Q = Q^t$. We place the matched blocks at the end of $[0,1)$ and the unmatched blocks at the beginning, and within the matched and unmatched parts we order the blocks by decreasing size. See the accompanying figure. A step of the dynamics uses independent uniform random variables $U, U', W \in [0,1)$, the same variables being used for $Y$ and for $Z$:
• if $U$ and $U'$ fall in different blocks, those blocks are merged;
• if $U$ and $U'$ fall in the same block, a split of that block at the location of $U'$ is proposed;
• in the case of proposing a split, the split is carried out if we have that $W \le \theta$.
Thus it is possible to merge blocks in both Y and Z, to merge blocks in one but (propose a) split in the other, or to (propose a) split in both. In the case when we propose a split in both Y and Z, note that the same W is used for both, thus either both split or neither. In this case, if they split then at least two of the newly created blocks are of the exact same size (see Figure 5.5), and those blocks are then declared matched and moved to the matched part. Before the next step the blocks are sorted into the matched and unmatched parts and ordered by size within those parts, as before. The following result about the marginal dynamics is due to Tsilevich [24] (for θ = 1) and Pitman [19] (general θ). Another proof can be found in [10, Theorem 7.1].
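A schematic implementation of one step of these split–merge dynamics may be helpful (our own sketch; the bookkeeping of matched blocks and of sorting by size is omitted, and the function name is ours):

```python
def split_merge_step(blocks, theta, U, U2, W):
    """One step of the split-merge dynamics on a list of block lengths
    summing to 1.  If U and U2 fall in different blocks, those blocks are
    merged; if they fall in the same block, a split at the location of U2
    is proposed and carried out when W <= theta."""
    def locate(x):
        acc = 0.0
        for i, b in enumerate(blocks):
            if x < acc + b:
                return i, x - acc       # block index, offset within block
            acc += b
        raise ValueError("x must lie in [0, 1)")

    i, _ = locate(U)
    j, off = locate(U2)
    if i != j:                          # merge the two highlighted blocks
        a, b = blocks[i], blocks[j]
        for k in sorted((i, j), reverse=True):
            del blocks[k]
        blocks.append(a + b)
    elif W <= theta and 0.0 < off:      # accepted split at location of U2
        blocks[i], rest = off, blocks[i] - off
        blocks.append(rest)
    return blocks
```

Running two copies of this step on $Y$ and $Z$ with the same $(U, U', W)$ is precisely what makes merges in one partition coincide with proposed splits in the other when the block structures differ.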
Lemma 5.2. If $Y^0$ (respectively, $Z^0$) has distribution PD($\theta$) then $Y^t$ (respectively, $Z^t$) has distribution PD($\theta$) for all $t \ge 0$.

We will need quantitative results about how the sizes of the largest unmatched blocks evolve under these dynamics. We will present a sequence of results, Lemmas 5.4 to 5.6, which culminate in Corollary 5.7. As the proofs of these lemmas are identical or nearly identical to the corresponding proofs in [21] we omit the details, but give comments where there are differences in the case $\theta < 1$.
Fix $\varepsilon > 0$ and introduce the following notation. Let $N_\varepsilon(Y^t)$ and $N_\varepsilon(Z^t)$ denote the number of unmatched blocks of size $\ge \varepsilon$ in $Y^t$ and $Z^t$, respectively, and let $N^t_\varepsilon = N_\varepsilon(Y^t) + N_\varepsilon(Z^t)$ be the total number of unmatched blocks of size $\ge \varepsilon$ after $t$ steps. Let $\sigma(\varepsilon, Y^t) = \sum_i Y^t_i \mathbf{1}\{Y^t_i < \varepsilon\}$ be the total length of blocks smaller than $\varepsilon$ in $Y^t$, and similarly define $\sigma(\varepsilon, Z^t)$. Also let $\bar\varepsilon = \varepsilon + \sigma(\varepsilon, Y^0) + \sigma(\varepsilon, Z^0)$.
Before presenting the lemmas about the coupling, we note the following a priori estimates.

Proposition 5.3. (1) If for some $C > 0$ and all $\varepsilon \in (0,1)$ we have
$$E[\sigma(\varepsilon, Y)] \le C\,\varepsilon \log(\tfrac1\varepsilon), \qquad (5.2)$$
then for some $C' = C'(C) > 0$ and all $\varepsilon \in (0,1)$ we have $E[N_\varepsilon(Y)] \le C' \log^2(\tfrac1\varepsilon)$. (2) If $Y$ has the PD($\theta$)-distribution then $E[\sigma(\varepsilon, Y)] \le \theta\varepsilon$ for all $\varepsilon \in (0,1)$, so in particular a bound of the form (5.2) holds.

The proof is sketched (for $\theta = 1$) in [21]. For completeness we give details in Appendix B. In the proof of Theorem 1.1, $Y^0$ will have the PD($\tfrac12$)-distribution while $Z^0$ will satisfy a bound of the form (5.2). Thus, in the following sequence of lemmas, one should think of $\bar\varepsilon$ as being of the order $\le \varepsilon\log(\tfrac1\varepsilon)$ as $\varepsilon \to 0$, and $N^0_\varepsilon$ as of the order $\le \log^2(\tfrac1\varepsilon)$. In the next few results we will be working conditionally on $(Y^0, Z^0)$, hence $\bar\varepsilon$ and $N^0_\varepsilon$ will be treated as constants. We let $q$ be a random time, independent of the chain $((Y^t, Z^t) : t \ge 0)$, and write $\eta = \max\{P(q = t) : t \ge 0\}$.
Let $y^t_{(1)}$ and $z^t_{(1)}$ denote the largest unmatched blocks in $Y^t$ and $Z^t$, respectively. In the following result, note that $R^t - y^t_{(1)} \vee z^t_{(1)}$ is small if most of the unmatched length $R^t$ is covered by the largest unmatched block in either $Y^t$ or in $Z^t$. Hence the product $R^t(R^t - y^t_{(1)} \vee z^t_{(1)})$ is small if either $R^t$ is small (which is what we want), or the unmatched part contains a large block (which can be handled because such a situation is 'unstable').
When applying this and the following estimates, the main case will be when $q$ is uniformly distributed on $\{0, 1, \dots, \varepsilon^{-1/2} - 1\}$. Then $\eta$ is approximately $\varepsilon^{1/2}$ and $E[q]$ is of the order $\varepsilon^{-1/2}$. If $\bar\varepsilon$ and $N^0_\varepsilon$ are of the order indicated above then the right-hand side is small (of the order $\le \varepsilon^{1/2}\log^2(\tfrac1\varepsilon)$).

Proof. The proof is essentially identical to the proof of [21, Lemma 3.1] and uses that, on the event that up to time $q$ no blocks of size $\le \varepsilon$ are created or merged, $N^t_\varepsilon$ is non-increasing for $t < q$. The only extra case which arises for $\theta < 1$ is when, going from $t$ to $t+1$, two blocks are merged in $Y$ (respectively, $Z$) but a split is proposed for $Z$ (respectively, $Y$) and not accepted. In this case we see that $N^{t+1}_\varepsilon \le N^t_\varepsilon$ still holds.

The lemma says that the two largest unmatched entries together dominate the unmatched part (if $\eta$, $q$, $\bar\varepsilon$ and $N^0_\varepsilon$ are of the order indicated above, then the right-hand side is small as long as, say, $\rho \ge \varepsilon^{1/10}$).
Proof. This proof is virtually identical to the proof of [21, Lemma 3.2]. We consider whether the event $\mathcal{R} = \{R^q - z^q_{(1)} < \rho/4\}$ occurs or not. In the case when $\mathcal{R}$ does not occur we can apply Lemma 5.4 exactly as in [21]. In the case when $\mathcal{R}$ does occur, the key observation in [21] is that there is good probability that $z^q_{(1)}$ splits into two blocks of size $\ge \rho/4$ while two unmatched blocks of $Y^q$ merge, allowing us to apply Lemma 5.4 in the next step instead. The only extra consideration for $\theta < 1$ is that the split must be accepted, which happens with probability $\theta$, resulting in the factor $\theta^{-1}$ in the statement of the lemma.
We next bound the 'average' probability of having a large unmatched block. Its corollary, Corollary 5.7, is especially important for us.

Lemma 5.6. Let $\varepsilon, \rho \in (0,1)$ and let $t > 0$ and $k$ be such that $t \ge 2^k/\rho$. Then (conditionally on $Y^0, Z^0$)

We now make some additional assumptions on $(Y^0, Z^0)$ and $q$, which allow us to obtain a more explicit bound on $P(y^q_{(1)} \ge \rho)$. As usual we work conditionally on $(Y^0, Z^0)$.

Corollary 5.7. Assume that $\bar\varepsilon < 1$ and that Then for some $C = C(\gamma)$ we have that for all $\rho \in (0,1)$,

Proof. The proof is identical to the proof of [21, Corollary 3.4] (in [21] there is an additional parameter $\lambda$ which we have set to $1$).

5.3. Proof of Theorem 1.1. We now turn to the proof of our main result. Let $\varepsilon > 0$, to be fixed later. Recall the set-up of Section 5.1: $\omega$ is sampled from $P_\beta$ where $\beta > 1$, and we consider the configuration $\omega^s$ consisting of the first $s$ links of $\omega$, where $s = |\omega| - q$. We take $q$ to be uniformly distributed on $\{0, 1, \dots, \varepsilon^{-1/2} - 1\}$. Let $A_0$ be the event that the following conditions hold:
• for some $c > \tfrac12$ we have $s \ge cn$;
• the event in (5.1) holds for $\omega^s$; and
• the random graph $G(n, s)$, which has an edge wherever $\omega^s$ has at least one link, has a unique giant connected component $V_G$ containing between $0.99zn$ and $1.01zn$ vertices, and any other connected component has size at most $\log^2 n$. (Here $z$ is the same as in Proposition 3.4.)
In the following discussion we will assume that $A_0$ holds, as $P(A_0^c) = o(1)$ as $n \to \infty$. Note that the cycles $\mathcal{C}_0$ refine the components of $G(n, s)$, hence (on $A_0$) a cycle is either contained in $V_G$ or it has size $\le \log^2 n$.
We take $Y^0$ to have distribution PD($\tfrac12$). Roughly speaking, $Z^0$ will be obtained from the cycles $\mathcal{C}_0$, and we want to use the coupling from Section 5.2 to obtain $(Y^q, Z^q)$. The main modification of the coupling is that we use the construction in Lemma 5.1 for splitting in $Z$. There are also several minor modifications to take into account. In what follows we work conditionally on $\omega^0$.
We subdivide the cycles of $\mathcal{C}_0$ into segments $S \in \mathcal{S}$ as in Section 5.1. Write $m = |V_G|$. We let $Z^0$ be a representation of the cycles $\mathcal{C}_0$ as intervals as in Section 5.1, but now as subintervals of $[0, \tfrac{n}{m})$ rather than of $[0,1)$. Thus each vertex $v$ is represented by a subinterval $I(v)$ of length $\tfrac1m$. Note that $[0, \tfrac{n}{m})$ has length roughly $\tfrac1z$. In keeping with the terminology of the previous subsection, we refer to the intervals which represent the cycles as blocks. The subintervals $I(v)$ representing the vertices are labelled using $\uparrow, \downarrow$, as before. Clearly, a cycle of size $\ge n^{1/2}$ in $\mathcal{C}_0$ is represented as a block of size $\ge n^{1/2}/m$ in $Z^0$.
We place the blocks representing the cycles of $\mathcal{C}_0$ which lie in the giant component $V_G$ at the start of $[0, \tfrac{n}{m})$, i.e. in $[0,1)$. The blocks representing the remaining cycles are placed in $[1, \tfrac{n}{m})$. Within $[0,1)$ we will later have matched and unmatched blocks, as in Section 5.2, and again we place the unmatched blocks first; within the matched and unmatched parts we order the blocks by decreasing size. See Figure 5.8, whose caption we recall here: $Z^t$ consists of blocks representing the cycles $\mathcal{C}_t$, placed in the interval $[0, \tfrac{n}{m})$. Cycles belonging to $V_G$ are placed first, roughly in the interval $[0,1)$, and are sorted into those matched with a block of $Y^t$ and those not. Matched blocks can differ slightly in size. Also, $Z^t$ will after a few steps have an 'overhang', since the giant component grows. The hatched part consists of blocks representing cycles with vertices that are not in $V_G$.
We will define dynamics for $(Y^t, Z^t)$ such that the marginal dynamics for $Y^t$ are as in Section 5.2 with $\theta = \tfrac12$, and the marginal dynamics for $Z^t$ are as in Section 5.1. Thus $Y^t$ will have distribution PD($\tfrac12$) for all $t$, due to Lemma 5.2, and $Z^t$ will be a representation of the cycles $\mathcal{C}_t$ as intervals.
In order to be able to define a successful coupling later on, we will need the notion of a forbidden set $F^t \subseteq [0, \tfrac{n}{m})$ (for any given time $t$). This set will arise due to small errors which accumulate during the process. Initially, for $t = 0$, we set $F^0 = \emptyset$. In later steps, we define $F^t$ as consisting of the following parts:
• First, all segments which have been touched up to time $t$ (or rather, the union of the $I(v)$ for $v$ belonging to touched segments) in $Z$ are forbidden.
• Second, $F^t$ contains an overhang (defined as $\{x \in (1, \tfrac{n}{m}] : x \in I(u) \text{ for some } u \in V_G\}$) which arises because in $Z$ cycles outside the giant component may merge with cycles inside the giant component, meaning that the giant component grows with time.
• Third, it will be necessary to allow matched blocks, defined shortly, to have slightly different sizes, rather than exactly the same size as in Section 5.2. When the blocks of $Y^t$ and $Z^t$ are lined up as in Figure 5.8, the subset of $[0, \tfrac{n}{m})$ where part of a matched block does not overlap with its partner belongs to the forbidden set $F^t$.
• Also, $[0, \tfrac1m)$ is forbidden. To understand the meaning of this, recall that once $U$ has been sampled, the block it highlighted is moved and rotated so that the corresponding $I(u)$ is moved to $[0, \tfrac1m)$. Putting this interval in the forbidden set will simply be a way to enforce that all links have two different endpoints.
Once we have described precisely the transitions in our process we will easily be able to bound the size $|F^t|$ by a very small number; see (5.6).
Let us now define a step of the process. Steps will again be accomplished using independent uniform random variables $U$, $U'$, $U''$ and $W$, but now $U$ and $U'$ will be uniformly distributed in $[0, \tfrac{n}{m})$ while $U''$ and $W$ are still uniform in $[0,1)$. For $Z$ we will also sample the mark $m$ (cross or bar) of the corresponding link in each step. In what follows we will assume that $U$ and $U'$ never fall in the forbidden set. For concreteness, if $U$ or $U'$ fell in the forbidden set we would declare the process failed and stop.
Firstly, if $U$ or $U'$ falls outside $[0,1)$ we perform the corresponding transition in $Z$, using the rules from Section 5.1, but do nothing to $Y$. Let us now assume that $U, U'$ both fall in $[0,1)$. When $U$ and $U'$ highlight different blocks, these blocks are merged, as before. For $Z^t$ the labels $\uparrow, \downarrow$ must be handled appropriately, taking into account also the mark $m$ of the link, as in Section 2. In the case when $U'$ falls in the block highlighted by $U$ (in either $Y$, $Z$ or both) it has to be decided if a split should be carried out. There are three cases for how this is decided. First, if a split is proposed in $Y$ only (Figure 5.7) then it is carried out if $W \le \tfrac12$, so this case is the same as in Section 5.2. Second, if a split is proposed only in $Z$ then we decide whether to carry it out by looking at the labels $\uparrow, \downarrow$ in the intervals $I(u)$ and $I(v)$ selected by $U$ and $U'$, as well as the mark $m$ of the link, and applying the rules of Section 2. In this case we do not need to use $U''$. However, the third and most important case is when a split is proposed in both $Y$ and $Z$. In this case we do the following: (1) In $Y$ we record the exact location of $U'$; if we decide to carry out the split in $Y$, then it will be done at the location of $U'$. (2) In $Z$ we only record the segment $S$ in which $U'$ falls. (3) Then we use $U''$ to independently sample a uniform point within $S$ and make the splitting decision for $Z$ as in Section 5.1.
It only remains to specify how we decide whether or not to split in Y.
Let $v$ be the vertex selected by the uniform point in $S$ determined by $U''$, i.e. such that this point lies in $I(v)$. Assume that the block of $Z$ highlighted by $U$ has size $\ge n^{1/2}/m$, and that $U, U'$ did not fall in the forbidden set $F^t$. We are then able to apply Lemma 5.1. Thus the conditional probability $p_t$ that $I(u)$ and $I(v)$ have the same label $\uparrow$ is within $n^{-1/12}\log^3 n$ of $\tfrac12$. Depending on whether $p_t$ is bigger or smaller than $\tfrac12$, and also on the mark $m$ of the link, the conditional probability of splitting in $Z^t$ is thus either slightly above or slightly below $\tfrac12$. We wish to 'maximally couple' the decision whether or not to split in $Z$ with the decision in $Y$, while keeping the splitting probability for $Y$ at exactly $\tfrac12$. Let us describe this assuming $p_t \le \tfrac12$; the other case is similar.
• If the mark $m$ is a cross, recall that this means that we split in $Z$ if $I(v)$ has label $\uparrow$, i.e. with probability $p_t \le \tfrac12$. Our rule for $Y$ is then: split in $Y$ if $I(v)$ has label $\uparrow$, but if $I(v)$ has label $\downarrow$ split in $Y$ anyway with probability $(\tfrac12 - p_t)/(1 - p_t)$ (independently of all other choices).
• If the mark $m$ is a bar, this means that we split in $Z$ if $I(v)$ has label $\downarrow$, i.e. with probability $1 - p_t \ge \tfrac12$. Our rule for $Y$ is then: do nothing (no split) in $Y$ if $I(v)$ has label $\uparrow$, but if $I(v)$ has label $\downarrow$ also do nothing in $Y$ with probability $(\tfrac12 - p_t)/(1 - p_t)$ (independently of all other choices).

It is not hard to check that these rules ensure that the probability of splitting in $Y$ is exactly $\tfrac12$, independently of the location $U'$ of the proposed split. We can also see from this that the probability of $Y$ and $Z$ making different choices (i.e. one splits and the other one twists) is $|p_t - \tfrac12| \le n^{-1/12}\log^3 n$. If the decision is to split in both $Y$ and $Z$, the blocks created are declared matched as in Section 5.2 (if the blocks which split were already matched we get two pairs of matched blocks, otherwise one pair). Note that the split locations determined by $U'$ and $U''$ differ by at most $2n^{1/2}/m$, due to the upper bound on the size of the segments $S$; this will give us a bound on how much matched blocks can differ in size.
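As a sanity check on the two rules above, the relevant probabilities can be computed exactly (a small sketch in exact rational arithmetic; the function is our own illustration). For the first rule with $p_t \le \tfrac12$: $Y$ splits with probability $p_t + (1 - p_t)\cdot\frac{1/2 - p_t}{1 - p_t} = \tfrac12$, and the two decisions differ with probability $\tfrac12 - p_t = |p_t - \tfrac12|$.

```python
from fractions import Fraction

def coupled_split_probs(p):
    """First rule above, with p = p_t <= 1/2: Z splits iff the label is up
    (probability p); Y splits if the label is up, and also with probability
    (1/2 - p)/(1 - p) when the label is down.
    Returns (P(split in Y), P(Y and Z decide differently))."""
    extra = (Fraction(1, 2) - p) / (1 - p)
    return p + (1 - p) * extra, (1 - p) * extra
```

The computation for the second rule (bar mark) is symmetric.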
There is one final case in which we need to specify the rules for deciding to split, namely when a split is proposed in both $Y$ and $Z$ but the block of $Z$ has size $< n^{1/2}/m$. This is unlikely, and we will see that we can assume that this does not occur, but to be definite let us say that in this case we split in $Y$ if $W \le \tfrac12$.

Now we turn to bounding the size of the forbidden set $F^t$. We claim that for any $t \in \{0, 1, \dots, q\}$ we have
$$|F^t| \le 7\varepsilon^{-1/2}\frac{n^{1/2}}{m}. \qquad (5.6)$$
Indeed, after $t$ steps we have added at most $4t\,n^{1/2}/m$ due to touched segments, at most $t\log^2 n/m$ due to overhang, and at most $2t\,n^{1/2}/m$ due to the size-difference of matched blocks. Adding to this $1/m$ for $[0, \tfrac1m)$, and recalling that $t \le q \le \varepsilon^{-1/2}$, we arrive at (5.6).
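The bookkeeping behind (5.6) is easy to verify numerically; the following sketch (our own, with illustrative parameter values) adds up the three contributions and the extra $1/m$, and checks the claimed bound for a moderately large $n$.

```python
import math

def forbidden_size_bound(n, m, eps, t):
    """Upper bound on |F^t| after t steps: touched segments contribute at
    most 4*t*sqrt(n)/m, the overhang at most t*log(n)**2/m, matched-block
    size differences at most 2*t*sqrt(n)/m, plus 1/m for [0, 1/m).
    Returns (accumulated bound, right-hand side of (5.6))."""
    s = math.sqrt(n)
    total = (4 * t * s + t * math.log(n) ** 2 + 2 * t * s + 1) / m
    claim = 7 * eps ** (-0.5) * s / m
    return total, claim
```

For large $n$ the $\log^2 n$ and $1$ terms are swallowed by the seventh factor of $\varepsilon^{-1/2} n^{1/2}/m$, which is the point of the constant $7$.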
Let us say that the coupling $(Y^q, Z^q)$ was successful, denoting this event by $G$, if the following occur for all $t \in \{1, \dots, q\}$:
• $U_t, U'_t$ do not fall in the forbidden set $F^t$;
• in step $t$ we do not propose to split a block of $Z$ which has size $< n^{1/2}/m$; and
• if at step $t$ it is proposed to split a block in both $Y$ and $Z$, then we either split in both or in neither.

(5.7)
Here we bounded the probability that we make different decisions for splitting a large block in $Y$ and in $Z$ by $n^{-1/13}$, which is valid for large enough $n$.
We can now put the different pieces together and wrap up the proof of Theorem 1.1. Having defined $(Y^q, Z^q)$, let us now order the blocks in both of them by decreasing size, and let us think of them as two infinite sequences by appending infinitely many $0$'s at the end. Let $\delta > 0$ be arbitrary and write $D$ for the event that the two sequences differ by more than $\delta$ in some coordinate. It suffices to show that $P(D)$ can be made arbitrarily small by choosing $\varepsilon > 0$ small and $n$ large. Recall that we have been working on the event $A_0$ defined at the beginning of this proof. Also recall the quantities $\sigma(\varepsilon, Y)$ and $N_\varepsilon(Y)$ defined in Section 5.2. We now define $\sigma(\varepsilon, Z)$ and $N_\varepsilon(Z)$ similarly, but counting only those blocks which intersect $[0,1)$. Given this, $\bar\varepsilon$ and $N^0_\varepsilon$ are given as in Section 5.2. By (2.4) in Lemma 2.4 we have, for some $C > 0$ and any $\varepsilon > 0$, that $E[\sigma(\varepsilon, Z^0)] \le C\varepsilon\log(\tfrac1\varepsilon)$. Hence by Proposition 5.3 we also have $E[N_\varepsilon(Z^0)] \le C\log^2(\tfrac1\varepsilon)$. By the same proposition, the same bounds also apply to $Y^0$. Hence $E[\bar\varepsilon] \le 3C\varepsilon\log(\tfrac1\varepsilon)$ and $E[N^0_\varepsilon] \le 2C\log^2(\tfrac1\varepsilon)$. Defining the events $A_1 = \{\bar\varepsilon \le \varepsilon^{3/4}\}$ and $A_2 = \{N^0_\varepsilon \le \varepsilon^{-1/4}\}$ and using Markov's inequality, we get $P(A_1^c) + P(A_2^c) = O(\varepsilon^{1/4}\log^2(\tfrac1\varepsilon))$. On $G \cap A_0 \cap A_1 \cap A_2$ we can apply Corollary 5.7, with $\gamma = \tfrac15$ say, and $\rho = \delta/2$. (We use $\rho = \delta/2$ rather than $\delta$ to account for such things as the size-difference of matched blocks; thus $n$ should be taken sufficiently large.) This gives, for some $C > 0$ and $n$ large,
$$P(D) \le o(1) + C\varepsilon^{1/4}\log^2(\tfrac1\varepsilon) + \frac{2C}{\delta\log(\varepsilon^{-3/4})}.$$
The right-hand side can be made arbitrarily small by picking $\varepsilon > 0$ small and $n$ large. (Since $G$ involves the entire process, to be completely rigorous Lemmas 5.4–5.6 and Corollary 5.7 should be proved on the event $G$. This can be done by working with the time $\min\{q, \tau\}$, where $\tau$ is the first time at which $G$ fails. From (5.7) we see that, with high probability, $\tau > q$, and so the only change is the addition of an $o(1)$ term.)

Appendix A. Proof of Proposition 3.4

We use the notation from Proposition 3.4, noting that $S = \{Z_t > -1 \text{ for all } t \ge 0\}$. Also note that $z = P(S) > 0$ when $\beta > 1$, since if $z = 0$ then $P(\exists t : Z_t = -1) = 1$, which by the Markov property would imply $P(\liminf_{t\to\infty} Z_t = -\infty) = 1$, contradicting the fact that $Z_t \to +\infty$ almost surely.
It will be useful to consider the times of record minima $m_k$, which are defined as follows. Let $\tau_1, \tau_2, \dots$ denote the jump times of $Z$ (equivalently, of $N$). First we define $m_1 := \tau_1$, and then inductively $m_{k+1} := \min\{\tau_j > m_k : Z_{\tau_j-} < Z_{m_k-}\}$, where $\min\emptyset = \infty$.
Importantly, the $m_k$ are stopping times, and they characterize the frontier time $\ell_1$ by $\ell_1 = \max\{m_k : m_k < \infty\}$, i.e. $\ell_1$ equals the last record time. The later frontier times $\ell_k$ may be expressed similarly using the record minima of $Z^{(k)}$. Using that the $m_k$ are stopping times we have, for all $k \ge 1$, that
$$P(m_k < \infty) = (1-z)^{k-1}. \qquad \text{(A.1)}$$

Proof that $z = 1 - e^{-\beta z}$: Write $\delta = 1 - z = P(\exists t : Z_t = -1)$. By conditioning on the first time when $Z_t$ hits $-1$ we see that $P(\exists t : Z_t = -j) = \delta^j$ for all $j \ge 1$. Since $N_t$ only takes integer values it follows that $\delta^j = P(\exists k \ge 0 : N_{k+j} = k)$. From this and the Markov property at time $1$ we find that $\delta = \sum_{j\ge0} e^{-\beta}\tfrac{\beta^j}{j!}\delta^j = e^{-\beta(1-\delta)}$, as claimed.
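The fixed-point equation $z = 1 - e^{-\beta z}$ (familiar from the giant component of the Erdős–Rényi graph $G(n, \beta/n)$) is easy to solve numerically: iterating $z \mapsto 1 - e^{-\beta z}$ from $z = 1$ converges to the largest solution, which is positive precisely when $\beta > 1$. A small sketch (the function name is ours):

```python
import math

def survival_prob(beta, n_iter=10_000, tol=1e-12):
    """Largest solution z of z = 1 - exp(-beta * z), i.e. z = P(S).
    The map z -> 1 - exp(-beta * z) is increasing and concave, so
    iterating from z = 1 converges monotonically to the top fixed point."""
    z = 1.0
    for _ in range(n_iter):
        z_new = 1.0 - math.exp(-beta * z)
        if abs(z_new - z) < tol:
            return z_new
        z = z_new
    return z
```

For $\beta \le 1$ the iteration drifts down to the trivial solution $z = 0$, matching the fact that $z > 0$ only above the critical point.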
Proof that, given $S$, the sequence $\big(\Delta_k, (Z^{(k)}_t)_{0 \le t < \Delta_k}\big)_{k=0}^{+\infty}$ is i.i.d.: We start by establishing (A.2). This is reasonable since, for example, after $\ell_1$ we know that $Z$ does not set a new record minimum, which is the same as saying that $Z^{(1)}$ does not hit $-1$. Using induction, (A.2) follows from two equalities in law. To prove (A.3), let $B$ be some event. Using the description of $\ell_1$ in terms of the stopping times $m_k$, it will be useful to note the following:
$$\begin{aligned}
P(\ell_1 \le a \mid Z \in S) &= \sum_{k\ge1} P\big(m_k \le a,\ m_{k+1} = \infty,\ Z \in S\big)/P(S) \\
&= \sum_{k\ge1} P\big(m_k \le a,\ Z_{m_k-} > -1,\ (Z_{m_k+t} - Z_{m_k})_{t\ge0} \in S\big)/P(S) \\
&= \sum_{k\ge1} P\big(m_k \le a,\ Z_{m_k} > 0\big).
\end{aligned}$$
Appendix B. Proof of Proposition 5.3

Let us write simply $Y$ for $Y^0$. For the first part, let $K = \lceil \log_2(1/\varepsilon) \rceil$. We have
$$N_\varepsilon(Y) \le 4 + \sum_{k=2}^{K} \#\{i : Y_i \in (2^{-k-1}, 2^{-k}]\} \le 4 + \sum_{k=2}^{K} 2^{k+1}\,\sigma(2^{-k+1}, Y).$$
Hence, using (5.2),
$$E[N_\varepsilon(Y)] \le 4 + \sum_{k=2}^{K} 2^{k+1} \cdot C\,2^{-k+1}\log(2^{k-1}) \le C' K^2 \le C' \log^2(\tfrac1\varepsilon),$$
which gives the claim. Next, if $Y$ is PD($\theta$)-distributed then a size-biased sample from $Y$ is Beta($1,\theta$)-distributed. This means that if we select a random index $I$ in such a way that $P(I = i \mid Y) = |Y_i|$, then
$$E[\sigma(\varepsilon, Y)] = P(|Y_I| < \varepsilon) = 1 - (1-\varepsilon)^\theta \le \theta\,\varepsilon,$$
as claimed.