The size of the last merger and time reversal in $\Lambda$-coalescents

We consider the number of blocks involved in the last merger of a $\Lambda$-coalescent started with $n$ blocks. We give conditions under which, as $n \to \infty$, the sequence of these random variables a) is tight, b) converges in distribution to a finite random variable or c) converges to infinity in probability. Our conditions are optimal for $\Lambda$-coalescents that have a dust component. For general $\Lambda$, we relate the three cases to the existence, uniqueness and non-existence of quasi-invariant measures for the dynamics of the block-counting process, and in case b) investigate the time-reversal of the block-counting process back from the time of the last merger.


Introduction and main results
We consider coalescents with multiple mergers, also known as Λ-coalescents, which were introduced in 1999 by Pitman [12] and Sagitov [13]. If Λ is a finite measure on [0, 1], then the Λ-coalescent started with n blocks is a continuous-time Markov chain (Π_n(t), t ≥ 0) taking its values in the set of partitions of {1, …, n}. It has the property that whenever there are b blocks, each possible transition that involves merging k ≥ 2 of the blocks into a single block happens at rate
λ_{b,k} = ∫_0^1 p^{k−2} (1 − p)^{b−k} Λ(dp),
and these are the only possible transitions. One can also define the Λ-coalescent started with infinitely many blocks, which is a continuous-time Markov process (Π_∞(t), t ≥ 0) taking its values in the set of partitions of the positive integers such that for all n, the restriction of (Π_∞(t), t ≥ 0) to the integers {1, …, n} has the same law as (Π_n(t), t ≥ 0). Let N_n(t) be the number of blocks in the partition Π_n(t). Denote by T_n = inf{t : N_n(t) = 1} the time of the last merger. In this paper, we are interested in the distribution of L_n := N_n(T_n−), the number of blocks involved in the last merger. Recall that the Λ-coalescent is said to come down from infinity if N_∞(t) < ∞ almost surely for all t > 0. If the Λ-coalescent comes down from infinity, then the distribution of L_n converges as n → ∞ to the distribution of N_∞(T_∞−). Therefore, it is necessary to consider only the case in which the Λ-coalescent does not come down from infinity.
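As a concrete illustration (ours, not part of the original text): for the beta coalescents discussed below, where Λ is the Beta(2 − α, α) distribution, the rates λ_{b,k} admit a closed form, which the following sketch evaluates; the function names are our own.

```python
from math import gamma

def beta_fn(x, y):
    # Euler beta function B(x, y) = Γ(x)Γ(y)/Γ(x + y)
    return gamma(x) * gamma(y) / gamma(x + y)

def merger_rate(b, k, alpha):
    """λ_{b,k} for Λ = Beta(2 - α, α): integrating p^(k-2)(1-p)^(b-k)
    against the beta density gives B(k - α, b - k + α)/B(2 - α, α)."""
    assert 2 <= k <= b and 0 < alpha < 2
    return beta_fn(k - alpha, b - k + alpha) / beta_fn(2 - alpha, alpha)

print(merger_rate(2, 2, 0.5))  # → 1.0, since λ_{2,2} = Λ([0, 1]) = 1
print(merger_rate(3, 2, 1.0))  # → 0.5 for the Bolthausen-Sznitman case
```

The case α = 1 recovers the Bolthausen-Sznitman rates, so λ_{3,2} = ∫_0^1 (1 − p) dp = 1/2 serves as a quick sanity check.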
Hénard [8] and Möhle [11] calculated the limiting distribution of L_n when Λ is the beta distribution with parameters 2 − α and α for 0 < α < 2. Note that this coalescent comes down from infinity only when 1 < α < 2. Earlier, Goldschmidt and Martin [6] had calculated this distribution for the Bolthausen-Sznitman coalescent, which is the case α = 1. Abraham and Delmas found this limit for α = 1/2 in [1], and for all α ∈ (0, 1/2] in [2]. Theorem 1 gives a condition under which the distribution of the number of blocks involved in the last merger remains tight as n → ∞. Note that the condition (2) fails only when the measure Λ has enough mass near 1, which would make it possible for very large mergers to occur.
Theorem 1. Suppose that
∫_0^1 |log(1 − p)| Λ(dp) < ∞. (2)
Then the sequence (L_n)_{n≥1} is tight.
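To see why (2) holds for every beta coalescent, note that the only possible divergence is at p = 1, where the substitutions u = 1 − p and then s = −log u turn the relevant integral into ∫_0^∞ s e^{−αs} ds = 1/α², finite for every α > 0. A small numerical sketch (our illustration, with arbitrary parameter choices) confirms this:

```python
import math

def log_moment_near_one(alpha, s_max=60.0, n=200000):
    """Midpoint-rule evaluation of ∫_0^1 (-log u) u^(alpha-1) du via the
    substitution u = e^(-s), which turns it into ∫_0^∞ s e^(-alpha*s) ds."""
    h = s_max / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += s * math.exp(-alpha * s) * h
    return total

# The exact value is 1/alpha^2, finite for every alpha > 0, so the
# |log(1-p)| moment of any beta measure near p = 1 is finite.
for alpha in (0.5, 1.0):
    print(alpha, log_moment_near_one(alpha), 1 / alpha**2)
```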
Under an additional regularity condition, we are able to show that the distribution of the number of blocks involved in the last merger tends to a limit as n → ∞. We call the measure Λ log-nonlattice if the image of Λ under the map p ↦ −log(1 − p) is not concentrated on any lattice hZ, h > 0.
Theorem 2. Suppose (2) holds, and Λ is log-nonlattice. Then the sequence (L_n)_{n≥1} converges in distribution.
In this theorem the log-nonlattice assumption cannot be dispensed with entirely. Indeed, we show below that when Λ has all its mass at a single point of (0, 1), the sequence (L_n)_{n≥1}, though tight, does not converge in distribution. It is natural to conjecture that such non-convergence always occurs in the lattice case.
The next theorem shows that condition (2) is necessary for tightness of the size of the last merger in the presence of dust.
Theorem 3. Suppose that
∫_0^1 p^{−1} Λ(dp) < ∞ (3)
and
∫_0^1 |log(1 − p)| Λ(dp) = ∞. (4)
Then
L_n → ∞ in probability as n → ∞. (5)
It was shown in [12] that (3) is the condition under which the Λ-coalescent has a dust component, meaning that for all t > 0 the partition Π_∞(t) almost surely contains singleton blocks. From the statements of Theorems 1 and 3 we see that when Λ satisfies (3), condition (4) is necessary and sufficient for (5) to hold. Therefore, the only case that remains open is the case in which the Λ-coalescent fails to come down from infinity but has no dust component. In that case, we expect that it is possible that (4) holds while (5) fails.
The central tool for the proof of Theorem 3 is a uniform approximation of log N_n(t) by the solution of an SDE driven by a subordinator; see Theorem 9 in Section 3 and its corollaries. These results can be seen as a refinement and generalization of the subordinator approximation of Gnedin, Iksanov, and Marynych [5] in the presence of a dust component; see Remark 12 below.

Whenever the random variables L_n converge in distribution, it is natural to ask whether convergence in distribution also holds for the block-counting processes N_n = (N_n(t))_{t≥0} as n → ∞ in a finite observation window around the state 1. An appropriate description is by means of time-reversal. As a tool for proving the claimed convergence we use quasi-invariant measures. For i > j ≥ 1 let
ρ_{ij} := \binom{i}{i−j+1} λ_{i,i−j+1},   ρ_i := Σ_{j=1}^{i−1} ρ_{ij}.
Then ρ_{ij} is the rate at which N_n jumps from state i to j, and ρ_i is the total rate of a jump from i. A non-trivial, locally finite measure µ = (µ_i)_{i≥2} on {2, 3, …} is called quasi-invariant if
Σ_{i>j} µ_i ρ_{ij} = µ_j ρ_j for all j ≥ 2, and Σ_{i≥2} µ_i ρ_{i1} < ∞.
This means that the flow of mass into each state j ≥ 2 equals the flow out of j, and that the total flow into the absorbing state 1 is finite. Note that for a quasi-invariant measure µ we have µ_i > 0 for all i ≥ 2. Existence and uniqueness of quasi-invariant measures are closely related to the asymptotic behaviour of the sequence of distributions of the last merger sizes L_n.

Theorem 4. (i) If L_n → ∞ in probability, then there exists no quasi-invariant measure. (ii) If there is a probability measure π = (π_i)_{i≥2} on {2, 3, …} and a sequence of positive numbers α_n, n ≥ 1, not converging to 0, such that as n → ∞, P(L_n = i) ∼ α_n π_i for all i ≥ 2, then there exists, up to constant multiples, exactly one quasi-invariant measure. In particular, if the sequence (L_n)_{n≥1} converges in distribution to a finite random variable L_∞, then this applies with π the distribution of L_∞. (iii) In all other cases, there exist at least two quasi-invariant measures (not being multiples of each other).
In particular, there exist at least two quasi-invariant measures if the sequence (L_n)_{n≥1} is tight but does not converge in distribution.
In the case of a coalescent coming down from infinity, as already stated above, item (ii) applies.
In the presence of dust all three cases occur (see Theorem 2, Theorem 3, and Section 5). At first sight one may expect that the condition P(L_n = i) ∼ α_n π_i in item (ii) can occur only with α_n → 1, that is, that the random variables L_n converge in distribution. At the moment, however, we cannot exclude the possibility that the sequence (α_n) fails to converge.
Theorem 4 will allow us to treat the time-reversal N̂_n = (N̂_n(t))_{t≥0} of the block-counting process N_n. This is the càdlàg process given by
N̂_n(t) := N_n((T_n − t)−) for 0 ≤ t < T_n, and N̂_n(t) := n for t ≥ T_n.
In particular we have N̂_n(0) = L_n.
Theorem 5. If the sequence L_n, n ≥ 1, converges in distribution, then the sequence of processes (N̂_n)_{n≥1} also converges in distribution in Skorohod space. The limit N̂_∞ is a Markov jump process with values in {2, 3, …} and jump rates
ρ̂_{ij} = µ_j ρ_{ji} / µ_i,   j > i ≥ 2,
where the µ_i are the weights of the quasi-invariant measure from Theorem 4 (ii).
The rest of this paper is organized as follows. We prove Theorem 1 in Section 2. In Section 3, we show how to approximate the number of blocks in the Λ-coalescent by means of a subordinator when (3) holds. We prove Theorem 2 in Section 4. In Section 5 we give an example in which (L n ) n≥1 is tight but does not converge in distribution because the log-nonlattice assumption in Theorem 2 fails. We then derive Theorem 3 in Section 6, and we prove Theorems 4 and 5 in Section 7.

Proof of Theorem 1
It will be useful throughout the paper to work with a Poisson process construction of the Λ-coalescent. The construction that we will give is a slight variation of the original such construction provided by Pitman in [12].
Let Ψ be a Poisson point process on (0, ∞) × (0, 1] × [0, 1]^n whose intensity measure is dt × p^{−2} Λ(dp) × du_1 × ⋯ × du_n (for this construction we take Λ({0}) = 0). Let Π_n(0) = {{1}, …, {n}} be the partition of the integers 1, …, n into singletons. Suppose (t, p, u_1, …, u_n) is a point of Ψ, and Π_n(t−) consists of the blocks B_1, …, B_b, ranked in order by their smallest element. Then Π_n(t) is obtained from Π_n(t−) by merging together into a single block all of the blocks B_i for which u_i ≤ p. These are the only times at which mergers occur. This construction is well-defined because, almost surely, for any fixed t_0 < ∞ there are only finitely many points (t, p, u_1, …, u_n) of Ψ for which t ≤ t_0 and at least two of u_1, …, u_n are less than or equal to p. The resulting process (Π_n(t), t ≥ 0) is the Λ-coalescent. When (t, p, u_1, …, u_n) is a point of Ψ, we say that a p-merger occurs at time t.

We will need the following simple lemma pertaining to the rate at which the number of blocks decreases.

Lemma 6. Let Λ be a nonzero finite measure on [0, 1]. Consider the Λ-coalescent (Π_n(t), t ≥ 0) started with n blocks. Let W_n = inf{t : N_n(t) ≤ n/2}. Then there exists a positive constant C, depending on Λ but not on n, such that E[W_n] ≤ C for all n ≥ 2.
Proof. For 2 ≤ k ≤ n, the probability that k is the smallest integer in one of the blocks of Π_n(t) is bounded above by the probability that the integers 1 and k do not merge before time t, which is e^{−λ_{2,2} t}. Therefore,
E[N_n(t)] ≤ 1 + (n − 1) e^{−λ_{2,2} t}.
Thus, using Markov's Inequality,
P(W_n > t) = P(N_n(t) > n/2) ≤ 2/n + 2 e^{−λ_{2,2} t}.
Because λ_{2,2} = Λ([0, 1]) > 0 by assumption, there exists t_0 > 0 such that P(W_n > t_0) ≤ 1/2 for sufficiently large n. By increasing the value of t_0 if necessary, we can arrange for this inequality to hold for all n ≥ 2. Then by repeatedly applying the Markov property, we get P(W_n > m t_0) ≤ 2^{−m} for all positive integers m. It follows that E[W_n] ≤ 2t_0 for all n ≥ 2, which gives the result.

Lemma 7. Let X have a binomial distribution with parameters b and p ∈ (0, 1). Then for every positive integer k ≤ b,
P(X ≥ b − k + 1) ≤ 2 p^{⌊b/(2k)⌋}, (6)
for every x > 0,
P(X ≥ x) ≤ 2^b p^x, (7)
and
E[1/(X + 1)] = (1 − (1 − p)^{b+1}) / ((b + 1)p). (8)

Proof. To prove (6), let ξ_1, …, ξ_b be independent random variables with P(ξ_i = 1) = p and P(ξ_i = 0) = 1 − p. Observe that if j ≤ b/2k, then the right-hand side of the corresponding comparison is less than 1/2 and, taking complements, we get the reverse inequality. It follows by taking j = ⌊b/2k⌋ that (6) holds.
To show (7) we obtain from an exponential Markov inequality that
P(X ≥ x) ≤ e^{−λx} E[e^{λX}] = e^{−λx} (1 − p + p e^{λ})^b
for λ > 0. Putting λ = −log p, the inequality follows, since then e^{−λx} = p^x and 1 − p + p e^{λ} = 2 − p ≤ 2. Finally, we have
E[1/(X + 1)] = Σ_{i=0}^{b} (1/(i + 1)) \binom{b}{i} p^i (1 − p)^{b−i} = (1/((b + 1)p)) Σ_{i=0}^{b} \binom{b+1}{i+1} p^{i+1} (1 − p)^{b−i} = (1 − (1 − p)^{b+1}) / ((b + 1)p),
which equals the right-hand side of (8).
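For the Eldon-Wakeley case Λ = p²δ_p treated in Section 5, the Poisson construction reduces to a simple block-count recursion: p-mergers arrive at rate p^{−2}Λ({p}) = 1, and at each arrival every block participates independently with probability p. The following sketch (our illustration, not the paper's code; the names are our own) simulates N_n and records L_n = N_n(T_n−):

```python
import random

def last_merger_size(n, p, rng):
    """Block-count chain of the Eldon-Wakeley (Λ = p²·δ_p) coalescent:
    at each arrival, X ~ Bin(b, p) blocks participate; if X ≥ 2 they
    merge into one.  Returns L_n = N_n(T_n-), the count entering the
    final merger."""
    b = n
    while True:
        x = sum(rng.random() <= p for _ in range(b))
        if x >= 2:
            if b - x + 1 == 1:   # this merger takes the chain to one block
                return b
            b = b - x + 1

rng = random.Random(1)
sizes = [last_merger_size(20, 0.5, rng) for _ in range(1000)]
print(min(sizes), max(sizes))  # all values lie in {2, ..., 20}
```

Note that the final merger involves all N_n(T_n−) remaining blocks, which is why the chain's value just before absorption is returned.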
Theorem 1 is an immediate consequence of the case m = 1 of Proposition 8 below. (We state this proposition in a more general form, which we will use in the proof of Theorem 2.)

Proposition 8. Suppose (2) holds. Then for all ε > 0, there exists a positive integer K_ε such that
P(m < N_n(t) ≤ K_ε m for some t ≥ 0) > 1 − ε
for all integers m and n such that 1 ≤ m < n.
Proof. For K ≥ 2, let A m,n be the complement of the event that m < N n (t) ≤ Km for some t ≥ 0. If A m,n occurs, then for some nonnegative integer ℓ, a single merger takes the coalescent from between 2 ℓ Km + 1 and 2 ℓ+1 Km blocks down to m blocks or fewer.
Suppose there are b blocks in the Λ-coalescent at some time, where b ≥ 2^ℓ Km + 1, and then a p-merger occurs. For the p-merger to take the coalescent down to m blocks or fewer, the number of blocks that participate in the merger must be at least b − m + 1. By (6), if m ≥ 2, then the probability that this occurs is bounded above by 2p^{⌊b/(2m)⌋} ≤ 2p^{2^ℓ(K/2)−1}. If m = 1, this probability is bounded above by p^b ≤ 2p^{2^ℓ(K/2)−1}. Because, from the Poisson process construction of the Λ-coalescent, we know that p-mergers take place at rate p^{−2} Λ(dp), it follows that the rate of events that take the coalescent down to m blocks or fewer is bounded above by 2 ∫_0^1 p^{2^ℓ(K/2)−3} Λ(dp). By Lemma 6, the expected amount of time for which the number of blocks is between 2^ℓ Km + 1 and 2^{ℓ+1} Km is bounded above by C for all ℓ. Therefore,
P(A_{m,n}) ≤ 2C Σ_{ℓ=0}^∞ ∫_0^1 p^{2^ℓ(K/2)−3} Λ(dp).
For any a > 0 and any x ∈ (0, 1), we have
Σ_{ℓ=0}^∞ x^{2^ℓ a} ≤ x^a (3 + log_2^+(1/(a |log x|))).
Therefore, if 1 ≤ m < n, then for K > 6,
P(A_{m,n}) ≤ 2C ∫_0^1 p^{K/2−3} (3 + log_2^+(2/(K |log p|))) Λ(dp).
It follows from (2) and the Dominated Convergence Theorem that this expression tends to zero as K → ∞, which gives the result.
3 An approximation in the case of dust

Let S = (S_t)_{t≥0} be the subordinator, without drift, whose Lévy measure is the image of the measure p^{−2} Λ(dp) under the map p ↦ −log(1 − p); condition (3) guarantees that this is indeed a Lévy measure. This subordinator first appeared in the work of Pitman [12] and was used to approximate the block-counting process by Gnedin et al. [5] and Möhle [10]. The next theorem provides a refinement. Let f be the function defined in (10). From (3), we see that f(y) is finite for all y ∈ R. Also f is decreasing with lim_{y→∞} f(y) = 0, because for fixed p the integrand has this behaviour. Let Y_n = (Y_n(t))_{t≥0} be the solution of the SDE
Y_n(t) = log n − S_t + ∫_0^t f(Y_n(s)) ds. (11)
Our goal is to show that for coalescents with dust the log of the block-counting process closely follows the process Y_n, up to the time when N_n has nearly reached the state 1. For this purpose, we define for any k > 1
τ_{k,n} := inf{t ≥ 0 : N_n(t) < k}. (12)
Theorem 9. Under assumption (3), for all ε > 0 there is an integer k ≥ 2 such that for all n,
P( sup{ |log N_n(t) − Y_n(t)| : 0 ≤ t ≤ τ_{k,n}, t < T_n } > ε ) ≤ ε. (13)
Note that (13) controls the distance between Y_n and log N_n up to the first time point when N_n jumps below k. This time point is excluded only if the jump leads directly to 1, i.e. on the event {τ_{k,n} = T_n}.
Before proving this theorem let us derive some consequences.

Corollary 10. Under assumption (3), for all ε > 0 there is an integer ℓ such that for all n,
P( |log N_n(t) − Y_n(t)| ≤ ℓ for all 0 ≤ t < T_n ) ≥ 1 − ε. (14)
Proof. Choose k as in Theorem 9. For τ_{k,n} < t < T_n we have N_n(t) < k and hence, since f is decreasing, the drift of Y_n stays bounded on this interval. By the strong Markov property, T_n − τ_{k,n} is stochastically bounded from above by T_k, and similarly for the increment of S over this time stretch. The claim now follows from Theorem 9.
Since f(x) → 0 as x → ∞, the processes Y_n and log n − S are, in view of (11), close to each other, and one may wonder whether log n − S is also suitable for approximating the log of the block-counting process. This works under a stronger condition.

Corollary 11. Under the assumption (15), for all ε > 0 there is an integer k ≥ 2 such that for all n,
P( sup{ |log N_n(t) − (log n − S_t)| : 0 ≤ t ≤ τ_{k,n}, t < T_n } > ε ) ≤ ε. (16)

Proof. For any integer i we have, on the event {N_n(t) ≥ 2^j i}, that f(log N_n(t)) ≤ f(log(2^j i)). From Lemma 6 and the strong Markov property there is a C > 0 such that
E[ ∫_0^{τ_{i,n}} f(log N_n(s)) ds ] ≤ C Σ_{j=0}^∞ f(log(2^j i)).
Choosing i large enough, this bound may be made arbitrarily small. In view of (11) and Theorem 9 our claim follows.
Remark 12. Gnedin, Iksanov, and Marynych [5] also studied the absorption time T_n by coupling with a subordinator. The hypothesis of Lemma 4.2 in [5] is equivalent to (15).

We now come to the proof of Theorem 9. It requires two preparatory lemmas.
Lemma 13. Suppose X has a binomial distribution with parameters b and p. Then
log((X + 1)/(b + 1)) = log p + ((X + 1)/(b + 1) − p)/p − R′, (17)
where R′ ≥ 0 and E[R′] ≤ (1 − p)/((b + 1)p).

Proof. By the Mean Value Theorem, if x > 0 and y > 0, then there exists a positive number z between x and y such that log x − log y = z^{−1}(x − y). Therefore, there exists a random variable Z between (X + 1)/(b + 1) and p such that
log((X + 1)/(b + 1)) − log p = Z^{−1}((X + 1)/(b + 1) − p),
which gives (17) with R′ = ((X + 1)/(b + 1) − p)(1/p − 1/Z). Clearly R′ ≥ 0. It remains to bound E[R′]. Because Z must be between (X + 1)/(b + 1) and p, we see that |1/Z − 1/p| can be bounded from above by substituting (X + 1)/(b + 1) in place of Z. We get
R′ ≤ ((X + 1)/(b + 1) − p)² (b + 1)/(p(X + 1)) = (1/p)( (X + 1)/(b + 1) − 2p + p²(b + 1)/(X + 1) ).
Now by (8),
E[ p²(b + 1)/(X + 1) ] = p(1 − (1 − p)^{b+1}) ≤ p.
Therefore,
E[R′] ≤ (1/p)( p + (1 − p)/(b + 1) − 2p + p ) = (1 − p)/((b + 1)p).

Lemma 14. Let 0 < a < 1 and define τ_{k,n} as in (12). Then there exists a positive constant C_1, depending on Λ but not on n, such that the bound stated below holds for all n.

Proof. By the Law of Large Numbers, there exists a positive integer m such that for b ≥ m, whenever the coalescent has b blocks, the rate of mergers that will bring the coalescent down to fewer than (1 − a)b blocks is at least c. Let e_b be the expected time, when the coalescent starts with b blocks, before the number of blocks drops below (1 − a)b, and let C = max(e_2, …, e_{m−1}, 1/c). Then, for all b ≥ 2, if the coalescent starts with b blocks, the expected time before the number of blocks drops below (1 − a)b is at most C. For positive integers j, let B_j = [(1 − a)^j n, (1 − a)^{j−1} n). Then the expected Lebesgue measure of {t : N_n(t) ∈ B_j} is at most C, and the claimed bound follows by summation.

Proof of Theorem 9. Again we construct the Λ-coalescent from the Poisson point process Ψ, as described at the beginning of Section 2. Enumerate the points of Ψ as (t_i, p_i, u_{i,1}, …, u_{i,n}), i = 1, 2, …, ordered by time, and let X_i denote the number of extant lines that are not included in the merger at time t_i. Conditional on p_i and N_n(t_i−), the distribution of X_i is binomial with parameters N_n(t_i−) and 1 − p_i. Also, for all i ∈ N, we have N_n(t_i) = X_i + 1_{{X_i < N_n(t_i−)}}. Dividing both sides by N_n(t_i−), taking logs, and summing over the jump times t_i ≤ t, we obtain a decomposition of log N_n(t) − log n into a martingale part, a drift part and remainder terms, where R_i is defined as in (17), with N_n(t_i−) in place of b, X_i in place of X, and 1 − p_i in place of p.
We now break this sum into pieces. Let ε > 0, and let J = {i ∈ N : The probability that N n (t i ) = 1, conditional on N n (t i −) and on the event {i / ∈ J}, is at least 1 − ε/4. Therefore, Conditional on p i and N n (t i −), the random variable has mean zero and variance In particular, the process (M n (t), t ≥ 0) is a martingale. Recalling the definition of τ k,n from (12) and putting l p := ⌈ε/(4(1 − p))⌉, we get for the bracket process M n Combining this result with (18) and using τ k,n ∧ τ lp,n = τ k∨lp,n we obtain which is finite by (3) and goes to 0 for k → ∞. Therefore, by the L 2 Maximum Inequality for martingales and Markov's inequality, we get that for k sufficiently large and P sup We now consider the process (V n (t), t ≥ 0). By Lemma 13, Thus as above, if k is sufficiently large, Together with (19) and (20) we arrive at P sup Now we approximate U n (t) by t 0 f (log N n (s)) ds, uniformly for t ≤ τ k,n . Note that by (3), there are only finitely many t i such that t i ≤ T n and X i < N n (t i −). Denote these points by s 1 < · · · < s m , and also set s 0 = 0 and s m+1 = ∞. Note that s m = T n . When the coalescent has b blocks, the points s i appear at rate Therefore, the random variables G i = (s i+1 −s i )ρ(N n (s i )) for 0 ≤ i ≤ m−1 are independent standard exponential random variables, also independent of the process N n (s j ), j ≥ 1. Recalling (10), Using that the second sum has just one non-vanishing summand, and that We show that for k sufficiently large the supremum over t ≤ τ k,n of the right-hand side gets arbitrarily small in probability, uniformly in n. To this end we deal with the three summands on the r.h.s. of (22) in reverse order.
First we have and so by Lemma 14 Second, since E[G 2 i ] = 2, we have for u > 0 P max where we used (23) in the last inequality. Third let and again by means of the L 2 Maximum inequality and (23) Using these three estimates we obtain from (22) that for any ε > 0 P sup if k is sufficiently large. Combining this bound with (21) we arrive at the formula P sup To finish the proof we define for t ≥ 0 For fixed t and n we consider the event A ≥ := {t < T n , t ≤ τ k,n , log N n (t) ≥ Y n (t)} and define the random time σ t := sup{s ≤ t : log N n (s) ≤ Y n (s)}.
Then on the event A ≥ we have log N n (σ t −) − Y n (σ t −) ≤ 0 and f (log N n (s)) − f (Y n (s)) ≤ 0 for s > σ t , since f is decreasing. Thus, on A ≥ , Similarly on Recalling (24), this implies that for sufficiently large k, which was the claim.
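The linearization of Lemma 13, which underlies the proof above, can be checked by simulation: for X binomial with parameters b and p, the expectation of log((X+1)/(b+1)) should be close to log p when b is large. A Monte Carlo sketch (our illustration; the parameter values are arbitrary):

```python
import math, random

def mean_log_ratio(b, p, trials, rng):
    """Monte Carlo estimate of E[log((X+1)/(b+1))] for X ~ Bin(b, p)."""
    total = 0.0
    for _ in range(trials):
        x = sum(rng.random() < p for _ in range(b))
        total += math.log((x + 1) / (b + 1))
    return total / trials

rng = random.Random(7)
est = mean_log_ratio(1000, 0.3, 2000, rng)
print(est, math.log(0.3))  # the two values are close for large b
```

The discrepancy is of order 1/b, consistent with the bound E[R′] ≤ (1 − p)/((b + 1)p) on the remainder.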

Proof of Theorem 2
In this section we prove Theorem 2. First we provide a lemma which gives a uniform lower bound for the probability that the block-counting process does not jump over certain intervals.

Lemma 15. Let δ ∈ (0, 1) and K ≥ 1. Then there are constants α ∈ (0, 1] and C > 0, depending on δ and K but not on n, such that for all m < n ≤ Km,
P(∃ t ≥ 0 : (1 − δ)αm ≤ N_n(t) ≤ αm) ≥ C.
Proof. We distinguish two cases. First assume that Λ((0, η]) > 0 for all η > 0. Let η = 4^{−2K/δ} and define N′_n, N″_n to be the block-counting processes belonging to the two coalescents arising by restricting Λ to the interval [0, η] or (η, 1], respectively, and using the same Poisson process Ψ. The processes N′_n and N″_n are independent. By assumption the process N′_n is non-degenerate; thus, in view of Lemma 6, the expectation of W′_n := min{t ≥ 0 : N′_n(t) ≤ (1 − δ)n/K} is bounded by a constant κ, depending on δ and K but not on n. By Markov's Inequality,
P(W′_n ≤ 2κ) ≥ 1/2.
Moreover,
P(N″_n(2κ) = n) ≥ e^{−2κ ∫_η^1 p^{−2} Λ(dp)} > 0.
Finally, for the rate at which N′_n performs at time t a jump of size larger than δn/K, we obtain from (7) and from the choice of η, for n ≥ 4K/δ, the bound
2^n η^{δn/K − 2} Λ([0, η]) ≤ 2^{−3n} η^{−2} Λ([0, η]).
Putting our estimates together we arrive at
P(∃ t ≥ 0 : (1 − δ)m ≤ N_n(t) ≤ m) ≥ C
for n sufficiently large and any m with m < n ≤ Km, with a suitable constant C > 0. A further lowering of this bound makes the estimate valid for all n. Letting α = 1, our claim follows.

For the second part of the proof let Λ([0, η]) = 0 for some η > 0. Then (15) is satisfied, so that we may resort to Corollary 11. Note that our log-nonlattice assumption means that the random walk (S(i), i ∈ N_0) is non-lattice in the usual sense. Condition (2) implies E[S(1)] < ∞. Therefore, by the classical renewal theorem, with α sufficiently small there is a constant 0 < C ≤ 1/2, depending on δ, such that for all s ≥ 0 the random walk visits the interval [s + |log α|, s + |log α| + |log(1 − δ)|] with probability at least 2C, and consequently, for m < n (letting s = log n − log m),
P(∃ t ≥ 0 : log(1 − δ) + log αm ≤ log n − S(t) ≤ log αm) ≥ 2C. (25)
Next, choose k according to Corollary 11 so that (16) holds with ε = (1/4)C ∧ (1/3)|log(1 − δ)|. Let k also be so large that, by Theorem 1, we have P(τ_{k,n} = T_n) = P(N_n(T_n−) ≥ k) ≤ (1/4)C for all n. Then
P( sup_{t ≤ τ_{k,n}} |log N_n(t) − (log n − S(t))| > (1/3)|log(1 − δ)| ) ≤ (1/2)C.
In particular this applies with t = τ_{k,n}, and hence for n sufficiently large (because m ≥ n/K),
P(∃ t ≥ τ_{k,n} : log n − S(t) > log(1 − δ) + log αm) ≤ (1/2)C.
Combining this estimate with (25) we obtain
P(∃ t ≤ τ_{k,n} : log(1 − δ) + log αm ≤ log n − S(t) ≤ log αm) ≥ (3/2)C. (26)
Hence from (26) it follows for n sufficiently large and m < n ≤ Km that
P(∃ t ≤ τ_{k,n} : log(1 − δ) + log αm ≤ log N_n(t) ≤ log αm) ≥ C.
Again by suitably lowering the constant C this estimate holds for all n, which then translates into our claim.
Proof of Theorem 2. We prove this result by coupling. Let ε > 0. It suffices to show that there exists a positive integer n_0 such that if n_0 < n_1 < n_2, then we can construct Λ-coalescents (Π_{n_1}(t), t ≥ 0) and (Π_{n_2}(t), t ≥ 0) started with n_1 and n_2 blocks respectively such that P(L_{n_1} ≠ L_{n_2}) < ε. By Theorem 1, we can choose a positive integer ℓ such that P(N_n(T_n−) ≤ ℓ) > 1 − ε/4 for all n. Let C be the constant from Lemma 15 with δ = ε/(4ℓ) and with the constant K = K_{1/2} from Proposition 8. Choose a positive integer J large enough, depending on ε and C. Then for 1 ≤ j ≤ J, let m_j = ⌊n_0^{j/J}⌋. For 1 ≤ j ≤ J and i ∈ {1, 2}, let A_{i,j} be the event that m_j < N_{n_i}(t) ≤ Km_j for some t ≥ 0, and let D_{i,j} be the event that (1 − δ)αm_j ≤ N_{n_i}(t) ≤ αm_j for some t ≥ 0, with the constant α as in Lemma 15. It follows from Proposition 8 and Lemma 15 that for 1 ≤ j ≤ J and i ∈ {1, 2} we have the lower bound (28). We will need to establish that a similar inequality holds when we condition on the events D_{i,k} for k > j. To this end, let U_{i,J} = 0 for i ∈ {1, 2}, and for 1 ≤ j ≤ J − 1 and i ∈ {1, 2}, define the stopping time U_{i,j} = inf{t ≥ 0 : N_{n_i}(t) ≤ αm_{j+1}}. For i ∈ {1, 2}, let (F_i(t), t ≥ 0) be the natural filtration associated with the process (Π_{n_i}(t), t ≥ 0). With N_{n_i}(U_{i,j}) figuring as the new starting point, the reasoning leading to (28) implies that for 1 ≤ j ≤ J and i ∈ {1, 2}, we have, on the event G_{i,j}, the conditional bound (29). Because m_{j+1}/m_j → ∞ as n_0 → ∞, it follows from Proposition 8 that (30) holds. Since D_{i,k} ∈ F_i(U_{i,j}) for 1 ≤ j < k ≤ J and i ∈ {1, 2}, the results (29) and (30) imply that if the processes (Π_{n_1}(t), t ≥ 0) and (Π_{n_2}(t), t ≥ 0) are independent, then the bound (31) holds. We now couple the processes (Π_{n_1}(t), t ≥ 0) and (Π_{n_2}(t), t ≥ 0). We allow the two processes to evolve independently until the times U_{1,J−1} and U_{2,J−1} respectively. If D_{1,J} ∩ D_{2,J} occurs, then we stop. Otherwise, we allow the processes to continue to evolve independently until the times U_{1,J−2} and U_{2,J−2} respectively.
Then we stop if D 1,J−1 ∩ D 2,J−1 occurs, and otherwise continue as before. According to (31), with probability at least 1 − ε/2, we will eventually come to a value of j such that D 1,j ∩ D 2,j occurs. In that case, the independent constructions will be stopped at the times U 1,j−1 and U 2,j−1 respectively, at which times both processes will have between (1 − δ)αm j and αm j blocks.

Non-convergence for Eldon-Wakeley coalescents
To provide an example where the distribution of the size of the last merger does not converge as n → ∞, we now focus on the class of coalescents proposed in [4] and thus assume that the measure Λ is concentrated at a single point p ∈ (0, 1). Because of Theorem 1, for such coalescents the size of the last merger is tight. We claim that nevertheless L_n does not converge in distribution as n → ∞.
There are obvious relations to non-convergence and periodicity phenomena in so-called leader election; see e.g. Grübel and Hagemann [7] and the references therein. For notational convenience we restrict ourselves to the case Λ = p²δ_p with p = e^{−1}. Then the points of the Poisson point process Ψ are of the form (σ_i, p, u_1, …, u_n), i = 1, 2, …, where the numbers 0 < σ_1 < σ_2 < ⋯ form a standard Poisson point process on R_+. Define τ_{k,n} as in (12).
We shall argue by contradiction, so let us assume that L_n does converge in distribution. Then, as shown in Theorem 5, the sequence of time-reversed Markov chains converges as n → ∞ in distribution to a limiting Markov chain. This implies (32). Together with N_n we consider a process N̄_n ≥ N_n defined inductively as follows: N̄_n(0) = N_n(0), and at the times σ_i the random number N̄_n(σ_i−) is thinned according to p and afterwards increased by one. Thinking of N_n and N̄_n as numbers of lines, a difference between the two processes arises only when a thinning affects no line of N_n: then N_n does not change its value but N̄_n increases by 1. Given N_n(t) = m, this takes place with probability q^m, where q = 1 − p. This may occur several times and, as long as N_n stays at level m, the expected increase of N̄_n is bounded from above by q^m/(1 − q^m) ≤ q^m/p. Therefore, given ε > 0 there is a k such that N_n and N̄_n coincide up to time τ_{k,n} with probability at least 1 − ε. Combined with (32) we obtain that for N̄_n, too, the size of the first jump to 1 converges in distribution as n → ∞. Now consider a representation N̄_n = U_n + V_n with random variables U_n(0) and V_n(0) to be specified below, where at the times σ_i both U_n and V_n are thinned independently according to p and then V_n is enlarged by 1. Note that for independent U_n(0) and V_n(0) the Markov chains U_n and V_n are independent as well. Also, U_n converges a.s. to zero, whereas V_n is an aperiodic, irreducible chain, which is positive recurrent in view of E[V_n(σ_{m+1}) − V_n(σ_m) | V_n(σ_m)] = 1 − p V_n(σ_m) a.s. Let π be its stationary distribution.
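The positive recurrence of V_n can be made concrete: taking expectations in the recursion V ↦ Binomial(V, 1 − p) + 1 gives E[V] = (1 − p)E[V] + 1 in equilibrium, so the stationary mean is exactly 1/p, which equals e for p = e^{−1}. A quick simulation sketch (our illustration, not part of the argument):

```python
import math, random

def thinned_chain_mean(p, steps, rng):
    """Long-run average of the chain V -> Binomial(V, 1-p) + 1,
    whose stationary mean is 1/p."""
    v, total = 1, 0.0
    for _ in range(steps):
        v = sum(rng.random() < 1 - p for _ in range(v)) + 1
        total += v
    return total / steps

rng = random.Random(3)
p = math.exp(-1)                 # the case Λ = p²·δ_p with p = 1/e
avg = thinned_chain_mean(p, 200000, rng)
print(avg, 1 / p)  # both are close to e ≈ 2.718
```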
Replacing λ by λ + n and letting n → ∞, we obtain a function f which is smooth with period 1. By our assumption that L_n converges in distribution as n → ∞, the function f does not depend on λ. To get a contradiction we compute its Fourier coefficients. They are given in terms of the characteristic function of a standard Gumbel random variable G, which equals φ(t) = Γ(1 − it), t ∈ R. Since the gamma function is known to possess no zeros in the complex plane, none of the Fourier coefficients of f vanishes. Therefore f is non-constant, and we arrive at the promised contradiction.
To prove Theorem 3, we will need to establish a version of this result which holds for processes that can be obtained by adding a small state-dependent negative drift to a subordinator.

Proposition 16. Let S = (S_t)_{t≥0} be a subordinator with E[S_1] = ∞, and let g : R → (0, ∞) be a nonincreasing function with
lim_{y→∞} g(y) = 0. (33)
For all z > 0, define the process (Y^z_t)_{t≥0} to be the solution to the SDE
Y^z_t = z − S_t + ∫_0^t g(Y^z_s) ds. (34)
For all y ∈ R, let τ^z_y = inf{t ≥ 0 : Y^z_t ≤ y}. Then for all real numbers K > 0, we have
lim_{z→∞} P(Y^z_{τ^z_K} ∈ [−K, K]) = 0. (35)
Equation (35) says that the probability that Y^z jumps over the interval [−K, K] tends to one as the starting point z → ∞.
Proof. We will prove this result by following some of the ideas from [3] in the proof of Blackwell's Renewal Theorem in the infinite mean case. Let β^z_K = P(Y^z_{τ^z_K} ∈ [−K, K]), and let β_K = lim sup_{z→∞} β^z_K. Seeking a contradiction, suppose β_K > 0 for some K. Because β_K is a nondecreasing function of K, it suffices to obtain a contradiction when K is chosen to be a sufficiently large positive integer. We will choose K to be large enough to satisfy the following four conditions: 1. We require g(K) < K, which is true for sufficiently large K by (33).

2. We require
for all positive integers ℓ. Note that (37) may fail for small values of K, in particular when S 1 has a lattice distribution, but will hold for sufficiently large K.
3. We require Note that this holds for sufficiently large K in view of (33) and the fact that t −1 S t → ∞ as t → ∞ by the Law of Large Numbers for subordinators.

4. Let
which tends to a finite limit as K → ∞ by (33). We require (40). If β_K > 0 for some K, then this condition holds for sufficiently large K by (33) and the fact that β_K is a nondecreasing function of K.
Because (35) does not depend on the behavior of the process after time τ z K , we may consider instead the processes (Z z t ) t≥0 , defined as the solution to the SDE The processes Z z and Y z are the same until time τ z K , which implies that However, after time τ K z the process Z z is no longer affected by the drift term involving g. Because g is nonincreasing, we have Z z t ≤ z − S t + g(K)t for all t ≥ 0. Therefore, (38) implies that Let U z denote the potential measure associated with the process Z z , meaning that for all Borel subsets A of R. Suppose z > K, and n > K is a positive integer. If the process Z z enters the interval (n − 1, n], then it drops below n − 2 after a time whose expectation is at most α K , and then by (42) and the strong Markov property, the probability that the process Z z never returns to (n − 1, n] is at least 1/2. It follows that Let 0 < H 1 < H 2 < . . . denote the points of a rate one Poisson process, independent of (S t ) t≥0 . Note that the process (Z z Hn ) ∞ n=1 has the same potential measure as (Z z t ) t≥0 , in the sense that for all Borel subsets A of R, We can choose an increasing sequence (z m ) ∞ m=1 tending to infinity such that It follows from (41) and the monotonicity of g that Let ε > 0. Choose a positive integer L large enough that P (S H 1 ≥ 2LK) < ε. By (33) we can choose a positive integer m 0 large enough that for all m ≥ m 0 This together with (45) implies for all For the following we also require that z m 0 − 2LK > K. Let µ z denote the distribution of Z z H 1 . By applying the strong Markov property at time H 1 , we get for m ≥ m 0 , Write It follows from (44) and (46) that By (33), for all ℓ ∈ {0, 1, . . . , L} we have It follows from (36) By taking ε → 0, we see that for any fixed nonnegative integer ℓ, we have Now we also see from (47) In view of (37) and (50) Fix a positive integer M . By (44) and (51), we can choose m sufficiently large that β zm K > 2β K /3 and for ℓ ∈ {1, . . . 
, 3M }, there exists a point x ℓ ∈ [z m − 2ℓK, z m − 2(ℓ − 1)K) such that β x ℓ K > 2β K /3. Set x 0 = z m . We now consider the processes Z x 0 , Z x 3 , Z x 6 , . . . , Z x 3M , which satisfy the stochastic differential equation (41) with the same driving subordinator but different initial values. For 1 ≤ ℓ ≤ M , we have Because g is nonincreasing, the processes Z x 3(ℓ−1) and Z x 3ℓ get closer together over time but do not cross, which means In view of (43), we get a telescoping sum, and Let D ℓ be the event that By Markov's Inequality and (54), It follows from (52) that on the event D ℓ , we have Z K , the process Z x 3ℓ is no longer affected by the drift term involving g, and thus it decreases at least as fast as Z x 3(ℓ−1) . It follows that on D ℓ , we have Z for all t ≥ 0, and thus the process Z x 3ℓ can not be in the interval [−(K + 1), K] at the same time as Z x 3(ℓ−1) or any other process Z x 3j with j < ℓ. Let The discussion above implies that the sets I ℓ are disjoint. Let Given the event D ℓ ∩ {Z τ x 3ℓ K ∈ [−K, K]}, the expected Lebesgue measure of I ℓ is at least κ. Therefore, using (55) and the fact that β x 3ℓ K > 2β K /3 followed by (40), we get On the event that Z x 3ℓ , because of (53), we have Z zm τ x 3ℓ K ≤ (8ℓ + 1)K. During the next time unit, the process Z zm can increase by at most g(K), so if t ∈ I ℓ , then using that g(K) < K, we get Z zm t ≤ (8ℓ + 1)K + g(K) ≤ 10ℓK.
We next note that if t ∈ I ℓ then Z zm and therefore if y ≥ 10K, then Because the process (Z zm Hn ) ∞ n=0 is decreasing after it drops below the level K, it can only jump below zero one time. In particular, the expected number of times the process jumps below zero is bounded above by one. Therefore, letting ν x denote the conditional distribution of Z zm Hn − Z zm H n+1 given Z Hn = x, we have Let µ denote the distribution of the random variable S H 1 − H 1 g(K). Because g is decreasing, we have ν x ([x, ∞)) ≥ µ([x, ∞)) for all x ≥ K. Therefore, Combining this result with (56) gives = ∞, so the right-hand side is bigger than one for sufficiently large positive integers M , a contradiction.
Proof of Theorem 3. Let K ≥ 2 be a positive integer. If 2 ≤ N_n(T_n−) ≤ K and the event in (14) holds, then
log N_n(T_n−) − ℓ ≤ Y_n(T_n−) ≤ log N_n(T_n−) + ℓ, (57)
and the left inequality holds with T_n− replaced by any t ∈ [0, T_n). In particular, putting K′ := ℓ + log K, we have
Y_n(t) ≥ log 2 − ℓ ≥ −K′ for all t ∈ [0, T_n). (58)
The right inequality in (57) says that Y_n(T_n−) ≤ K′. With z := log n we have Y_n(t) = Y^z_t in the notation of Proposition 16, hence τ^z_{K′} < T_n. Thus −K′ ≤ Y^z_{τ^z_{K′}} by (58). On the other hand we have Y^z_{τ^z_{K′}} ≤ K′ by definition, and consequently Y^z_{τ^z_{K′}} ∈ [−K′, K′]. Therefore, combining (35) and (14), we see that P(N_n(T_n−) ≤ K) → 0 as n → ∞, which proves Theorem 3.

Proof of Theorems 4 and 5
We prepare the proof of Theorem 4 with a few lemmas.
Lemma 17. Let $i \ge 2$ and $\varepsilon > 0$. Then there is a $k > i$ such that for all $n$
Proof. Without loss of generality $\Lambda(\{0\}) = 0$, because otherwise the coalescent comes down from infinity and the claim is immediate.
We estimate these probabilities. First denote $\sigma_j = \tau_{n/\kappa^j, n}$, $j = 0, 1, \ldots$, and let $r$ be the smallest integer such that $n/\kappa^r \le \ell$. Then
From Lemma 6 we have $E[\sigma_{j+1} - \sigma_j] \le C_\kappa$ for a suitable constant $C_\kappa$ depending on $\kappa$, thus
Thus, if we choose $\ell$ sufficiently large, we obtain
Second, on the event $B$ with $b = N_n(\tau_{\ell,n}-)$, we have
Recall that $\rho_{ij}$ denotes the rate for a jump of $N_n$ from state $i$ to $j$, and $\rho_i$ is the rate at which $N_n$ leaves $i$. Next, for $n \in \mathbb{N}$, let
$$\mu^{(n)}_i := \frac{1}{\rho_i}\, P(N_n(t) = i \text{ for some } t \ge 0).$$

Also let
$$P_{ij} := \frac{\rho_{ij}}{\rho_i}$$
be the transition probability from state $i$ to $j$ of the block-counting process of our $\Lambda$-coalescent.
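For a concrete choice of $\Lambda$ these quantities are explicitly computable. The sketch below (an illustration, not part of the paper) uses the Bolthausen–Sznitman coalescent, i.e. $\Lambda$ the uniform measure on $[0,1]$, for which the total rate of a given $k$-merger among $b$ blocks is $\binom{b}{k}\lambda_{b,k} = b/(k(k-1))$ and hence $\rho_b = b - 1$. It computes the hitting probabilities $\rho_i \mu^{(n)}_i = P(N_n \text{ visits } i)$ by recursion over the jump chain with transition probabilities $\rho_{ij}/\rho_i$.

```python
from fractions import Fraction

def rho_jump(b, i):
    # rate of a jump of the block-counting process from b blocks to i blocks;
    # for Bolthausen-Sznitman, binom(b,k) * lambda_{b,k} = b/(k(k-1)), k = b-i+1
    k = b - i + 1
    assert 2 <= k <= b
    return Fraction(b, k * (k - 1))

def hitting_probs(n):
    # h[i] = P(N_n(t) = i for some t >= 0) = rho_i * mu_i^{(n)};
    # the decreasing jump chain enters i from some j > i, and rho_j = j - 1 here
    h = {n: Fraction(1)}
    for i in range(n - 1, 0, -1):
        h[i] = sum(h[j] * rho_jump(j, i) / (j - 1) for j in range(i + 1, n + 1))
    return h

h = hitting_probs(10)
assert h[1] == 1                                  # absorption at one block is certain
mu = {i: h[i] / (i - 1) for i in range(2, 10)}    # mu_i^{(n)} = h[i] / rho_i
```

Exact rational arithmetic makes the consistency check $h[1] = 1$ hold with equality, since each state is visited at most once by the strictly decreasing chain.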
Lemma 18. Suppose that there are numbers $\mu_i$, $i \ge 2$, not all vanishing, such that for some increasing sequence $(n_m)_{m \ge 1}$ of natural numbers, as $m \to \infty$,
and therefore in the limit (along the specified sequence), by Fatou's lemma,
Second, for $2 \le i < k$,
$$P(L_n = i,\ N_n(t) \notin [i+1, k] \text{ for all } t \ge 0) = \sum_{j > k} P(N_n(t) = j \text{ for some } t \ge 0)\, P_{ji} P_{i1}.$$
Applying Lemma 17 to the left-hand term, it follows that for any $\varepsilon > 0$ there is a $k$ such that for all $n$ the corresponding sum over $j > k$ of the $\mu^{(n)}_j$ terms is at most $\varepsilon$. Therefore we may pass to the limit in this equation along the given subsequence and obtain
Thus $\mu$ is quasi-invariant.
Lemma 19. Let $\nu = (\nu_i)_{i \ge 2}$ be a quasi-invariant measure such that $\sum_{i \ge 2} \nu_i \rho_{i1} = 1$. Then for any integer $a \ge 1$ there are probability measures $\omega_a = (\omega_{i,a})_{1 \le i \le a}$ on $\{1, \ldots, a\}$ such that for any $i \ge 2$ we have $\omega_{i,a} \to 0$ as $a \to \infty$, and for $1 \le i \le a$
$$\nu_i = \sum_{n=i}^{a} \mu^{(n)}_i\, \omega_{n,a}.$$
Proof. Denote for $i, j \ge 1$
$$\hat P_{ij} := \frac{\nu_j \rho_{ji}}{\nu_i \rho_i},$$
where we set the undefined quantity $\nu_1 \rho_1$ equal to 1. Then the quasi-invariance and the norming of $\nu$ imply $\sum_j \hat P_{ij} = 1$ for $i \ge 1$. Thus we may consider the Markov chain $(\hat X_r)_{r=0,1,\ldots}$ on $\mathbb{N}$ with initial state $\hat X_0 = 1$ and transition matrix $(\hat P_{ij})$. We claim that it fulfils the equation
$$\nu_j \rho_j = P(\hat X_r = j \text{ for some } r), \qquad j \ge 1.$$
We show this claim by induction. For $j = 1$ both terms are equal to 1. Suppose that it holds for $1 \le i \le j - 1$. Then
$$P(\hat X_r = j \text{ for some } r) =$$
Next define for an integer $a > 1$ the random time $\xi_a := \max\{r \ge 0 : \hat X_r \le a\}$ and, for $1 \le i < a$, the probabilities $\eta_{ia} := P(\hat X_1 > a \mid \hat X_0 = i)$.
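The display in the induction step above can be reconstructed under the natural reading of the reversal kernel, $\hat P_{ij} = \nu_j \rho_{ji}/(\nu_i \rho_i)$ with $\nu_1 \rho_1 := 1$ (an assumption here, as the definition is elided in this excerpt). Since the chain $\hat X$ is strictly increasing, it visits each state at most once and can enter $j$ only from some $i < j$, so the induction hypothesis gives

```latex
P(\hat X_r = j \text{ for some } r)
  = \sum_{i=1}^{j-1} P(\hat X_r = i \text{ for some } r)\,\hat P_{ij}
  = \sum_{i=1}^{j-1} \nu_i \rho_i \,\frac{\nu_j \rho_{ji}}{\nu_i \rho_i}
  = \nu_j \sum_{i=1}^{j-1} \rho_{ji}
  = \nu_j \rho_j ,
```

using in the last step that $\rho_j = \sum_{i<j} \rho_{ji}$ is the total rate of leaving state $j$.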
Then for $a > 1$ and $1 = i_0 < i_1 < i_2 < \cdots < i_r \le a$,
$$P(\hat X_0 = i_0, \hat X_1 = i_1, \hat X_2 = i_2, \ldots, \hat X_r = i_r, \xi_a = r) = \hat P_{1 i_1} \hat P_{i_1 i_2} \cdots \hat P_{i_{r-1} i_r}\, \eta_{i_r a} = \omega_{i_r, a}\, P_{i_r i_{r-1}} \cdots P_{i_2 i_1} P_{i_1 1}$$
with $\omega_{i,a} := \nu_i \rho_i \eta_{ia}$ for $1 \le i < a$, and $i_r = 1$ in the case $r = 0$ (then both products of transition probabilities are set to be 1). For fixed $i$, summing over $1 < i_1 < i_2 < \cdots < i_r := i \le a$ and $r \ge 0$, we obtain the equality $P(\hat X_{\xi_a} = i) = P(\hat X_r = i \text{ for some } r)\,\eta_{ia} = \omega_{i,a}$, and thus $\sum_{1 \le i \le a} \omega_{i,a} = 1$. Therefore we may view the time-reversed process $Y_0 = \hat X_{\xi_a}, Y_1 = \hat X_{\xi_a - 1}, \ldots, Y_{\xi_a} = \hat X_0$ as a Markov chain on $\{1, \ldots, a\}$ with initial distribution $\omega_a$, transition probabilities $P_{ij}$, killed after reaching 1. This process coincides in distribution with the block-counting process of our original coalescent in discrete time, now with initial distribution $\omega_a$. This gives another way to express $\nu_i$: for $1 \le i < a$,
$$\rho_i \nu_i = P(Y_r = i \text{ for some } r \le \xi_a) = \sum_{n=i}^{a-1} \rho_i \mu^{(n)}_i \omega_{n,a},$$
which is (64). Also $\eta_{ia} \to 0$ as $a \to \infty$, which implies $\omega_{i,a} \to 0$. Thus the proof is finished.
In the limit $a \to \infty$, since $\omega_{n,a} \to 0$ for fixed $n$, we obtain $\nu_i \le \varepsilon$. Thus $\nu_i = 0$ for all $i \ge 2$, which is a contradiction. Hence there is no quasi-invariant measure.
(ii) Now by assumption there is an increasing sequence of natural numbers $n_m$, $m \ge 1$, such that, as $m \to \infty$,
for all $i \ge 2$ and for some $\alpha > 0$. From Lemma 18 it follows that $\mu_i := \pi_i/\rho_{i1}$ are the weights of a quasi-invariant measure $\mu$. Now let $\nu$ be any quasi-invariant measure. Again we may assume $\sum_{j \ge 2} \nu_j \rho_{j1} = 1$. By assumption we have, as $n \to \infty$,
Therefore from Lemma 19 it follows, by a similar argument as in the proof of (i), that, as $a \to \infty$,
This shows that $\nu$ is a multiple of $\mu$.
(iii) In the remaining situation, by means of a diagonal argument, there are two increasing sequences such that $\mu^{(n)}_i$ converges along both of them for all $i \ge 2$, but the limiting measures are not multiples of each other. Thus another application of Lemma 18 gives the claim. This finishes the proof.