A Spectral Decomposition for the Block Counting Process of the Bolthausen–Sznitman Coalescent

A spectral decomposition for the generator and the transition probabilities of the block counting process of the Bolthausen–Sznitman coalescent is derived. This decomposition is closely related to the Stirling numbers of the first and second kind. The proof is based on generating functions and exploits a certain factorization property of the Bolthausen–Sznitman coalescent. As an application we derive a formula for the hitting probability $h(i,j)$ that the block counting process of the Bolthausen–Sznitman coalescent ever visits state $j$ when started from state $i \ge j$. Moreover, explicit formulas are derived for the moments and the distribution function of the absorption time $\tau_n$ of the Bolthausen–Sznitman coalescent started in a partition with $n$ blocks. We provide an elementary proof of the well-known convergence of $\tau_n - \log\log n$ in distribution to the standard Gumbel distribution. It is shown that the speed of this convergence is of order $1/\log n$.


Introduction and results
The Bolthausen-Sznitman coalescent [2] is the particular Λ-coalescent $\Pi = (\Pi_t)_{t\ge 0}$ with Λ being the uniform distribution on the unit interval $[0,1]$. Λ-coalescents are continuous-time Markov processes with state space $\mathcal{P}$, the set of partitions of $\mathbb{N} := \{1, 2, \ldots\}$. During each transition, blocks merge to form a single block. For more information on Λ-coalescents we refer the reader to [14] and [15]. For $t \in [0,\infty)$ let $N_t$ denote the number of blocks of $\Pi_t$. It is well known that $(N_t)_{t\ge 0}$ is a Markov process, called the block counting process of Π. Let $Q = (q_{ij})_{i,j\in\mathbb{N}}$ denote the generator of $(N_t)_{t\ge 0}$. It is well known that $Q$ is a lower left triangular matrix with entries

$$q_{ij} = \frac{i}{(i-j)(i-j+1)} \qquad (1.1)$$

for $i > j$, $q_{ii} = -\sum_{j=1}^{i-1} q_{ij} = 1 - i$, and $q_{ij} = 0$ for $i < j$. The quantities $q_i := -q_{ii} = i - 1$, $i \in \mathbb{N}$, are called the total rates of the Bolthausen-Sznitman coalescent.
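To make the rates concrete, the following minimal sketch builds the upper-left corner of $Q$ in exact rational arithmetic and recovers the total rates $q_i = i - 1$ from the row sums. It assumes the standard Bolthausen-Sznitman rates $q_{ij} = i/((i-j)(i-j+1))$ for $i > j$; the helper name `generator` is ours and purely illustrative.

```python
from fractions import Fraction


def generator(n):
    """Upper-left n x n corner of Q for states 1..n (row/column index = state - 1)."""
    Q = [[Fraction(0)] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, i):
            # Assumed rate of a jump from i blocks down to j blocks.
            Q[i - 1][j - 1] = Fraction(i, (i - j) * (i - j + 1))
        # Diagonal entry q_ii = 1 - i, so that every row sums to zero.
        Q[i - 1][i - 1] = Fraction(1 - i)
    return Q


Q = generator(6)
# Total rate of state i = sum of the off-diagonal entries of row i.
total_rates = [sum(row[:idx]) for idx, row in enumerate(Q)]
print(total_rates)
```

Exact fractions are used so that the identity $\sum_{j<i} q_{ij} = i - 1$ can be checked without rounding error.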
The main result (Theorem 1.1) provides a spectral decomposition for $Q$, which is closely related to the Stirling numbers. The proof of Theorem 1.1 is given in Section 2.
Remark 1.2. For alternative formulas for $r_{ij}$ and $l_{ij}$ we refer the reader to (2.12), (2.13) and (2.15). The first entries of $R$ and $L$ are provided in (2.16).
The next corollary provides the spectral decomposition of the transition matrix of the block counting process $(N_t)_{t\ge 0}$ of the Bolthausen-Sznitman coalescent.
Corollary 1.3. (Spectral decomposition of the transition matrix) For all $t \in [0,\infty)$ the transition matrix $P(t) := (\mathbb{P}(N_t = j \mid N_0 = i))_{i,j\in\mathbb{N}}$ of the block counting process $(N_t)_{t\ge 0}$ of the Bolthausen-Sznitman coalescent has spectral decomposition $P(t) = Re^{tD}L$, i.e.

$$\mathbb{P}(N_t = j \mid N_0 = i) = \sum_{k=j}^{i} e^{-(k-1)t}\, r_{ik}\, l_{kj}, \qquad (1.3)$$

$i, j \in \mathbb{N}$, where $D$ is the diagonal matrix defined in Theorem 1.1 and $R = (r_{ij})_{i,j\in\mathbb{N}}$ and $L = (l_{ij})_{i,j\in\mathbb{N}}$ are defined via (1.2).
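As a numerical illustration of the spectral sum $p_{ij}(t) = \sum_{k=j}^{i} e^{-(k-1)t} r_{ik} l_{kj}$, the sketch below evaluates $P(t)$ on the states $1,\ldots,n$. It assumes the closed forms $r_{ik} = \frac{(k-1)!}{(i-1)!}|s(i,k)|$ and $l_{kj} = (-1)^{k-j}\frac{(j-1)!}{(k-1)!}S(k,j)$, which are our reading of the Stirling-number representation and are consistent with the asymptotics $r_{ij} \sim (\log i)^{j-1}$ stated later; the function names are hypothetical.

```python
from math import exp, factorial


def stirling_tables(n):
    """c[i][j] = |s(i,j)| and S[i][j] for 0 <= i, j <= n via the triangular recurrences."""
    c = [[0] * (n + 1) for _ in range(n + 1)]
    S = [[0] * (n + 1) for _ in range(n + 1)]
    c[0][0] = S[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            c[i][j] = (i - 1) * c[i - 1][j] + c[i - 1][j - 1]
            S[i][j] = j * S[i - 1][j] + S[i - 1][j - 1]
    return c, S


def transition_matrix(t, n):
    """P(t) on states 1..n from the spectral sum (floating point)."""
    c, S = stirling_tables(n)

    def r(i, k):  # assumed closed form for the entries of R
        return factorial(k - 1) * c[i][k] / factorial(i - 1)

    def l(k, j):  # assumed closed form for the entries of L
        return (-1) ** (k - j) * factorial(j - 1) * S[k][j] / factorial(k - 1)

    return [[sum(exp(-(k - 1) * t) * r(i, k) * l(k, j) for k in range(j, i + 1))
             for j in range(1, n + 1)]
            for i in range(1, n + 1)]


P = transition_matrix(0.7, 6)
row_sums = [sum(row) for row in P]
print(row_sums)
```

The alternating signs in $L$ cause cancellation, so for large $n$ a floating-point evaluation of this sum loses accuracy; exact rational arithmetic for the coefficients would be the safer choice there.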
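Remark 1.4. As a consequence of Corollary 1.3, the block counting process has resolvent (see, for example, Norris [13, p. 146]) $\int_0^\infty e^{-\lambda t} P(t)\,dt = R(\lambda I - D)^{-1}L$, $\lambda > 0$, with entries $\sum_{k=j}^{i} r_{ik} l_{kj}/(\lambda + k - 1)$, $i, j \in \mathbb{N}$.

As an application we provide in the following a formula for the probability that the block counting process ever visits state $j \in \mathbb{N}$ when started from state $i \in \mathbb{N}$ with $i \ge j$.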

Corollary 1.5. (Hitting probabilities)
The probability $h(i,j) := \mathbb{P}(N_t = j \text{ for some } t \ge 0 \mid N_0 = i)$ that the block counting process hits state $j \in \mathbb{N}$ when started from state $i \in \mathbb{N}$ with $i \ge j$ is given by $h(i,1) = 1$ and

$$h(i,j) = (j-1)\sum_{k=j}^{i} \frac{r_{ik}\, l_{kj}}{k-1}, \qquad 2 \le j \le i.$$

As a further application we provide in the following formulas for the moments, the distribution function and the Laplace transform of the absorption time $\tau_n$ until the Bolthausen-Sznitman coalescent, started in a partition with $n \in \mathbb{N}$ blocks, reaches its absorbing state. In the biological context $\tau_n$ is called the time back to the most recent common ancestor of a sample of size $n$.

The absorption time $\tau_n$ has distribution function

$$\mathbb{P}(\tau_n \le t) = \prod_{m=1}^{n-1} \frac{m - e^{-t}}{m} = \frac{\Gamma(n - e^{-t})}{\Gamma(n)\,\Gamma(1 - e^{-t})}, \qquad t \in (0,\infty), \qquad (1.5)$$

and Laplace transform … In particular, $\tau_n - \log\log n \to \tau$ in distribution as $n \to \infty$, where $\tau$ is standard Gumbel distributed with distribution function $F(t) := e^{-e^{-t}}$, $t \in \mathbb{R}$.

Remarks. 1. The convergence of $\tau_n - \log\log n$ to the standard Gumbel distribution is well known. Our alternative proof of this convergence is based on the last expression in (1.5) and is hence rather elementary: it follows from a straightforward application of Stirling's formula for $\Gamma(x)$ as $x \to \infty$.
2. Let $\Psi(x) := (d/dx)\log\Gamma(x) = \Gamma'(x)/\Gamma(x)$ denote the digamma function (the logarithmic derivative of the Gamma function). Taking the derivative with respect to $t$ in (1.5), it is readily seen that for $n \ge 2$ the absorption time $\tau_n$ has density

$$f_{\tau_n}(t) = e^{-t}\,\mathbb{P}(\tau_n \le t)\,\bigl(\Psi(n - e^{-t}) - \Psi(1 - e^{-t})\bigr), \qquad t \in (0,\infty).$$

Note that, for $n \ge 2$, $\mathbb{P}(\tau_n \le t) \sim t/(n-1)$ as $t \downarrow 0$ and, hence, $\lim_{t\downarrow 0} f_{\tau_n}(t) = 1/(n-1)$. Moreover, straightforward calculations show that $f_{\tau_n}(t + \log\log n) \to f(t)$ as $n \to \infty$, where $f$ is the density of the standard Gumbel distribution, i.e. $f(t) := F'(t) = e^{-t}F(t)$ for all $t \in \mathbb{R}$. Note that this local convergence holds uniformly in $t \in \mathbb{R}$ due to the inversion formula for densities, so we have $\lim_{n\to\infty} \sup_{t\in\mathbb{R}} |f_{\tau_n}(t + \log\log n) - f(t)| = 0$.
3. The fact that the distribution function (1.5) of the absorption time $\tau_n$ is known explicitly can be further exploited. For example, it is readily checked that for all $x \in \mathbb{R}$, $\mathbb{P}(\tau_n - \log\log n \le x) - F(x) \sim -\gamma e^{-x}F(x)/\log n$ as $n \to \infty$, where $\gamma \approx 0.577216$ denotes Euler's constant. Thus, the speed of the convergence of $\tau_n - \log\log n$ to the Gumbel distribution is of order $1/\log n$.
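The rate statement can be probed numerically. The sketch below evaluates the distribution function of $\tau_n$ through the Gamma-function form $\Gamma(n - e^{-t})/(\Gamma(n)\Gamma(1 - e^{-t}))$, which equals the product $\prod_{m=1}^{n-1}(m - e^{-t})/m$, and compares the error scaled by $\log n$ with the claimed limit $-\gamma e^{-x}F(x)$. The function names are ours; the code assumes $t > 0$, i.e. $x > -\log\log n$.

```python
from math import exp, lgamma, log

EULER_GAMMA = 0.5772156649015329


def cdf_tau(n, t):
    """P(tau_n <= t) for t > 0, computed on the log scale for stability."""
    a = exp(-t)
    return exp(lgamma(n - a) - lgamma(n) - lgamma(1.0 - a))


def gumbel_cdf(x):
    """Standard Gumbel distribution function F(x) = exp(-exp(-x))."""
    return exp(-exp(-x))


def scaled_error(n, x):
    """(P(tau_n - log log n <= x) - F(x)) * log n."""
    return (cdf_tau(n, x + log(log(n))) - gumbel_cdf(x)) * log(n)


def limiting_error(x):
    """Claimed limit -gamma * exp(-x) * F(x) of the scaled error."""
    return -EULER_GAMMA * exp(-x) * gumbel_cdf(x)


for n in (10**3, 10**6):
    print(n, scaled_error(n, 0.0), limiting_error(0.0))
```

Since the next correction term is again of relative order $1/\log n$, the agreement improves only slowly with $n$. For very large $n$ the difference `lgamma(n - a) - lgamma(n)` suffers from cancellation in double precision, so this sketch should not be pushed far beyond $n \approx 10^{8}$.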
Final remark. For the Kingman coalescent the spectral decomposition $Q = RDL$ is well known (see Appendix). For the star-shaped coalescent the spectral decomposition $Q = RDL$ can be readily calculated and is omitted here. Finding the spectral decomposition of the generator $Q$ of the block counting process for other exchangeable coalescents, for example for the Λ-coalescent with $\Lambda = \beta(2-\alpha, \alpha)$ being the beta distribution with parameters $2 - \alpha$ and $\alpha$, $\alpha \in (0,2) \setminus \{1\}$, is an open problem. Note that the spectral decomposition $Q = RDL$ exists for any exchangeable coalescent, since the generator $Q$ is lower left triangular. More precisely, $R$ and $L$ are recursively defined via the first equation in (2.3) and (2.4) respectively, where $q_{ij}$ and $q_i$ are the rates and total rates of the exchangeable coalescent.

Proofs
Before proving Theorem 1.1, we mention two well-known recursions for the Stirling numbers. These recursions are provided here, since they will be used in the proof of Theorem 1.1.
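Lemma 2.1. The absolute Stirling numbers of the first kind satisfy the recursion

$$|s(i,j)| = (i-1)! \sum_{k=j-1}^{i-1} \frac{|s(k,j-1)|}{k!}, \qquad i, j \in \mathbb{N}, \qquad (2.1)$$

whereas the Stirling numbers of the second kind satisfy the recursion

$$S(i,j) = \sum_{k=j-1}^{i-1} \binom{i-1}{k} S(k,j-1), \qquad i, j \in \mathbb{N}. \qquad (2.2)$$

Proof. Let us verify (2.1) by induction on $i$. Obviously, (2.1) holds for $i = 1$, since $|s(1,j)| = \delta_{j1} = |s(0,j-1)|$ for all $j \in \mathbb{N}$. The induction step from $i$ to $i + 1$ works as follows. By the recursion $|s(i+1,j)| = i\,|s(i,j)| + |s(i,j-1)|$ and by induction,

$$|s(i+1,j)| = i! \sum_{k=j-1}^{i-1} \frac{|s(k,j-1)|}{k!} + |s(i,j-1)| = i! \sum_{k=j-1}^{i} \frac{|s(k,j-1)|}{k!}.$$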
We verify (2.2) as follows. There are $\binom{i-1}{k}$ possibilities to choose a subset $A$ of size $k$ from $\{1, \ldots, i-1\}$ and there are $S(k,j-1)$ possibilities to partition this subset $A$ into $j - 1$ nonempty subsets. Together with the nonempty set $(\{1, \ldots, i-1\} \setminus A) \cup \{i\}$ one obtains, after summing over all possible values of $k$, the number $S(i,j)$ of partitions of $\{1, \ldots, i\}$ into $j$ nonempty subsets.
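Both recursions are easy to sanity-check by machine. The sketch below tabulates $|s(i,j)|$ and $S(i,j)$ with the usual triangular recurrences and then tests the two summation identities in the forms $|s(i,j)| = (i-1)!\sum_{k=j-1}^{i-1}|s(k,j-1)|/k!$ and $S(i,j) = \sum_{k=j-1}^{i-1}\binom{i-1}{k}S(k,j-1)$. These explicit forms are our reading of (2.1) and (2.2), chosen to agree with the base case and the counting argument in the proof; the function names are illustrative.

```python
from math import comb, factorial


def stirling_tables(n):
    """c[i][j] = |s(i,j)| and S[i][j] for 0 <= i, j <= n via the triangular recurrences."""
    c = [[0] * (n + 1) for _ in range(n + 1)]
    S = [[0] * (n + 1) for _ in range(n + 1)]
    c[0][0] = S[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            c[i][j] = (i - 1) * c[i - 1][j] + c[i - 1][j - 1]
            S[i][j] = j * S[i - 1][j] + S[i - 1][j - 1]
    return c, S


def check_recursions(n):
    """True if both summation recursions hold for all 1 <= j <= i <= n."""
    c, S = stirling_tables(n)
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            # (i-1)!/k! is an integer for k <= i-1, so integer division is exact.
            first = sum(c[k][j - 1] * (factorial(i - 1) // factorial(k))
                        for k in range(j - 1, i))
            second = sum(comb(i - 1, k) * S[k][j - 1] for k in range(j - 1, i))
            if c[i][j] != first or S[i][j] != second:
                return False
    return True


print(check_recursions(12))
```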
Let us now turn to the proof of the spectral decomposition $Q = RDL$ (see Theorem 1.1). The following proof is based on generating functions and has the advantage that the solution (1.2) for $R$ and $L$ does not need to be known in advance in order to perform the proof. The solution (1.2) arises naturally during the proof as the coefficients of appropriately chosen generating functions. Crucial for the proof is the factorization formula (2.5), which essentially goes back to the particular form of the rates $q_{ij}$ in (1.1).
Proof. (of Theorem 1.1) Let $D = (d_{ij})_{i,j\in\mathbb{N}}$ be the diagonal matrix with entries $d_{ii} := q_{ii} = -q_i = 1 - i$, $i \in \mathbb{N}$, and $d_{ij} := 0$ for $i \ne j$. Furthermore, let $R = (r_{ij})_{i,j\in\mathbb{N}}$ be the lower left triangular matrix with entries recursively defined for each $j \in \mathbb{N}$ via $r_{jj} := 1$ and

$$r_{ij} := \frac{1}{q_i - q_j} \sum_{k=j}^{i-1} q_{ik}\, r_{kj}, \qquad i \in \{j+1, j+2, \ldots\}. \qquad (2.3)$$

Since $q_{ii} = -q_i$, $i \in \mathbb{N}$, we conclude that $r_{ij} q_{jj} = \sum_{k=j}^{i} q_{ik} r_{kj}$. Thus, the entries $r_{ij}$ of $R$ are defined such that $RD = QR$. Define $L := R^{-1}$. Then, the spectral decomposition $Q = RDL$ holds. Moreover, $DL = LQ$ and, hence, $q_{ii} l_{ij} = \sum_{k=j}^{i} l_{ik} q_{kj}$, $i, j \in \mathbb{N}$. Since $q_{ii} = -q_i$, $i \in \mathbb{N}$, we obtain for each $i \in \mathbb{N}$ the backward recursion $l_{ii} = 1$ and

$$l_{ij} = -\frac{1}{q_i - q_j} \sum_{k=j+1}^{i} l_{ik}\, q_{kj}, \qquad j \in \{i-1, i-2, \ldots, 1\}. \qquad (2.4)$$

Let us verify by induction on $i$ ($\ge j$) that … with the convention that empty products are equal to 1. For $i = j$ this equality obviously holds, since $r_{jj} = 1$. The induction step from $\{j, \ldots, i-1\}$ to $i$ ($> j$) works as follows. By (2.3) and by induction, … $q_l/(q_l - q_j)$, which completes the induction.

ECP 19 (2014), paper 47.
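The two recursions translate directly into code. The following sketch computes $R$ by the forward recursion $r_{ij} = (q_i - q_j)^{-1}\sum_{k=j}^{i-1} q_{ik} r_{kj}$ and $L$ by the backward recursion $l_{ij} = -(q_i - q_j)^{-1}\sum_{k=j+1}^{i} l_{ik} q_{kj}$, and then checks $RL = I$ and $RDL = Q$ exactly. It assumes the rates $q_{ij} = i/((i-j)(i-j+1))$ for $i > j$, so that $q_i - q_j = i - j$; all names are hypothetical.

```python
from fractions import Fraction


def q(i, j):
    """Assumed generator entry q_ij of the block counting process."""
    if i == j:
        return Fraction(1 - i)
    return Fraction(i, (i - j) * (i - j + 1)) if i > j else Fraction(0)


def right_left(n):
    """R via the forward recursion and L via the backward recursion (1-based, row/col 0 unused)."""
    R = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    L = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        R[j][j] = Fraction(1)
        for i in range(j + 1, n + 1):
            # q_i - q_j = (i - 1) - (j - 1) = i - j
            R[i][j] = sum(q(i, k) * R[k][j] for k in range(j, i)) / (i - j)
    for i in range(1, n + 1):
        L[i][i] = Fraction(1)
        for j in range(i - 1, 0, -1):
            L[i][j] = -sum(L[i][k] * q(k, j) for k in range(j + 1, i + 1)) / (i - j)
    return R, L


def check_decomposition(n):
    """True if R L = I and R D L = Q hold exactly on states 1..n."""
    R, L = right_left(n)
    idx = range(1, n + 1)
    for i in idx:
        for j in idx:
            rl = sum(R[i][k] * L[k][j] for k in idx)
            rdl = sum(R[i][k] * (1 - k) * L[k][j] for k in idx)
            if rl != (1 if i == j else 0) or rdl != q(i, j):
                return False
    return True


R, L = right_left(7)
print(check_decomposition(7), R[3][2], L[4][3])
```

Because $Q$ is lower triangular, the truncation to the states $1,\ldots,n$ is exact: the entries of $R$ and $L$ with indices at most $n$ do not depend on $n$.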
Remark 2.2. 1. Let us provide some additional information on $R$ and $L$. Plugging in the formula for the Stirling numbers of the first kind into the first formula in (1.2) leads to (2.12). In particular, $r_{i2} = h_{i-1}$, where $h_0 := 0$ and $h_n := \sum_{k=1}^{n} 1/k$ denotes the $n$-th harmonic number, $n \in \mathbb{N}$. Thus, $r_{i2} \sim \log i$ as $i \to \infty$. More generally, for arbitrary but fixed $j \in \mathbb{N}$ the absolute Stirling numbers of the first kind $|s(i,j)|$ satisfy the asymptotics $|s(i,j)| \sim (i-1)!\,(\log i)^{j-1}/(j-1)!$ as $i \to \infty$, see for example [1, p. 824]. Thus, for each $j \in \mathbb{N}$ we conclude from (1.2) that $r_{ij} \sim (\log i)^{j-1}$ as $i \to \infty$. For each $j \in \mathbb{N}$ it follows from the middle expression in (2.8) that the sequence $(r_{ij})_{i\in\mathbb{N}}$ is monotone increasing in $i$. Plugging (2.10) into the second formula in (1.2) leads to a further formula for $l_{ij}$. Alternatively one may also plug the formula for the Stirling numbers of the second kind into the second formula in (1.2), leading to (2.15). Note that formula (2.15) for $l_{ij}$ has structural similarities with formula (2.12) for $r_{ij}$.
These similarities point to the spectral decomposition of the partition-valued Bolthausen-Sznitman $n$-coalescent, which will be provided in [9].
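2. Eqs. (2.12), (2.13) and (2.15) are useful to compute the entries of $R$ and $L$ numerically. One obtains … (2.16).

Proof. (of Corollary 1.3) By Theorem 1.1, the generator has spectral decomposition $Q = RDL$ with $L = R^{-1}$ and with $R$ and $L$ given by (1.2). Therefore, the spectral decomposition of the transition matrix is $P(t) = e^{tQ} = e^{tRDL} = Re^{tD}L$. Thus, $p_{ij}(t) := \mathbb{P}(N_t = j \mid N_0 = i) = \sum_{k=j}^{i} e^{-q_k t}\, r_{ik}\, l_{kj}$ for all $i, j \in \mathbb{N}$. The last equality in (1.3) follows from (1.2).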
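Proof. (of Corollary 1.5) The Green matrix $G$ of $(N_t)_{t\ge 0}$ is given by $G := \int_0^\infty P(t)\,dt$, cf. [13, p. 145]. Writing $G = (g(i,j))_{i,j\in\mathbb{N}}$ and using the spectral decomposition of $P(t)$ provided in Corollary 1.3 we obtain for $2 \le j \le i$

$$g(i,j) = \sum_{k=j}^{i} r_{ik}\, l_{kj} \int_0^\infty e^{-q_k t}\,dt = \sum_{k=j}^{i} \frac{r_{ik}\, l_{kj}}{k-1},$$

which can be expressed in terms of the Stirling numbers via (1.2). We have $g(i,j) = h(i,j)/(q_j(1 - f_j))$, where $h(i,j)$ is the probability of hitting $j$ starting from $i$ and $f_j$ is the return probability for $j$. Since the block counting process has non-increasing paths, it never returns to a state $j \ge 2$ after leaving it, so $f_j = 0$ and hence $h(i,j) = q_j\, g(i,j) = (j-1)\, g(i,j)$ for $2 \le j \le i$.
Clearly, $h(i,1) = 1$, since the block counting process, started at $i \in \mathbb{N}$, reaches its absorbing state 1 almost surely.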
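The hitting probabilities can be cross-checked against a first-step analysis of the jump chain, $h(i,j) = \sum_{k=j}^{i-1} (q_{ik}/q_i)\, h(k,j)$ with $h(j,j) = 1$. The sketch below compares this with the spectral expression $h(i,j) = (j-1)\sum_{k=j}^{i} r_{ik} l_{kj}/(k-1)$, in exact arithmetic. It assumes the rates $q_{ik} = i/((i-k)(i-k+1))$ and the closed forms $r_{ik} = \frac{(k-1)!}{(i-1)!}|s(i,k)|$, $l_{kj} = (-1)^{k-j}\frac{(j-1)!}{(k-1)!}S(k,j)$; these are our reading of the Stirling-number representation, and the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def stirling_tables(n):
    """c[i][j] = |s(i,j)| and S[i][j] for 0 <= i, j <= n via the triangular recurrences."""
    c = [[0] * (n + 1) for _ in range(n + 1)]
    S = [[0] * (n + 1) for _ in range(n + 1)]
    c[0][0] = S[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            c[i][j] = (i - 1) * c[i - 1][j] + c[i - 1][j - 1]
            S[i][j] = j * S[i - 1][j] + S[i - 1][j - 1]
    return c, S


def hitting_spectral(i, j, c, S):
    """h(i,j) from the spectral sum (assumed closed forms for r_ik and l_kj)."""
    if j == 1:
        return Fraction(1)
    total = Fraction(0)
    for k in range(j, i + 1):
        r_ik = Fraction(factorial(k - 1) * c[i][k], factorial(i - 1))
        l_kj = Fraction((-1) ** (k - j) * factorial(j - 1) * S[k][j], factorial(k - 1))
        total += r_ik * l_kj / (k - 1)
    return (j - 1) * total


def hitting_first_step(n):
    """h[i][j] for 1 <= j <= i <= n by conditioning on the first jump from i."""
    h = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        h[j][j] = Fraction(1)
        for i in range(j + 1, n + 1):
            h[i][j] = sum(Fraction(i, (i - k) * (i - k + 1)) * h[k][j]
                          for k in range(j, i)) / (i - 1)
    return h


n = 8
c, S = stirling_tables(n)
h = hitting_first_step(n)
agree = all(hitting_spectral(i, j, c, S) == h[i][j]
            for i in range(1, n + 1) for j in range(1, i + 1))
print(agree, h[3][2], h[4][2])
```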
Remark 2.3. Alternatively, the product formula (1.5) for the distribution function of $\tau_n$ follows from the construction of the Bolthausen-Sznitman coalescent via the random recursive tree due to Goldschmidt and Martin [6] as follows. Recall that a random recursive tree with $n$ vertices is a uniformly chosen tree from the set of labelled trees whose labels increase along the branches from the root, which is labelled 1. A recursive construction is the following. Start with the tree consisting of a single vertex labelled 1. When the tree with $n - 1$ vertices is built, the $n$th vertex is attached by an edge to a uniformly chosen vertex with label in $\{1, \ldots, n-1\}$. The Bolthausen-Sznitman coalescent is then obtained (see [6]) by cutting down this tree, and, in this construction, the event $\{\tau_n \le t\}$ coincides with $\bigcap_{i=2}^{n} (\{i \text{ is a child of the root}, e_i \le t\} \cup \{i \text{ is not a child of the root}\})$, where $e_2, \ldots, e_n$ are independent standard exponential random variables. These events, indexed by $i$, are furthermore independent. Since vertex $i$ is a child of the root with probability $1/(i-1)$, formula (1.5) now follows by the straightforward calculation

$$\mathbb{P}(\tau_n \le t) = \prod_{i=2}^{n} \Bigl( \frac{1 - e^{-t}}{i-1} + 1 - \frac{1}{i-1} \Bigr) = \prod_{i=2}^{n} \Bigl( 1 - \frac{e^{-t}}{i-1} \Bigr).$$

The following third approach derives formula (1.5) via the Chinese restaurant process. It is known (see [14,Theorem 14]) that $\Pi_t$ is PD$(e^{-t}, 0)$-distributed. On the other hand the distribution PD$(\alpha, \theta)$ is obtained by considering the ranked frequencies of the partition of an $(\alpha, \theta)$-Chinese restaurant process. Thus, the event $\{\tau_n \le t\}$ coincides with {the first $n$ customers in a $(e^{-t}, 0)$-Chinese restaurant sit at table 1}.
Since any new customer joins the $m_k > 0$ customers at table $k$ with probability $(m_k - e^{-t})/m$, where $m := \sum_k m_k$ denotes the number of customers already present, we obtain

$$\mathbb{P}(\tau_n \le t) = \prod_{m=1}^{n-1} \frac{m - e^{-t}}{m},$$

as required.

Appendix
For completeness we provide the spectral decomposition of the block counting process of the Kingman coalescent, which is the particular Λ-coalescent with $\Lambda = \delta_0$ being the Dirac measure at 0. The block counting process $(N_t)_{t\ge 0}$ of the Kingman coalescent is a pure death process with total rates $q_i = i(i-1)/2$, $i \in \mathbb{N}$. The generator $Q$ of $(N_t)_{t\ge 0}$ has spectral decomposition (see, for example, [ … ) $Q = RDL$, where … and

$$l_{ij} = (-1)^{i-j}\, \frac{(i-1)!\, i!\, (i+j-2)!}{(j-1)!\, j!\, (i-j)!\, (2i-2)!}, \qquad i \ge j.$$
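The surviving closed form can be checked against the eigenvector equation $LQ = DL$ for the Kingman generator, which has entries $q_{i,i-1} = i(i-1)/2$ and $q_{ii} = -i(i-1)/2$. The sketch below assumes that the displayed expression is the entry $l_{ij}$ of the left eigenvector matrix $L$; this is our interpretation (it has unit diagonal and alternating signs), and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def kingman_left(i, j):
    """Assumed entry l_ij of L for the Kingman block counting process (0 for j > i)."""
    if j > i:
        return Fraction(0)
    num = factorial(i - 1) * factorial(i) * factorial(i + j - 2)
    den = factorial(j - 1) * factorial(j) * factorial(i - j) * factorial(2 * i - 2)
    return Fraction((-1) ** (i - j) * num, den)


def rate(i):
    """Total rate q_i = i(i-1)/2 of the Kingman coalescent."""
    return Fraction(i * (i - 1), 2)


def check_left_eigenvectors(n):
    """True if (L Q)_ij = -q_i l_ij for all 1 <= j <= i <= n."""
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            # Column j of Q has only two nonzero entries: q_jj and q_{j+1,j}.
            lq = -kingman_left(i, j) * rate(j) + kingman_left(i, j + 1) * rate(j + 1)
            if lq != -rate(i) * kingman_left(i, j):
                return False
    return True


print(check_left_eigenvectors(10), kingman_left(3, 1), kingman_left(3, 2))
```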

Corollary 1.7.
(Moments of the absorption time) The absorption time $\tau_n$ of the Bolthausen-Sznitman coalescent has moments

Lemma 2.1.
The absolute Stirling numbers of the first kind satisfy the recursion