On the Moment-Transfer Approach for Random Variables Satisfying a One-Sided Distributional Recurrence

The moment-transfer approach is a standard tool for deriving limit laws of sequences of random variables satisfying a distributional recurrence. However, so far the approach could not be applied to certain "one-sided" recurrences with slowly varying moments and normal limit law. In this paper, we propose a modified version of the moment-transfer approach which can be applied to such recurrences. Moreover, we demonstrate the usefulness of our approach by re-deriving several recent results in an almost automatic fashion.


Introduction
In combinatorics and computer science, one often encounters sequences of random variables which satisfy a distributional recurrence. For instance, the following recurrence arises in the analysis of quicksort (see Hwang and Neininger [13] for background): let $X_n$ be a sequence of random variables satisfying

$$X_n \stackrel{d}{=} X_{I_n} + X^*_{n-1-I_n} + T_n, \qquad (n \ge 1), \tag{1}$$

where $X_0 = 0$, $I_n = \mathrm{Uniform}\{0, \ldots, n-1\}$, $X_n \stackrel{d}{=} X^*_n$, and $(I_n)_{n\ge 1}$, $(X_n)_{n\ge 0}$, $(X^*_n)_{n\ge 0}$, $(T_n)_{n\ge 1}$ are independent. One is then normally interested in properties such as the asymptotic behavior of mean and variance, as well as deeper properties such as limit laws, rates of convergence, etc.
As for limit laws, the so-called moment-transfer approach has evolved into a major tool in recent years. Since in this work we only consider sequences of random variables with normal limit law, we explain the approach for this special case (for the general case see [13]). Roughly speaking, the approach consists of the following steps (see Figure 1): first, one observes that all moments (centered or non-centered) of $X_n$ satisfy a recurrence of the same type (the so-called underlying recurrence). For instance, the underlying recurrence for $X_n$ above is given by

$$a_n = \frac{2}{n}\sum_{j=0}^{n-1} a_j + b_n, \qquad (n \ge 1),$$

where $a_0 = 0$ and $b_n$ is a given sequence (called the toll sequence). Second, one derives general results that link the asymptotic behavior of $b_n$ to that of $a_n$ (called transfer theorems). Third, one uses the transfer theorems to obtain an asymptotic expansion of the mean. Fourth, one derives the recurrences for the central moments (this step is called shifting the mean). Fifth, one uses the transfer theorems together with the expansion of the mean to derive an asymptotic expansion of the variance. Sixth, one uses induction to derive the first-order asymptotics of all higher moments (the last two steps can actually be merged; however, one normally needs the variance in order to guess the first-order asymptotics of all higher moments). Finally, weak convergence follows from the Fréchet-Shohat theorem (see Lemma 1.43 in Elliott [7]). This approach has been used to treat numerous examples; see Chern et al. [3] and the survey article of Hwang [12] for many recent references.
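As a quick numerical sanity check of the underlying recurrence (the following script and all names in it are ours, purely illustrative), one can iterate $a_n = \frac{2}{n}\sum_{j<n} a_j + b_n$ with the quicksort toll $b_n = n-1$ in exact arithmetic and compare with the classical closed form $2(n+1)H_n - 4n$ for the mean number of comparisons:

```python
from fractions import Fraction

def solve_underlying(b, N):
    """Iterate a_n = (2/n) * sum_{j<n} a_j + b(n) with a_0 = 0, exactly."""
    a, s = [Fraction(0)], Fraction(0)  # s carries the prefix sum a_0 + ... + a_{n-1}
    for n in range(1, N + 1):
        s += a[-1]
        a.append(Fraction(2, n) * s + b(n))
    return a

N = 200
a = solve_underlying(lambda n: Fraction(n - 1), N)  # toll of the quicksort mean

# Harmonic numbers H_n, needed for the classical closed form 2(n+1)H_n - 4n.
H = [Fraction(0)]
for j in range(1, N + 1):
    H.append(H[-1] + Fraction(1, j))
```

Exact rational arithmetic avoids any floating-point doubt about the identity.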
Overall, the main ingredients of the moment-transfer approach are the transfer theorems, the remaining steps being almost automatic. However, perhaps surprisingly, the approach does not work for some sequences of random variables satisfying particularly simple distributional recurrences. One such example is given by the one-sided variant of (1). More precisely, let $X_n$ be a sequence of random variables satisfying

$$X_n \stackrel{d}{=} X_{I_n} + 1, \qquad (n \ge 1), \tag{2}$$

where $X_0 = 0$ and $I_n = \mathrm{Uniform}\{0, \ldots, n-1\}$ with $(I_n)_{n\ge 1}$ and $(X_n)_{n\ge 0}$ independent.
We provide some more details to illuminate where the approach fails. Therefore, observe that the underlying recurrence is given by

$$a_n = \frac{1}{n}\sum_{j=0}^{n-1} a_j + b_n, \qquad (n \ge 1), \tag{3}$$

where $a_0 = 0$ and $b_n$ is a given sequence. The next step is to obtain transfer theorems. For our crude purpose the following transfer theorems are enough: for $\alpha$ a non-negative integer, we have (i) $b_n \sim \log^{\alpha} n \implies a_n \sim \log^{\alpha+1} n/(\alpha+1)$; (ii) $b_n = O(\log^{\alpha} n) \implies a_n = O(\log^{\alpha+1} n)$ (these and more precise results will be proved in the next section; see also [13]). Now, the mean $E(X_n)$ satisfies (3) with $b_n = 1$. Hence, by transfer (i) above, $E(X_n) \sim \log n$. Next, we shift the mean. Therefore, let $A^{[r]}_n = E(X_n - E(X_n))^r$. Then, conditioning on $I_n$ and expanding binomially,

$$A^{[r]}_n = \frac{1}{n}\sum_{j=0}^{n-1} A^{[r]}_j + B^{[r]}_n, \qquad B^{[r]}_n = \frac{1}{n}\sum_{j=0}^{n-1}\sum_{k=0}^{r-1}\binom{r}{k} A^{[k]}_j \big(1 + E(X_j) - E(X_n)\big)^{r-k}. \tag{4}$$

We first treat the variance, which is obtained by setting $r = 2$. This yields

$$B^{[2]}_n = \frac{1}{n}\sum_{j=0}^{n-1}\big(1 + E(X_j) - E(X_n)\big)^2 \sim \int_0^1 (1+\log x)^2\,dx = 1,$$

where we have used the asymptotics of the mean. Hence, again by transfer (i) above, $\mathrm{Var}(X_n) \sim \log n$. Finally, we generalize the latter argument to obtain the first-order asymptotics of all central moments. Since we want to show a central limit theorem, we have to prove that for all $m \ge 0$,

$$A^{[2m]}_n \sim \frac{(2m)!}{2^m m!}\log^m n, \qquad A^{[2m+1]}_n = o\big(\log^{m+1/2} n\big). \tag{5}$$

Note that the claim holds for $m = 0$. As for the induction step, assume that the claim is proved for all $m' < m$. Then, in order to prove it for $m$, we first look at the toll sequence. In the even case, we have

$$B^{[2m]}_n = \frac{1}{n}\sum_{j=0}^{n-1}\sum_{k=0}^{2m-1}\binom{2m}{k} A^{[k]}_j \big(1 + E(X_j) - E(X_n)\big)^{2m-k}.$$

We first consider the term with $k = 2m-1$. Here, the induction hypothesis only yields the bound $o(\log^{m-1/2} n)$, and hence the bound $o(\log^{m+1/2} n)$ for the corresponding contribution to $A^{[2m]}_n$, which exceeds the claimed order $\log^m n$. This is, however, not strong enough to imply our claim. A similar problem occurs as well when considering odd central moments.
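To make transfer (i) concrete, the recurrence (3) can be iterated numerically (a small illustrative script; all names are ours): with $b_n = 1$ the solution is exactly the harmonic number $H_n \sim \log n$, and with $b_n = \log n$ the ratio $a_n/(\log^2 n/2)$ drifts only slowly towards 1, which also illustrates how slowly varying these sequences are.

```python
import math

def solve_one_sided(b, N):
    """Iterate a_n = (1/n) * sum_{j<n} a_j + b(n) with a_0 = 0."""
    a, s = [0.0], 0.0
    for n in range(1, N + 1):
        s += a[-1]
        a.append(s / n + b(n))
    return a

N = 10**5
a0 = solve_one_sided(lambda n: 1.0, N)          # alpha = 0: a_n = H_n ~ log n
a1 = solve_one_sided(lambda n: math.log(n), N)  # alpha = 1: a_n ~ log^2(n)/2

def ratio(n):
    return a1[n] / (math.log(n) ** 2 / 2)
```

The slow convergence for $\alpha = 1$ is typical for the log-scale asymptotics throughout this paper.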
The reason why the approach fails is that the term with $k = 2m-1$ has in fact a smaller order due to an additional cancellation. This cancellation cannot, however, be detected if one only assumes (5). A similar problem arises in a great number of examples, all of them having the common feature that the distributional recurrence is "one-sided", the moments are slowly varying, and the limit law is normal. Here, the "one-sidedness" arises from the underlying process, which solves a problem by breaking it into two parts, throwing one part away, and proceeding only with the other one (many examples will be given below). In some particular examples, the moment-transfer approach still applies; see Bagchi and Pal [1]. However, in these cases the reason why the approach still works seems to be problem-specific. Here, we aim for a general method which can be universally applied to a great number of examples, all of them exhibiting the same phenomenon as above. For essentially the same class of examples, Neininger and Rüschendorf already proposed a refinement of their contraction method in [18]. So, our work can be regarded as an analogue of [18] for the moment-transfer approach.
Apart from the moment-transfer approach and the contraction method, there are also many other approaches which can be used to prove asymptotic normality. For instance, for (2), a straightforward computation shows that the moment generating function $P_n(t) = E(e^{tX_n})$ is given by

$$P_n(t) = \frac{\Gamma(n + e^t)}{\Gamma(e^t)\,\Gamma(n+1)},$$

where $\Gamma(z)$ denotes the $\Gamma$-function. Then, the central limit theorem follows by singularity analysis and either classical tools or Hwang's quasi-power theorem; see page 644 in Flajolet and Sedgewick [10] for a detailed discussion. Yet another method uses approximation by a sum of independent random variables and was used in [13]; see also Devroye [6] and Mahmoud [16].
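The generating function just mentioned can be checked in small cases by brute force: for the number of cycles, $E(y^{X_n})$ equals the rising factorial $y(y+1)\cdots(y+n-1)/n! = \Gamma(n+y)/(\Gamma(y)\Gamma(n+1))$, and with $y = e^t$ this is exactly the formula above. A small illustrative script (names ours):

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def count_cycles(perm):
    """Number of cycles of a permutation given in one-line notation."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = perm[i]
    return cycles

def mgf_exact(n, y):
    """E(y^{X_n}) by brute force over all n! permutations of size n."""
    return sum(y ** count_cycles(p) for p in permutations(range(n))) / factorial(n)

def mgf_gamma(n, y):
    """Rising factorial y(y+1)...(y+n-1)/n! = Gamma(n+y)/(Gamma(y) Gamma(n+1))."""
    num = Fraction(1)
    for j in range(n):
        num *= y + j
    return num / factorial(n)
```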
Compared to other methods, the moment-transfer approach is sometimes considered a weapon of last resort for proving asymptotic normality due to its brute-force character. However, the method does offer a couple of advantages. First, it requires less sophisticated tools and is quite automatic once the transfer theorems for the underlying recurrence are derived. Secondly, it is well-suited to sequences of random variables satisfying a distributional recurrence, a situation often encountered in combinatorics and computer science. Finally, it also proves convergence of all moments, which is stronger than mere weak convergence.
We conclude the introduction by giving a short sketch of the paper. In the next section, we introduce our approach and apply it to X n above. Then, in the third section, we re-derive recent results on priority trees. This will put these results in a larger context. Moreover, our approach yields proofs which are simpler than the previous ones. In a final section, we discuss further examples which can be handled by our approach as well.
Notations. We use ε to denote a sufficiently small constant which might change from one occurrence to the next. Similarly, Pol(x) denotes an unspecified polynomial which again might change from one occurrence to the next. Moreover, if needed, we indicate its degree by a subscript.

Asymptotic Normality of the Stirling Cycle Distribution
In this section, we show how to modify the moment-transfer approach such that it can be applied to (2).
Before starting, we give some motivation as to why we are interested in (2). The easiest example of a sequence $X_n$ leading to (2) is the number of cycles in a random permutation of size $n$. Indeed, let $\sigma_1 \cdots \sigma_k$ denote the canonical cycle decomposition of a random permutation of size $n$. Then, it is easy to see that the probability that $\sigma_k$ has length $j$ equals $1/n$. Consequently, removing $\sigma_k$ leaves a random permutation of size $n - j$, and hence $X_n \stackrel{d}{=} X_{I_n} + 1$. So, $X_n$ satisfies our recurrence. Of course, the probability distribution of $X_n$ is well-known:

$$P(X_n = k) = \frac{c(n,k)}{n!},$$

where $c(n,k)$ denote the Stirling cycle numbers (or signless Stirling numbers of the first kind). The asymptotic behavior of $X_n$ is important in many applications such as the number of data moves in in-situ permutation, the number of key updates in selection sort, the root degree of random recursive trees, the depth of the first node in inorder traversal, etc.; for further examples see Bai et al. [2].

Now, we explain our modified moment-transfer approach. Our idea is simple: we only add one additional step to the scheme from the introduction. This new step is added before the treatment of the variance and higher central moments and consists of showing by induction that all central moments admit an expansion of a specific form, which in all our examples is an unspecified polynomial in $\log n$ plus an error term. This allows us to detect the additional cancellation which caused the problem in the introduction. Moreover, all remaining steps concerning the asymptotic order of the variance and higher central moments then become claims about the degree and the leading coefficient of the polynomial, and are derived by another induction; see Figure 2 for a schematic diagram illustrating our approach. We mention in passing that the two inductions could be combined (by proving two claims in the induction step). However, we keep them separate for two reasons: first, for the sake of clarity, and second, to make the comparison of our modified moment-transfer approach with the previous version easier.
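The distribution just stated is easy to tabulate exactly (illustrative code, names ours): the Stirling cycle numbers satisfy $c(n,k) = c(n-1,k-1) + (n-1)\,c(n-1,k)$, and the resulting mean is the harmonic number $H_n$, in line with $E(X_n) \sim \log n$:

```python
from fractions import Fraction
from math import factorial

def cycle_distribution(n):
    """P(X_n = k) = c(n, k)/n!, with the Stirling cycle numbers computed
    row by row from c(n, k) = c(n-1, k-1) + (n-1) * c(n-1, k)."""
    c = [1]  # row n = 0: c(0, 0) = 1
    for m in range(1, n + 1):
        c = [(c[k - 1] if k >= 1 else 0) + (m - 1) * (c[k] if k < len(c) else 0)
             for k in range(len(c) + 1)]
    return [Fraction(x, factorial(n)) for x in c]

n = 25
p = cycle_distribution(n)
mean = sum(k * pk for k, pk in enumerate(p))
harmonic = sum(Fraction(1, j) for j in range(1, n + 1))
```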
Now, we show how to apply our modified approach to (2). Before we can do so, we however need a refinement of the transfer theorems from the introduction.
Proposition 1. Let $a_n$ satisfy (3) and let $\alpha$ be a non-negative integer.
(i) If $b_n = O(1/n^{\varepsilon})$, then $a_n = c + O(1/n^{\varepsilon})$, where $c$ is a suitable constant.
(ii) If $b_n = \log^{\alpha} n$, then $a_n = \log^{\alpha+1} n/(\alpha+1) + \mathrm{Pol}_{\alpha}(\log n) + O(1/n^{\varepsilon})$.
(iii) If $b_n = \mathrm{Pol}_{\alpha}(\log n) + O(1/n^{\varepsilon})$, then $a_n = \mathrm{Pol}_{\alpha+1}(\log n) + O(1/n^{\varepsilon})$.
(iv) If, in addition, the polynomial in (iii) has leading coefficient $\beta$, then the polynomial for $a_n$ has leading coefficient $\beta/(\alpha+1)$.
Proof. It is easy to check that (3) has the general solution

$$a_n = b_n + \sum_{j=1}^{n-1} \frac{b_j}{j+1}. \tag{6}$$

Now, in order to prove (i), observe that by (6),

$$a_n = c - \sum_{j \ge n} \frac{b_j}{j+1} + b_n = c + O(1/n^{\varepsilon}), \qquad \text{where } c = \sum_{j \ge 1} \frac{b_j}{j+1},$$

the series being absolutely convergent due to the assumption.
Also, part (ii) immediately follows from (6) by a standard application of the Euler-Maclaurin summation formula (for the latter see Section 4.5 in Flajolet and Sedgewick [9]).
Finally, parts (iii) and (iv) are simple consequences of part (ii).
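The general solution (6) can also be double-checked against direct iteration of (3) for an arbitrary toll sequence (a small verification script; the names and the particular toll are ours):

```python
from fractions import Fraction

def iterate(b, N):
    """a_n = (1/n) * sum_{j<n} a_j + b_n with a_0 = 0, by direct iteration."""
    a, s = [Fraction(0)], Fraction(0)
    for n in range(1, N + 1):
        s += a[-1]
        a.append(s / n + b[n])
    return a

def closed_form(b, N):
    """a_n = b_n + sum_{j=1}^{n-1} b_j/(j+1), the general solution (6)."""
    out, s = [Fraction(0)], Fraction(0)  # s = sum_{j=1}^{n-1} b_j/(j+1)
    for n in range(1, N + 1):
        out.append(b[n] + s)
        s += b[n] / (n + 1)
    return out

N = 200
# An arbitrary (alternating, rational) toll sequence for the comparison.
b = [Fraction(0)] + [Fraction((-1) ** j * (j * j + 1), 2 * j + 3) for j in range(1, N + 1)]
```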
The above transfer theorem can be used to prove the following refinement of the asymptotic expansion of the mean from the introduction:

$$E(X_n) = \log n + c_0 + O(1/n^{\varepsilon}), \tag{7}$$

where $c_0$ is a suitable constant and $\varepsilon > 0$ is suitably small.
Next, we turn to central moments. As already mentioned, we first show that all central moments have an asymptotic expansion in powers of log n plus an error term.

Proposition 2. For all $r \ge 1$, we have

$$A^{[r]}_n = \mathrm{Pol}(\log n) + O(1/n^{\varepsilon}),$$

where $\varepsilon > 0$ is suitably small.
Proof. Note that the claim is trivial for $r = 1$. Assume now that it holds for all $r' < r$. In order to prove it for $r$, we plug the induction hypothesis and (7) into (4). Then, by applying the Euler-Maclaurin summation formula, we obtain $B^{[r]}_n = \mathrm{Pol}(\log n) + O(1/n^{\varepsilon})$. Hence, the claim follows from the transfer theorem.

Now, as already mentioned above, (5) is merely a claim concerning the degree and the leading term of the polynomial in the previous proposition.
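Before turning to this second induction, we note that (4) also determines all central moments exactly, which yields a convenient numerical check on such claims (illustrative code, names ours). Here we verify the output against the classical exact formulas $\mathrm{Var}(X_n) = H_n - H_n^{(2)}$ and $E(X_n - E(X_n))^3 = H_n - 3H_n^{(2)} + 2H_n^{(3)}$, which follow from the representation of $X_n$ as a sum of independent Bernoulli($1/j$) indicators (a fact not needed in the approach above):

```python
from fractions import Fraction
from math import comb

N, R = 60, 4

# Exact mean E(X_n) = H_n and generalized harmonic numbers H_n^{(2)}, H_n^{(3)}.
H, H2, H3 = [Fraction(0)], [Fraction(0)], [Fraction(0)]
for j in range(1, N + 1):
    H.append(H[-1] + Fraction(1, j))
    H2.append(H2[-1] + Fraction(1, j * j))
    H3.append(H3[-1] + Fraction(1, j ** 3))

# A[r][n] = r-th central moment of X_n, computed exactly from recurrence (4).
A = [[Fraction(1)] * (N + 1), [Fraction(0)] * (N + 1)]  # r = 0 and r = 1
for r in range(2, R + 1):
    row = [Fraction(0)]
    for n in range(1, N + 1):
        toll = sum(comb(r, k) * A[k][j] * (1 + H[j] - H[n]) ** (r - k)
                   for j in range(n) for k in range(r))
        row.append((sum(row) + toll) / n)
    A.append(row)
```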
Proposition 3. For all $m \ge 0$, claim (5) holds.

Proof. Note that the claim holds for $m = 0$. Assume now that the claim holds for all $m' < m$. We prove it for $m$.
First, consider the even case. Then, the toll sequence is given by

$$B^{[2m]}_n = \frac{1}{n}\sum_{j=0}^{n-1}\sum_{k=0}^{2m-1}\binom{2m}{k} A^{[k]}_j \big(1 + E(X_j) - E(X_n)\big)^{2m-k}.$$

We start by looking at the contribution of $k = 2m-1$, which, by Proposition 2, (7) and the Euler-Maclaurin summation formula, is

$$c_1 \log^{m-1} n \int_0^1 (1 + \log x)\,dx + o(\log^{m-1} n),$$

where $c_1$ is a suitable constant. Since the above integral vanishes, this part contributes $o(\log^{m-1} n)$.
The main contribution comes from the term with $k = 2m-2$: here, the induction hypothesis together with $\int_0^1 (1+\log x)^2\,dx = 1$ yields the contribution $\binom{2m}{2}\frac{(2m-2)!}{2^{m-1}(m-1)!}\log^{m-1} n + o(\log^{m-1} n)$. As for all other parts, a similar reasoning shows that they contribute $o(\log^{m-1} n)$. Hence,

$$B^{[2m]}_n \sim m(2m-1)\,\frac{(2m-2)!}{2^{m-1}(m-1)!}\,\log^{m-1} n = m\,\frac{(2m)!}{2^m m!}\,\log^{m-1} n.$$

Using the transfer theorem proves the claim in the even case.
As for the odd case, here the toll sequence becomes

$$B^{[2m+1]}_n = \frac{1}{n}\sum_{j=0}^{n-1}\sum_{k=0}^{2m}\binom{2m+1}{k} A^{[k]}_j \big(1 + E(X_j) - E(X_n)\big)^{2m+1-k}.$$

Using a similar reasoning as above, the term with $k = 2m$ contributes $o(\log^m n)$, since the same integral vanishes, and the remaining terms contribute $o(\log^m n)$ as well. Since $B^{[2m+1]}_n$ is a polynomial in $\log n$ plus an error term, this bound entails $B^{[2m+1]}_n = O(\log^{m-1} n)$, and the transfer theorem yields $A^{[2m+1]}_n = O(\log^m n) = o(\log^{m+1/2} n)$, as claimed. Overall, we have (re-)proved the following result, which can be traced back at least to Goncharov [11].
Theorem 1 (Goncharov). As $n \to \infty$, we have

$$\frac{X_n - \log n}{\sqrt{\log n}} \xrightarrow{d} N(0,1),$$

with convergence of all moments.

To summarize, the only difference between our approach and the previous version of the moment-transfer approach is two induction steps instead of only one. The first induction step establishes a certain shape of all central moments. Then, the second induction is used to derive more details concerning the leading term. Again, the main tool is the transfer theorem. Once such a result is established, the remaining proof is rather automatic.
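Theorem 1 can also be checked numerically (illustrative code, names ours). Using the classical fact that $X_n$ is distributed as a sum of independent Bernoulli($1/j$) indicators, $j = 1, \ldots, n$ (a fact not used in the proof above), the cumulants of $X_n$ are sums of Bernoulli cumulants, and the standardized third and fourth moments indeed drift towards the normal values 0 and 3:

```python
def standardized_moments(n):
    """Skewness and kurtosis of X_n via the sum-of-independent-Bernoulli(1/j)
    representation and additivity of cumulants; the fourth central moment
    is kappa_4 + 3 * kappa_2^2."""
    k2 = k3 = k4 = 0.0
    for j in range(1, n + 1):
        p = 1.0 / j
        q = 1.0 - p
        k2 += p * q                        # kappa_2 of Bernoulli(p)
        k3 += p * q * (1.0 - 2.0 * p)      # kappa_3
        k4 += p * q * (1.0 - 6.0 * p * q)  # kappa_4
    return k3 / k2 ** 1.5, (k4 + 3.0 * k2 ** 2) / k2 ** 2
```

The drift is slow, of order $1/\sqrt{\log n}$ for the skewness, which again reflects the slowly varying moments.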
We apply our new approach to a couple of other examples in the subsequent sections.

Analysis of Priority Trees
Priority trees are a data structure used for the implementation of priority queues. They are defined as binary, labelled trees with the property that every node on the left path, except the last one, has a (possibly empty) right subtree which is again a priority tree, with all labels smaller than the label of the root and larger than the label of the left child of the root. A random priority tree is built from a random permutation of the set {1, . . . , n}. Random priority trees have been analyzed in several recent papers; see Kuba and Panholzer [15], Panholzer [19] and Panholzer and Prodinger [20].
In this section, we demonstrate that our modified moment-transfer approach applies straightforwardly to the analysis of random priority trees. Since we are just interested in the applicability of our approach, we only give the probabilistic problem and direct the interested reader to the literature for background.

Length of the Left Path.
We only briefly discuss this example due to its similarity to the example from the previous section. Let $X_n$ be the length of the left path in a random priority tree built from $n$ records. Then, it was shown in [20] that $X_n$ satisfies a system of distributional recurrences involving two auxiliary sequences $Y_n$ and $Z_n$, where $X_0 = Z_0 = 0$, $Y_0 = 1$ and $I_n = \mathrm{Uniform}\{0, \ldots, n-1\}$ with $(I_n)_{n\ge 1}$, $(Y_n)_{n\ge 0}$, $(Z_n)_{n\ge 0}$ independent.
So, the central moments of $Y_n$ and $Z_n$ can be treated as in the previous section. Moreover, due to the first recurrence, the (centered and non-centered) moments of $X_n$ are connected to those of $Y_n$ and $Z_n$. Using this connection, it is straightforward to prove that $E(X_n) \sim 2\log n$ and that the $r$-th central moment of $X_n$ (denoted as in the previous section) satisfies the analogue of (5) with $\log n$ replaced by $2\log n$. Consequently, we have re-derived the central limit theorem for the length of the left path obtained in [20].

Number of Key Comparisons for Insertion.
This is a more sophisticated example, for which the proof of the central limit theorem was only sketched in [15]. We will see that our approach applies quite straightforwardly. So, let $X_n$ denote the number of key comparisons when inserting a random node into a random priority tree built from $n$ records. Then, as explained in [15], $X_n$ satisfies, for $n \ge 1$, a system of distributional recurrences involving auxiliary sequences $Y_n$, $Z_n$ and $U_n$, where $P(I_n = j) = 1/n$ for $0 \le j < n$, $X_0 = 0$, $Y_0 = Z_0 = 1$, and the probability generating function of $U_n$ is given explicitly in [15]. Now, we apply our modified moment-transfer approach. The first step is to find the underlying recurrence, which requires some tedious (but straightforward) computations. Therefore, let

$$X(s,t) = \sum_{n \ge 0} (n+1)\,E\big(e^{tX_n}\big)\,s^n,$$

and define $Y(s,t)$ and $Z(s,t)$ analogously. Then, from the above distributional recurrences, we obtain a system of differential equations with initial conditions $X(0,t) = 1$ and $Y(0,t) = Z(0,t) = e^t$. Eliminating $Y(s,t)$ and $Z(s,t)$ gives a differential equation for $X(s,t)$ alone. Now, let $\tilde P_n(t) = (n+1)E(e^{tX_n})$. Reading off coefficients from this differential equation yields a recurrence for $\tilde P_n(t)$ with coefficient functions $c_1(t) = -4e^t$ and $c_2(t) = 3 + 2e^t$ and initial conditions $\tilde P_0(t) = 1$, $\tilde P_1(t) = 2e^t$, and $\tilde P_2(t) = e^t + 2e^{2t}$.
From this, by differentiating and setting $t = 0$, we obtain a recurrence for the mean with constants $c_0 = c_1 = -2$ and $c_2 = 5$, initial conditions $E(X_0) = 0$, $E(X_1) = 2$, $E(X_2) = 5$, and a suitable toll sequence. The same recurrence is also obtained for all higher moments (with a different toll sequence). Hence, the underlying recurrence has this common form with certain initial conditions (note that, in slight contrast to the previous sections, this is the recurrence satisfied by the moments of $X_n$ multiplied by $n + 1$). So, we need a transfer theorem for this recurrence. Fortunately, this and more general recurrences were already studied in Chern et al. [5], and they yield the following transfer theorem, in which $c$ denotes a suitable constant.
Proof. See the method of Section 2 in [5].
Before we use this result to treat the mean and the central moments, we need a technical lemma.

Lemma 1. We have, for every $k \ge 0$, an asymptotic expansion in powers of $\log n$ with error term $O(1/n^{\varepsilon})$, where $\varepsilon > 0$ is suitably small.

Proof. We use induction on $k$. For the base case, the sum in question can be expressed in terms of $H_n = \sum_{j=1}^n 1/j$, the $n$-th harmonic number; hence, the claim follows from the well-known asymptotic expansion $H_n = \log n + \gamma + O(1/n)$, where $\gamma$ denotes Euler's constant. Assume now that the claim holds for all $k' < k$. In order to prove it for $k$, observe that the sum reduces to sums of lower order, which completes the induction.
Proof. All of the claims follow similarly. Hence, we just prove the first one. Therefore, we break the toll sequence into two parts, $\alpha_n$ and $\beta_n$. First, $\alpha_n$ is evaluated by the Euler-Maclaurin summation formula. Next, we treat $\beta_n$. Here, we apply Corollary 1 and obtain an expansion with suitable constants $c_1$ and $c_2$. By another application of the Euler-Maclaurin summation formula and a trivial estimate for the remainder, we obtain an expansion with a suitable constant $c_3$ and $\varepsilon > 0$ suitably small. Overall, by the transfer theorem,

$$E(X_n) = \frac{1}{3}\log^2 n + \mathrm{Pol}_1(\log n) + O(1/n^{\varepsilon}).$$
Next, we turn to central moments. Therefore, set $\bar P_n(t) = (n+1)E\big(e^{t(X_n - E(X_n))}\big)$. Then, from (8), we obtain a recurrence valid for $n \ge 3$, with initial conditions $\bar P_0(t) = 1$, $\bar P_1(t) = 2e^{-t}$, and $\bar P_2(t) = e^{-4t} + 2e^{-3t}$. Next, set $A^{[r]}_n = E(X_n - E(X_n))^r$. Taking derivatives $r$ times and setting $t = 0$ yields a recurrence of the same form with initial conditions $A^{[r]}_2 = (-1)^r(4^r + 2 \cdot 3^r)$ and a suitable toll sequence. As before, we first show that all central moments admit an expansion in powers of $\log n$ plus an error term.

Proposition 5. For all $r \ge 1$, we have

$$A^{[r]}_n = \mathrm{Pol}(\log n) + O(1/n^{\varepsilon}),$$

where $\varepsilon > 0$ is suitably small.
Proof. We use induction on $r$. Note that the claim for $r = 1$ is trivial. Next, we assume that the claim holds for all $r' < r$. In order to show it for $r$, we again break the toll sequence into two parts, $\alpha_n$ and $\beta_n$. Now, $\alpha_n$ and $\beta_n$ are treated with exactly the same ideas as for the mean above. For instance, when plugging the induction hypothesis and the expansion for the mean into $\alpha_n$, one obtains sums of terms of the form $\mathrm{Pol}(\log j) + O(1/j^{\varepsilon})$, which, due to the Euler-Maclaurin summation formula, yield $n\,\mathrm{Pol}(\log n) + O(n^{1-\varepsilon})$. Hence, $\alpha_n = n\,\mathrm{Pol}(\log n) + O(n^{1-\varepsilon})$. Similarly, by plugging the induction hypothesis and the expansion of the mean into $\beta_n$ and using the Euler-Maclaurin summation formula and Corollary 1, one obtains $\beta_n = n\,\mathrm{Pol}(\log n) + O(n^{1-\varepsilon})$. Overall, the toll sequence is of the form $n\,\mathrm{Pol}(\log n) + O(n^{1-\varepsilon})$, and applying the transfer theorem concludes the induction.
Next, we refine the above expansion for the variance. Hence, we choose $r = 2$ in (11). Again, we start with the toll sequence, which we break into two parts, $\alpha_n$ and $\beta_n$. For $\alpha_n$, we first consider the term with $i_1 = 0$, $i_2 = 2$ and $i_3 = 0$. Next, we treat $\beta_n$; here, we first consider $i_3 = 0$, for which a similar computation as for $\alpha_n$ applies. This yields the asymptotics of the variance and, more generally, leads to the claims about the degrees and leading coefficients of all central moments, which are proved as follows.

Proof. We use induction on $m$. Note that the claim holds for $m = 0$. Next, assume that the claim holds for all $m' < m$. We show that it holds for $m$ as well.
First, let us consider the even case. Then, as before, we break the toll sequence of (11) into two parts, $\alpha_n$ and $\beta_n$. We first treat $\alpha_n$, which we again break into two parts $x^{[\alpha]}_n$ and $y^{[\alpha]}_n$ according to whether $i_3$ is even or not. As for $x^{[\alpha]}_n$, we first consider the term with $i_1 = 0$, $i_2 = 2$ and $i_3 = m - 1$. For all other terms, similar ideas yield the bound $O(n \log^{3m-2} n)$. Next, we turn to $\beta_n$, which is handled in exactly the same manner. So, we again break it into two parts $x^{[\beta]}_n$ and $y^{[\beta]}_n$ according to whether $i_3$ is even or not. As for $x^{[\beta]}_n$, we first consider $i_3 = m - 1$; here, we use Corollary 1 and the induction hypothesis. Combining the above estimates and using the transfer theorem concludes the proof in the even case.
Next, we briefly sketch the odd case, which can be treated with the same ideas as the even case. Again, we break the toll sequence into two parts, $\alpha_n$ and $\beta_n$, which are defined as above (with the only difference that $2m$ is replaced by $2m + 1$). Then, as above, one derives the required bound for the toll sequence. Finally, by the Fréchet-Shohat theorem, the last proposition implies a central limit theorem for $X_n$. Another example which is very similar to the one above is the depth of a random node in a random priority tree of size $n$; see [19]. Here, the underlying recurrence is as above. Hence, one can again use the transfer theorem to derive the central limit theorem. Since the details are straightforward, we do not give them.

Further Examples
In this final section, we briefly sketch some further examples. It should by now be clear that our approach essentially rests on the transfer theorem. Once such a result is established, the remaining proof is rather automatic. Hence, for the subsequent examples, we only give the distributional recurrence, the underlying recurrence, the transfer theorem and the final result.

Number of Key Comparisons for Insertion and Depth in Binary Search Trees.
These examples are similar to, but easier than, the examples discussed in the previous section. For instance, let $X_n$ denote the number of key comparisons when inserting a random node into a random binary search tree built from $n$ records (this quantity is also called the cost of an "unsuccessful search"; see Chapter 2 in Mahmoud [16] for background). Then, for $n \ge 1$, conditioned on $I_n = j$,

$$X_n \stackrel{d}{=} \begin{cases} X_j + 1 & \text{with probability } (j+1)/(n+1), \\ X_{n-1-j} + 1 & \text{with probability } (n-j)/(n+1), \end{cases}$$

with $P(I_n = j) = 1/n$, $0 \le j < n$, and $X_0 = 0$. From this, a straightforward computation reveals that the underlying recurrence (with a scaling factor $n + 1$ as in the previous section) is given by

$$a_n = \frac{2}{n}\sum_{j=0}^{n-1} a_j + b_n, \qquad (n \ge 1),$$

with $a_0 = 0$. A transfer theorem for this recurrence, of a similar type as in the previous section, is easily derived and can be found in [13].
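As a sanity check (illustrative code, names ours), iterating the conditional expectations of the recurrence $X_n \stackrel{d}{=} X_j + 1$ with probability $(j+1)/(n+1)$ and $X_n \stackrel{d}{=} X_{n-1-j} + 1$ with probability $(n-j)/(n+1)$, given $I_n = j$, reproduces the classical closed form $E(X_n) = 2(H_{n+1} - 1) \sim 2\log n$ for the unsuccessful-search cost:

```python
from fractions import Fraction

N = 50
E = [Fraction(0)]  # E[n] = E(X_n); X_0 = 0
for n in range(1, N + 1):
    tot = Fraction(0)
    for j in range(n):  # I_n = j with probability 1/n
        tot += (Fraction(j + 1, n + 1) * (E[j] + 1)
                + Fraction(n - j, n + 1) * (E[n - 1 - j] + 1))
    E.append(tot / n)

# Harmonic numbers up to H_{N+1} for the classical closed form 2(H_{n+1} - 1).
H = [Fraction(0)]
for j in range(1, N + 2):
    H.append(H[-1] + Fraction(1, j))
```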
Hence, our approach applies and yields a central limit theorem for the depth in random median-of-$(2t+1)$ $m$-ary search trees. There are two interesting special cases of this result. First, for $m = 2$, the result gives the central limit theorem for the depth in random median-of-$(2t+1)$ binary search trees, a result first derived in [6]. Second, for $t = 0$, the result gives the central limit theorem for the depth in $m$-ary search trees, a result first proved by Mahmoud and Pittel in [17].
Next, we consider $d$-dimensional grid trees. Here, the underlying recurrence holds for $n \ge m - 1$, with $a_0 = a_1 = \cdots = a_{m-2} = 0$. This recurrence was studied in [3], and the following transfer theorem can be proved with tools from the latter paper.
The underlying recurrence (15) has initial condition $a_1 = 0$. Unfortunately, due to the more complicated nature of $\pi_{n,j}$, this recurrence is more involved. In particular, we have not been able to prove an analogue of part (i) of the transfer results above. However, we strongly conjecture that the following claim holds true.

Conjecture 1. Consider (15). Let $b_n = O(1/n^{\varepsilon})$ with $\varepsilon > 0$ suitably small. Then,

$$a_n = c + O(1/n^{\varepsilon}),$$

where $c$ is a suitable constant.
As before, apart from this property, we need a couple of other transfer properties. However, once this conjecture is established, the other properties can be deduced from it. Proposition 10. Assume that the above conjecture holds.
Proof. All these properties follow from the conjecture by using similar ideas as in [14].
Finally, by applying our approach, we obtain the following result.