Feller coupling of cycles of permutations and Poisson spacings in inhomogeneous Bernoulli trials

Feller (1945) provided a coupling between the counts of cycles of various sizes in a uniform random permutation of $[n]$ and the spacings between successes in a sequence of $n$ independent Bernoulli trials with success probability $1/n$ at the $n$th trial. Arratia, Barbour and Tavaré (1992) extended Feller's coupling to associate cycles of random permutations governed by the Ewens($\theta$) distribution with spacings derived from independent Bernoulli trials with success probability $\theta/(n-1+\theta)$ at the $n$th trial, and to conclude that in an infinite sequence of such trials, the numbers of spacings of length $\ell$ are independent Poisson variables with means $\theta/\ell$. Ignatov (1978) first discovered this remarkable result in the uniform case $\theta = 1$, by constructing Bernoulli($1/n$) trials as the indicators of record values in a sequence of i.i.d. uniform $[0, 1]$ variables. In the present article, the Poisson property of inhomogeneous Bernoulli spacings is explained by a variation of Ignatov's approach for a general $\theta > 0$. Moreover, our approach naturally provides random permutations of infinite sets whose cycle counts are exactly given by independent Poisson random variables.


Introduction
In [6], Feller introduces a coupling between the cycle structure of a uniformly distributed random permutation of order $n$ and the spacings between successes in a sequence of $n$ independent Bernoulli variables with parameters $1/n, 1/(n-1), \ldots, 1/2, 1$. This coupling has been generalized to Ewens distributions for any parameter $\theta$: a recent discussion of this topic, with references to further work, is provided by Arratia, Barbour, and Tavaré [3], largely following their earlier work [1]. Their coupling, for a general positive integer $n$ and $\theta > 0$, may be constructed as follows. Consider a sequence $(B_i(\theta))_{1 \le i \le n}$ of independent Bernoulli variables, $B_i(\theta)$ having parameter $\theta/(\theta + i - 1)$. Conditionally on $(B_i(\theta))_{1 \le i \le n}$, construct the random permutation σ of the set $[n] := \{1, 2, \ldots, n\}$ as follows. First, define $X_1 := 1$, and then, recursively for $2 \le i \le n$:
• If $B_{n+2-i}(\theta) = 1$, $X_i$ is the smallest element of $[n]$ different from $X_1, \ldots, X_{i-1}$.
Note that written in this fashion, each cycle starts with its minimal element, and the cycles are written in increasing order of their minimal elements.
The cycle structure of $\pi_{n,\theta}$ can be deduced from the spacings between the Bernoulli variables $B_i(\theta)$ which are equal to 1. More precisely, for $\ell \ge 1$, let us say that an $\ell$-spacing occurs in a sequence $a_1, a_2, \ldots$ of 0s and 1s, starting at position $i - \ell$ and ending at position $i$, if $a_{i-\ell} \cdots a_i = 1\,0^{\ell-1}\,1$, meaning that the string of length $\ell + 1$ is a 1 followed by $\ell - 1$ zeros followed by a 1. If $C_{n,\ell}(\theta)$ is the number of $\ell$-spacings in $B_1(\theta), \ldots, B_n(\theta), 1, 0, 0, 0, \ldots$, then there is the equality
$$(C_{n,\ell}(\theta),\ 1 \le \ell \le n) = (K_\ell(\pi_{n,\theta}),\ 1 \le \ell \le n), \tag{1.1}$$
where $K_\ell(\pi_{n,\theta})$ is the number of cycles of length $\ell$ in the permutation $\pi_{n,\theta}$. By regarding the sequence $(B_i(\theta))_{1 \le i \le n}$ as the first $n$ terms of an infinite sequence $(B_i(\theta))_{i \ge 1}$ of independent Bernoulli variables, we get a coupling, on a single probability space, of the families of cycle counts $(K_\ell(\pi_{n,\theta}))_{1 \le \ell \le n}$ for all values of $n$. We quickly deduce the following result by Arratia, Barbour and Tavaré, for which we provide a sketch of proof here for the reader's convenience in comparing with later arguments:

Theorem 1.1 (Arratia, Barbour and Tavaré [1], [3]). If $C_{\infty,\ell}(\theta)$ is the number of $\ell$-spacings in the infinite sequence $(B_i(\theta))_{i \ge 1}$ of Bernoulli variables, $B_i(\theta)$ having parameter $\theta/(\theta + i - 1)$, then $(C_{\infty,\ell}(\theta))_{\ell \ge 1}$ is a sequence of independent Poisson($\theta/\ell$) variables.
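ECP 25 (2020), paper 73.
Proof. If $L_n(\theta)$ is the position of the last 1 in $(B_i(\theta))_{1 \le i \le n}$ and $J_n(\theta) := n + 1 - L_n(\theta)$ is the last spacing in the finite $n$ scheme, then
$$C_{n,\ell}(\theta) \le C_{\infty,\ell}(\theta) + \mathbf{1}(J_n(\theta) = \ell), \tag{1.2}$$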
with strict inequality iff there is an $\ell$-spacing in the infinite sequence $(B_j(\theta))_{j \ge 1}$ starting at $j = i - \ell$ and ending at $j = i > n$. Now, this event and the event $\{J_n(\theta) = \ell\}$ have probability tending to zero when $n \to \infty$, so for fixed $\ell$, $C_{n,\ell}(\theta) = C_{\infty,\ell}(\theta)$ with probability tending to one. On the other hand, by (1.1), for any fixed $K \ge 1$, $(C_{n,\ell}(\theta),\ 1 \le \ell \le K)$ tends in law to independent Poisson($\theta/\ell$) variables. These two facts together imply the theorem.
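The $\ell$-spacing counts of the finite scheme defined before Theorem 1.1 are easy to compute directly. The following is a minimal Python sketch, not part of the original article: the function names `bernoulli_trials` and `spacing_counts` are hypothetical, and the sequence $B_1(\theta), \ldots, B_n(\theta)$ is completed by a terminal 1 as in the definition of $C_{n,\ell}(\theta)$.

```python
import random
from collections import Counter

def bernoulli_trials(n, theta, rng):
    """Sample B_1, ..., B_n with P(B_i = 1) = theta / (theta + i - 1).

    Note that B_1 = 1 always, since its parameter is theta / theta = 1.
    """
    return [1 if rng.random() < theta / (theta + i - 1) else 0
            for i in range(1, n + 1)]

def spacing_counts(bits):
    """Count the l-spacings in the sequence bits, 1, 0, 0, 0, ...

    An l-spacing is a 1, followed by l - 1 zeros, followed by a 1.
    Returns a Counter mapping each length l to its number of occurrences.
    """
    padded = list(bits) + [1]          # terminal 1 closes the last spacing
    ones = [i for i, b in enumerate(padded) if b == 1]
    return Counter(b - a for a, b in zip(ones, ones[1:]))

# Usage: the counts, weighted by length, always add up to n.
bits = bernoulli_trials(12, theta=1.5, rng=random.Random(0))
counts = spacing_counts(bits)
print(bits, dict(counts))
```

Since the first trial is always a success, the spacings partition the positions $1, \ldots, n$, which is why $\sum_\ell \ell\, C_{n,\ell}(\theta) = n$, in agreement with (the cycle counts of) a permutation of $[n]$.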
Combining (1.1) and (1.2), we get a coupling of the counts of small cycles $(K_\ell(\pi_{n,\theta}))_{1 \le \ell \le k}$ of a Ewens($\theta$) permutation with independent Poisson($\theta/\ell$) counts $(C_{\infty,\ell}(\theta))_{1 \le \ell \le k}$, with a total variation error depending on $k$ and $\theta$ which is easily bounded explicitly. This implies in particular that for every fixed $k$,
$$(K_\ell(\pi_{n,\theta}))_{1 \le \ell \le k} \xrightarrow[n \to \infty]{d} (C_{\infty,\ell}(\theta))_{1 \le \ell \le k}, \tag{1.3}$$
as well as estimates of the total variation error in this approximation which are useful for $k = o(n)$: see [1, Theorems 1 and 3]. See also Sethuraman and Sethuraman [19] for a review of studies of the distribution of the numbers of $\ell$-spacings in infinite sequences of independent Bernoulli trials with sequences of probabilities $p_i$ other than the sequence $p_i = \theta/(\theta + i - 1)$ involved in this coupling with a sequence of Ewens($\theta$) permutations.
The coupling described above provides a way to define a sequence of Ewens($\theta$) random permutations $(\pi_{n,\theta})_{n \ge 1}$ whose cycle structures for different values of $n$ are strongly related: from $\pi_{n,\theta}$ to $\pi_{n+1,\theta}$, either a single fixed point is added, or a single cycle of $\pi_{n,\theta}$ has its length increased by one. However, the coupling above does not uniquely define a joint distribution for $\pi_{n,\theta}$ and $\pi_{n+1,\theta}$, because it does not say how the contents of the cycles of $\pi_{n+1,\theta}$ and $\pi_{n,\theta}$ are related.
In the particular case $\theta = 1$, when each $\pi_{n,1}$ is a uniform random element of the set $S_n$ of permutations of $[n]$, Ignatov [13] provides a nice construction which defines the joint law of $(\pi_{n,1})_{n \ge 1}$ in a unique way. Let $(U_i)_{i \ge 1}$ be a sequence of pairwise distinct elements of $[0, 1]$, with no smallest element. From this sequence, define the lower record indices $I_1 < I_2 < I_3 < \cdots$ as the indices $I$ such that $U_I$ is smaller than $U_i$ for all $i < I$, the lower record indicators $(B_i)_{i \ge 1}$, given by $B_i = 1$ if $i$ is a lower record index and by $B_i = 0$ otherwise, and the inter-record stretches $(V_k)_{k \ge 1}$ given by
$$V_k := (U_{I_k}, U_{I_k + 1}, \ldots, U_{I_{k+1} - 1}).$$
We notice the following facts:
• the inter-record stretches are elements of the space $\bigcup_{\ell=1}^{\infty} [0, 1]^\ell$ of finite sequences in $[0, 1]$ with undetermined length;
• the first term of the stretch $V_k$ is the $k$-th lower record value $R_k := U_{I_k}$;
• this first term $R_k$ of $V_k$ is the minimal term of the stretch $V_k$;
• the length of the stretch $V_k$ is $I_{k+1} - I_k$, the $k$-th inter-record spacing.
We can then define, for all $n \ge 1$, a permutation $\pi^U_n$ of $\{U_1, \ldots, U_n\}$ whose cycle structure is given by the inter-record stretches: more precisely, $\pi^U_n(U_{i-1}) = U_i$ for all $i \in \{2, \ldots, n\}$ which are not lower record indices, $\pi^U_n(U_{I_{k+1}-1}) = U_{I_k}$ if $k \ge 1$ is such that $I_{k+1} - 1 \le n$, and $\pi^U_n(U_n) = U_{I_j}$, where $I_j$ is the last lower record index such that $I_j \le n$. The permutation $\pi^U_n$ acts on the set $\{U_1, \ldots, U_n\}$: it induces a permutation $\pi_n$ of $[n]$ if we rename the $m$-th smallest element of this set by $m$ (for example, the permutation 0.2 → 0.9, 0.4 → 0.5, 0.5 → 0.2, 0.9 → 0.4 induces the permutation 1 → 4, 2 → 3, 3 → 1, 4 → 2). It is not difficult to check that the permutation $\pi_n$ depends only on the relative order of $U_1, \ldots, U_n$, in a way which induces a bijective map from $S_n$ to itself. This bijection was proposed by Rényi [17] in the early 1960s, and called the "transformation fondamentale" in a paper by Foata and Schützenberger [8], in a more general setting of combinatorics on words. Diaconis and Pitman [5] exploited this bijection to obtain the convergence in distribution (1.3) in the case $\theta = 1$, with a total variation bound. This bound was sharpened and extended to the case of a general parameter $\theta > 0$ in [1], as indicated above. But this argument for general $\theta$ loses track of the full Poisson structure of the record process for $\theta = 1$.
Let us recall how this Poisson structure for $\theta = 1$ was first exposed by Ignatov [13]. If $(U_i)_{i \ge 1}$ is a sequence of i.i.d. uniform variables on $[0, 1]$, then for all $n \ge 1$, all possible orders of $U_1, \ldots, U_n$ occur with the same probability, so $\pi_n$ is uniformly distributed on $S_n$. On the other hand, the lower record indicators $(B_i)_{i \ge 1}$ are independent Bernoulli variables, $B_i$ having parameter $1/i$. The link between the construction of $\pi_n$ and the Feller coupling is the following: conditionally on $(B_i)_{i \ge 1}$, the distribution of $\pi_n$ is uniform on the set of permutations whose cycle lengths, ordered by increasing lowest element, are equal to the successive spacings between the 1's in the sequence $(1, B_n, B_{n-1}, \ldots, B_1)$. One easily deduces the following result: given $(B_i)_{1 \le i \le n}$, the conditional distribution of $\pi_n$ is the same as that given by the Feller coupling procedure using $B_i(1) = B_i$ for all $i$.
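The construction of $\pi_n$ from the inter-record stretches of $U_1, \ldots, U_n$ described above can be sketched in a few lines of Python. This is an illustrative sketch only, not taken from the original article: the function name `record_permutation` is hypothetical, and the input values are assumed to be pairwise distinct.

```python
def record_permutation(u):
    """Build pi_n from the lower records of u = (u_1, ..., u_n).

    Each inter-record stretch (truncated at n for the last one) becomes a
    cycle: every term is mapped to the next term of its stretch, and the
    last term of a stretch is mapped back to the record that starts it.
    Values are then renamed by their ranks 1, ..., n.

    Returns (perm, lengths), where perm is a dict rank -> rank and lengths
    lists the cycle lengths in order of appearance of the stretches.
    Assumes the entries of u are pairwise distinct.
    """
    n = len(u)
    # 0-based positions of the lower records
    records, current_min = [], float("inf")
    for i, x in enumerate(u):
        if x < current_min:
            records.append(i)
            current_min = x
    # stretches run from one record up to (not including) the next one
    bounds = records + [n]
    stretches = [u[a:b] for a, b in zip(bounds, bounds[1:])]
    # rename the m-th smallest value by m
    rank = {x: m for m, x in enumerate(sorted(u), start=1)}
    perm = {}
    for stretch in stretches:
        for j, x in enumerate(stretch):
            perm[rank[x]] = rank[stretch[(j + 1) % len(stretch)]]
    return perm, [len(s) for s in stretches]

# Usage: records occur at positions 1 and 3, giving stretches
# (0.5, 0.9) and (0.2, 0.4), i.e. two cycles of length 2.
perm, lengths = record_permutation([0.5, 0.9, 0.2, 0.4])
print(perm, lengths)
```

In this representation the cycles appear in decreasing order of their smallest element, because successive lower records decrease.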
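The Poisson structure obtained by taking all the inter-record stretches together is quite remarkable:

Theorem 1.2 (Ignatov [13]). If $(U_i)_{i \ge 1}$ is a sequence of i.i.d. uniform variables on $[0, 1]$, then the inter-record stretches $(V_k)_{k \ge 1}$ form a Poisson point process on $\bigcup_{\ell=1}^{\infty} [0, 1]^\ell$ with mean measure
$$\mu(\bullet) := \sum_{\ell=1}^{\infty} \frac{1}{\ell}\, P_\ell(\bullet), \tag{1.5}$$
where $P_\ell(\bullet)$ is the conditional distribution of $(U_1, \ldots, U_\ell)$ given that $U_1 < U_i$ for every $1 < i \le \ell$.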
To illustrate the notation: $P_2(\bullet) = 2\,\mathbb{P}\big((U_1, U_2) \in \bullet,\ U_1 < U_2\big)$, that is the conditional distribution of $(U_1, U_2)$ given the event $(U_1 < U_2)$ of probability 1/2. For it is easily seen that given all the points $\{(R_k, I_{k+1} - I_k),\ k \ge 1\}$, for each particular $k$, the conditional distribution of the stretch $V_k$ depends only on $R_k$ and $I_{k+1} - I_k$, and given $R_k = r$ and $I_{k+1} - I_k = \ell$, the stretch $V_k$ with initial term $r$ and length $\ell$ has the distribution of $(U_1, \ldots, U_\ell)$ given $U_1 = r$ and $r < U_i$ for all $1 < i \le \ell$.
If we only consider the lengths of the inter-record stretches, we immediately deduce from Theorem 1.2 that the inter-record spacings $(I_{k+1} - I_k)_{k \ge 1}$ form a Poisson point process on the positive integers, with intensity $1/\ell$ at $\ell$, meaning that the random variables
$$\#\{k \ge 1 : I_{k+1} - I_k = \ell\}, \qquad \ell \ge 1,$$
are independent Poisson variables with means $1/\ell$. This result corresponds to the case $\theta = 1$ of Theorem 1.1, since the lower record indicators $(B_i)_{i \ge 1}$ are independent Bernoulli variables, $B_i$ having parameter $1/i$.
Regarded as a fact about inhomogeneous Bernoulli trials, this result is not at all obvious without a broader context involving additional randomization, such as Ignatov's context of record sequences, or the context of the Feller coupling for random permutations.
The link between the Feller coupling and the lower records of a sequence of random variables can be extended to the setting of Ewens distributed permutations with general parameter $\theta > 0$, by changing the distribution of the sequence $(U_i)_{i \ge 1}$. We will prove the following result:

Theorem 1.3. For $\theta > 0$, let $P_\theta$ be the probability measure on the set of infinite sequences in $[0, 1]$, endowed with its Borel σ-algebra, such that for $(U_i)_{i \ge 1}$ following the law $P_\theta$:
• The first term $U_1$ is Beta distributed with parameters $\theta$ and 1.
• Conditionally on $(U_1, \ldots, U_n)$, for $\min(U_1, \ldots, U_n) = r$, the distribution of $U_{n+1}$ is the mixture with weights $r$ and $1 - r$ of the distribution of $r$ times a Beta variable with parameters $\theta$ and 1, and the uniform distribution on $[r, 1]$, i.e.
$$P_\theta(U_{n+1} \in du \mid U_1, \ldots, U_n) = \Big(\theta\, (u/r)^{\theta - 1}\, \mathbf{1}(0 < u < r) + \mathbf{1}(r \le u \le 1)\Big)\, du.$$
Then, the following statements hold:
• The finite dimensional distributions of $(U_1, \ldots, U_n)$ under $P_\theta$ are absolutely continuous with respect to the Lebesgue measure on $[0, 1]^n$, with density
$$\frac{dP_\theta}{dP_1}(u_1, \ldots, u_n) = \theta^{K_n} \min(u_1, \ldots, u_n)^{\theta - 1},$$
where $K_n$ is the number of lower records in the sequence $(u_1, \ldots, u_n)$. In particular, under $P_1$, the variables $(U_i)_{i \ge 1}$ are i.i.d., uniform on $[0, 1]$.
• If $(U_i)_{i \ge 1}$ follows the law $P_\theta$, then this sequence has a.s. no smallest element, the $U_i$'s are pairwise distinct, and the inter-record stretches $(V_k)_{k \ge 1}$ form a Poisson point process on $\bigcup_{\ell=1}^{\infty} [0, 1]^\ell$ with mean measure $\theta\mu(\bullet)$ for $\mu(\bullet)$ as in (1.5).

The fact that $(U_i)_{i \ge 1}$ are i.i.d., uniform under $P_1$ is a restatement of Ignatov's Theorem 1.2. The description of the law of $(U_1, \ldots, U_n)$ under $P_\theta$ for general $\theta$ has already been indicated by Kerov and Tsilevich [15, Lemma 2], with upper rather than lower records, which exchanges $U_i$ with $1 - U_i$ in the formulas. Kerov and Tsilevich have also associated random permutations to sequences following the distribution $P_\theta$, and these permutations are distributed with respect to the Ewens measure of parameter $\theta$. However, the construction of [15] does not coincide with the construction given in the present paper. The fact that $P_\theta$ may also be described as in Theorem [...]

[...] $\{U_1, \ldots, U_n\}$ whose cycle structure is given by the inter-record stretches, and let $\pi_n$ be the corresponding permutation of $[n]$, with the same cycle structure. Then, for $(U_i)_{i \ge 1}$ governed by the law $P_\theta$,
• The $B_i$ are independent Bernoulli($\theta/(i - 1 + \theta)$);
• Conditionally on all the $B_i$, the permutation $\pi_n$ is uniformly distributed among the permutations whose cycle lengths, ordered by increasing lowest elements, are given by the successive spacings between 1's in the sequence $(1, B_n, B_{n-1}, \ldots, B_1)$;
• Given $(B_i)_{1 \le i \le n}$, the conditional distribution of $\pi_n$ is the same as that given by the Feller coupling procedure using $B_i(\theta) = B_i$ for all $i$;
• The unconditional distribution of $\pi_n$ is Ewens with parameter $\theta$.
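The sequential description of $P_\theta$ in Theorem 1.3 translates directly into a sampler. The sketch below is an illustration under stated assumptions rather than code from the article: the names `sample_p_theta`, `lower_record_indicators` and `mean_record_count` are hypothetical, and a Beta($\theta$, 1) variable is generated as $V^{1/\theta}$ with $V$ uniform, using the fact that its distribution function is $x \mapsto x^\theta$ on $[0, 1]$.

```python
import random

def sample_p_theta(n, theta, rng):
    """Draw (U_1, ..., U_n) from the sequential description of P_theta."""
    # Beta(theta, 1) has CDF x**theta, so V**(1/theta) with V uniform has this law.
    u = [rng.random() ** (1.0 / theta)]
    r = u[0]  # running minimum of the sequence
    for _ in range(n - 1):
        if rng.random() < r:
            # weight r: r times a Beta(theta, 1) variable (a new lower record)
            x = r * rng.random() ** (1.0 / theta)
        else:
            # weight 1 - r: uniform on [r, 1]
            x = r + (1.0 - r) * rng.random()
        u.append(x)
        r = min(r, x)
    return u

def lower_record_indicators(u):
    """B_i = 1 if u_i is strictly smaller than all earlier terms, else 0."""
    indicators, current_min = [], float("inf")
    for x in u:
        indicators.append(1 if x < current_min else 0)
        current_min = min(current_min, x)
    return indicators

def mean_record_count(n, theta, trials, seed=0):
    """Monte Carlo estimate of E[K_n], the mean number of lower records."""
    rng = random.Random(seed)
    total = sum(sum(lower_record_indicators(sample_p_theta(n, theta, rng)))
                for _ in range(trials))
    return total / trials

# Usage: compare with sum_i theta / (theta + i - 1), the value implied by
# the first item of Corollary 1.4.
n, theta = 20, 2.0
print(mean_record_count(n, theta, trials=4000),
      sum(theta / (theta + i - 1) for i in range(1, n + 1)))
```

For $\theta = 1$ the two branches of the mixture recombine into a uniform variable on $[0, 1]$, so the sampler reduces to an i.i.d. uniform sequence, as stated in the theorem.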
We also remark that all the values of $\theta > 0$ can be coupled on a single probability space. Indeed, under $P_\theta$, the family of inter-record stretches forms a Poisson point process of intensity $\theta\mu(\bullet)$, so it can be constructed simultaneously for all $\theta$ by taking the points of a Poisson process of intensity equal to the product of Lebesgue measure on $\mathbb{R}_+$ by the measure $\mu(\bullet)$, and extracting the points for which the $\mathbb{R}_+$ coordinate is smaller than $\theta$. Such a coupling provides a dynamic version of the Feller coupling, with the parameter $\theta$ of the Ewens measure as its time parameter. The path structure of this $S_n$-valued process $(\pi_{n,\theta},\ \theta \ge 0)$ can be understood as follows. It may be constructed with right-continuous step function paths, in which each jump involves insertion of a new cycle of some length $\ell$ from 1 to $n$, corresponding to a Poisson point which is a sequence in some component $[0, 1]^k$ of the sequence space with $k \ge \ell$, whose initial term is greater than the initial term of at least one sequence contributing to the current permutation of $[n]$. This insertion may delete some cycles, and/or shorten the final cycle, depending on the rank of the initial term of the new sequence relative to the initial terms associated with existing cycles. It does not seem easy to give a full probabilistic description of the dynamics of this $S_n$-valued process. In particular, it may not be Markovian, due to the latent initial terms of the sequential fragments which determine the order of the cycles. As the partition of $n$ induced by $\pi_{n,\theta}$ is not necessarily refining as $\theta$ increases, this process is not the same as the evolution described by Gnedin and Pitman [10], in which partitions following the Ewens($\theta$) distribution are constructed for all values of $\theta > 0$ to be refining as $\theta$ increases.

Theorem 1.3 and Corollary 1.4 are proven in Section 2 of the present article. In Section 3, we use the measure $P_\theta$ in order to construct some infinite random permutations, in a way which generalizes the Feller coupling. In Section 4, we provide a link between our construction and a result by Shepp and Lloyd on the cycle counts of permutations of random order.

Proof of Theorem 1.3 and Corollary 1.4
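By induction on $n$, using the definition of $P_\theta$, we see that the density at $(u_1, \ldots, u_n)$ of the law of $(U_1, \ldots, U_n)$ under $P_\theta$ is given by
$$\frac{dP_\theta}{dP_1}(u_1, \ldots, u_n) = \theta^{K_n} \min(u_1, \ldots, u_n)^{\theta - 1}.$$
Let $(\mathcal{F}_i)_{i \ge 0}$ be the filtration generated by the variables $(U_i)_{i \ge 1}$, and let $T_s$ be the first index $i$ such that $U_i \le s$: it is clear that $T_s$ is a stopping time with respect to $(\mathcal{F}_i)_{i \ge 0}$. Writing $E_1$ for the expectation under $P_1$, the equality above shows that for any event $A_n$ which is $\mathcal{F}_n$-measurable,
$$P_\theta(A_n \cap \{T_s = n + 1\}) = s^{\theta - 1}\, E_1\big[\theta^{K_n}\, \mathbf{1}_{A_n \cap \{T_s = n + 1\}}\big],$$
where $K_n$ is the number of lower records in the sequence $(U_1, \ldots, U_n)$. Now, let $E$ be an event which is measurable with respect to the family of all inter-record stretches starting above the level $s$, and let $L$ be the total length of these stretches. We can check that for $n \ge 0$, the intersection of $E$ and the event $\{L = n\}$ can be written as the intersection of $A_n$ and $\{T_s = n + 1\}$ for some $\mathcal{F}_n$-measurable event $A_n$, which gives
$$P_\theta(E \cap \{L = n\}) = s^{\theta - 1}\, E_1\big[\theta^{N_s}\, \mathbf{1}_{E \cap \{L = n\}}\big],$$
where $N_s$ is the number of inter-record stretches starting above the level $s$. Hence
$$P_\theta(E) = s^{\theta - 1}\, E_1\big[\theta^{N_s}\, \mathbf{1}_E\big].$$
Under $P_1$, by Theorem 1.2, the stretches starting above the level $s$ form a Poisson point process with finite mean measure, namely the restriction of $\mu(\bullet)$ to sequences with initial term greater than $s$, and weighting such a process by $\theta^{N_s}$ multiplies its mean measure by $\theta$. Since $s \in (0, 1)$ can be arbitrarily chosen, we get the second statement of Theorem 1.3.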
For the corollaries, we use the following key property: the density on $\mathcal{F}_n$ of $P_\theta$ with respect to $P_1$ can be written as the product of a function of the relative order of $(U_1, \ldots, U_n)$, i.e. $\theta^{K_n}$, and a function of the order statistics of $(U_1, \ldots, U_n)$, i.e. $(\min(U_1, \ldots, U_n))^{\theta - 1}$. Since the relative order and the order statistics of $(U_1, \ldots, U_n)$ are independent under $P_1$, they remain independent under $P_\theta$ for all $\theta > 0$. Moreover, under $P_1$, the record indicators $B_i$ are independent Bernoulli($1/i$) variables, and the change of measure on these variables when we go from $P_1$ to $P_\theta$ corresponds to a density factor proportional to
$$\theta^{K_n} = \theta^{\sum_{i=1}^{n} B_i}.$$
This easily implies that under $P_\theta$, the record indicators are independent Bernoulli($p_i(\theta)$), with
$$p_i(\theta) = \frac{\theta \cdot \frac{1}{i}}{\theta \cdot \frac{1}{i} + \big(1 - \frac{1}{i}\big)} = \frac{\theta}{\theta + i - 1}.$$
This gives the first item of Corollary 1.4. Moreover, conditionally on the lower record indicators and the order statistics of $(U_1, \ldots, U_n)$, all the possible relative orders of $(U_1, \ldots, U_n)$ have the same probability, because of the form of the density of $P_\theta$ with respect to $P_1$. This implies the second item of Corollary 1.4, since the permutation $\pi_n$ is uniquely determined by the relative order of $(U_1, \ldots, U_n)$. The third item of Corollary 1.4 is a direct consequence of the first two items, and the last item is due to the classical properties of the Feller coupling.

Infinite permutations
From any sequence $(U_i)_{i \ge 1}$ of elements in $[0, 1]$, with distinct values and no smallest element, we have seen how to construct a permutation $\pi^U_n$ of $\{U_1, \ldots, U_n\}$ and a permutation $\pi_n$ of $[n]$ from the inter-record stretches. It is also possible to define a permutation $\pi^U_\infty$ of the infinite set $\{U_i,\ i \ge 1\}$, in such a way that the cycles are given by the set of all inter-record stretches, i.e. $\pi^U_\infty(U_{i-1}) = U_i$ for all $i \ge 2$ which are not lower record indices, and $\pi^U_\infty(U_{I_{k+1}-1}) = U_{I_k}$ for all $k \ge 1$. One easily checks that $\pi^U_\infty$ coincides with $\pi^U_n$ on the set $\{U_1, \ldots, U_{n-1}\}$ for all $n \ge 1$.
The construction of $\pi^U_\infty$ can be seen as a kind of Feller coupling of infinite order, since the construction of $\pi^U_n$ and $\pi_n$ can be related to the Feller coupling of order $n$, as we have seen previously. However, we observe that contrary to the case of the permutation $\pi_n$, which acts on the fixed set $[n]$, the infinite set on which $\pi^U_\infty$ acts is itself a random set. Moreover, we observe that the cycles of $\pi^U_\infty$ appear in decreasing order of their smallest element, i.e. in the reverse order with respect to the usual description of the Feller coupling. If we look at the sequence of permutations $(\pi^U_n)_{n \ge 1}$, we get a coupling of permutations of different orders, which has the property noted in the analysis of [1, §3], that "the cycles are built and completed one by one, in contrast to the Chinese Restaurant Process", with reference to the alternative construction of cycle-consistent random permutations of $[n]$ discussed in [1, §2], and [16]. If the sequence of variables $(U_i)_{i \ge 1}$ is distributed according to $P_\theta$, then by Theorem 1.3, the cycle structure of $\pi^U_\infty$ is directly given by a Poisson point process on $\bigcup_{\ell=1}^{\infty} [0, 1]^\ell$ with mean measure $\theta\mu(\bullet)$. In particular, the numbers of cycles of length $\ell$, for $\ell \ge 1$, are independent Poisson random variables with parameters $\theta/\ell$, which generalizes the case $\theta = 1$ studied by Ignatov.
If we consider, as at the end of the introduction, the dynamical version of our construction, where all the values of $\theta > 0$ are coupled together, then the evolution of the cycle structure of $\pi^U_\infty$ when $\theta$ varies is easy to describe in terms of Poisson processes, contrary to the case where we consider permutations of finite order. In particular, the set of cycles of the permutation corresponding to $\theta = \theta_1 + \theta_2$ has the same law as the union of two independent sets of cycles, corresponding to the parameters $\theta = \theta_1$ and $\theta = \theta_2$.

Connection with work by Shepp and Lloyd
In this section, we connect Theorem 1.3 to a model for a random permutation $\pi$ of a set of random size $N$, first introduced in the work of Shepp and Lloyd [20] on the distribution of the lengths of the longest and shortest cycles of a uniform random permutation. In the Shepp and Lloyd model, $N$ is assigned the geometric($p$) distribution $P(N \ge n) = (1 - p)^n$ for $n \ge 0$. In a subsequent paper [4], Balakrishnan, Sankaranarayanan, and Suyambulingom extended the model of Shepp and Lloyd to a much more general model of random permutations $\pi$ of a set of random size $N$. For a particular choice of parameters, which was not singled out for special discussion in [4], the model of [4] assigns $N$ a negative binomial distribution, as in the following Corollary, and given $N = n$ the permutation $\pi$ is governed by the Ewens($\theta$) distribution. See also [9], [12], [11] (Lemma 2.1), [21] (Theorem 2) and [14] for variants of this result with different interpretations, and further references.

Corollary 4.1 ([20], [4], [9], [12]). Let $\theta > 0$ and $p \in (0, 1)$, and let $N(\theta, p)$ denote a random variable with the negative binomial($\theta$, $p$) distribution:
$$P(N(\theta, p) = n) = \frac{\Gamma(\theta + n)}{n!\,\Gamma(\theta)}\, p^\theta (1 - p)^n$$
for $n \ge 0$, which implies that $E N(\theta, p) = \theta(1 - p)/p$. Let $\pi$ be a permutation of random order, such that conditionally given $N(\theta, p) = n$, $\pi$ has order $n$ and is distributed according to the Ewens($\theta$) measure. Then the numbers of cycles of $\pi$ of the different lengths $\ell \ge 1$ are independent Poisson variables with parameters $\theta(1 - p)^\ell/\ell$.
Proof. Let us consider, under $P_\theta$, the permutation $\pi_n$ of random order, $n + 1$ being the first index such that $U_{n+1} < p$. If we condition on the value of this index and on the order statistics of $(U_1, \ldots, U_n)$, we get, from the expression of the density $dP_\theta/dP_1$, a permutation $\pi_n$ following the Ewens distribution of parameter $\theta$. On the other hand, Theorem 1.3 implies that the numbers of cycles of different sizes in $\pi_n$ are independent Poisson variables, the expectation of the number of $\ell$-cycles being
$$\theta\mu([p, 1]^\ell) = \theta(1 - p)^\ell/\ell.$$
Hence, the corollary is proven if we show that the law of the size $n$ of the permutation is negative binomial($\theta$, $p$). Since the cycle lengths form a Poisson process with intensity proportional to $\theta$ when $p$ is fixed, the law of the size of the permutation as a function of $\theta$ corresponds to the marginals of a Lévy process. This is also the case for the negative binomial distribution, so it is enough to check that for $\theta = 1$, $n$ is geometrically distributed with parameter $p$. This fact is immediate since $n + 1$ is the first time when an i.i.d. sequence of uniform variables on $[0, 1]$ hits the interval $[0, p]$.
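The identification of the law of the size in the proof above can also be checked numerically. The following Python sketch is an illustration that is not part of the original article: the function names are hypothetical, and the recursion used for the law of $\sum_\ell \ell K_\ell$ is the standard one for compound Poisson sums, $n\,P(N = n) = \sum_{\ell=1}^{n} \ell\lambda_\ell\, P(N = n - \ell)$ with $\lambda_\ell = \theta(1 - p)^\ell/\ell$, which is an ingredient assumed here rather than taken from the text.

```python
import math

def negative_binomial_pmf(n, theta, p):
    """P(N = n) = Gamma(theta + n) / (n! Gamma(theta)) * p**theta * (1 - p)**n."""
    log_pmf = (math.lgamma(n + theta) - math.lgamma(n + 1) - math.lgamma(theta)
               + theta * math.log(p) + n * math.log(1.0 - p))
    return math.exp(log_pmf)

def size_pmf_from_cycle_counts(n_max, theta, p):
    """Law of N = sum_l l * K_l for n = 0, ..., n_max, where the K_l are
    independent Poisson(theta * (1 - p)**l / l) variables."""
    q = 1.0 - p
    # P(N = 0) = exp(-sum_l theta * q**l / l) = exp(theta * log(p)) = p**theta
    pmf = [p ** theta]
    for n in range(1, n_max + 1):
        # compound Poisson recursion, using l * lambda_l = theta * q**l
        total = sum(theta * q ** l * pmf[n - l] for l in range(1, n + 1))
        pmf.append(total / n)
    return pmf

# Usage: the two columns should agree up to floating point error.
theta, p = 2.5, 0.3
for n, value in enumerate(size_pmf_from_cycle_counts(5, theta, p)):
    print(n, value, negative_binomial_pmf(n, theta, p))
```

The same Poisson means give $E\,N = \sum_{\ell \ge 1} \ell \cdot \theta(1 - p)^\ell/\ell = \theta(1 - p)/p$, in agreement with the mean stated in Corollary 4.1.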
The proof above is related to the fact that if $K_\ell(\pi)$ is the number of $\ell$-cycles of $\pi$, then $N(\theta, p) = \sum_{\ell=1}^{\infty} \ell\, K_\ell(\pi)$ [...] $K$ of cycles of a uniform random permutation $\pi_n$ of $[n]$: [...] by use of his coupling with Bernoulli($1/i$) variables for $1 \le i \le n$. This comes immediately after discussion of the compound Poisson representation of the negative binomial distribution, but without the connection indicated in Corollary 4.1. This model for constructing a negative binomial variable from independent Poisson counts of cycles of a random permutation of random size is also not mentioned in the otherwise very comprehensive account [2] of models related to the Ewens sampling formula.