Record indices and age-ordered frequencies in Exchangeable Gibbs Partitions

The frequencies $X_1, X_2, \ldots$ of an exchangeable Gibbs random partition $\Pi$ of $\mathbb{N} = \{1, 2, \ldots\}$ (Gnedin and Pitman (2005)) are considered in their age-order, i.e. their size-biased order. We study their dependence on the sequence $i_1, i_2, \ldots$ of least elements of the blocks of $\Pi$. In particular, conditioning on $1 = i_1 < i_2 < \ldots$, a representation is shown to be
$$X_j = \xi_{j-1} \prod_{i=j}^{\infty} (1 - \xi_i), \qquad j = 1, 2, \ldots,$$
with the convention $\xi_0 \equiv 1$, where $\{\xi_j : j = 1, 2, \ldots\}$ is a sequence of independent Beta random variables. Sequences with such a product form are called neutral to the left. We show that the property of conditional left-neutrality in fact characterizes the Gibbs family among all exchangeable partitions, and leads to further interesting results on: (i) the conditional Mellin transform of $X_k$, given $i_k$, and (ii) the conditional distribution of the first $k$ normalized frequencies, given $\sum_{j=1}^{k} X_j$ and $i_k$; the latter turns out to be a mixture of Dirichlet distributions. Many of the mentioned representations are extensions of Griffiths and Lessard (2005) results on Ewens' partitions.
Keywords: Beta-Stacy distribution, Neutral distributions, Record indices.


1 Introduction.
A random partition of $[n] = \{1, \ldots, n\}$ is a random collection $\Pi_n = \{\Pi_{n1}, \ldots, \Pi_{nK_n}\}$ of disjoint nonempty subsets of $[n]$ whose union is $[n]$. The $K_n$ classes of $\Pi_n$, where $K_n$ is a random integer in $[n]$, are conventionally ordered by their least elements $1 = i_1 < i_2 < \ldots < i_{K_n} \le n$. We call $\{i_j\}$ the sequence of record indices of $\Pi_n$, and define the age-ordered frequencies of $\Pi_n$ to be the vector $\mathbf{n} = (n_1, \ldots, n_k)$ such that $n_j$ is the cardinality of $\Pi_{nj}$. Consistent Markov partitions $\Pi = (\Pi_1, \Pi_2, \ldots)$ can be generated by a set of predictive distributions specifying, for each $n$, how $\Pi_{n+1}$ is likely to extend $\Pi_n$, that is: given $\Pi_n$, a conditional probability is assigned for the integer $(n + 1)$ to join any particular class of $\Pi_n$ or to start a new class.
Every partition $\Pi$ of $\mathbb{N}$ so generated is called an exchangeable Gibbs partition with parameters $(\alpha, V)$ (EGP$(\alpha, V)$), where exchangeable means that, for every $n$, the distribution of $\Pi_n$ is a symmetric function of the vector $\mathbf{n} = (n_1, \ldots, n_k)$ of its frequencies ([25]; see section 2 below). The whole family of EGPs treated in [14] also includes the value $\alpha = -\infty$, for which the definition (1)-(2) should be modified; this case will not be treated in the present paper.
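A special subfamily of EGPs is Pitman's two-parameter family, for which $V$ is given by
$$V_{n,k} = \frac{\prod_{j=1}^{k} \bigl(\theta + \alpha(j-1)\bigr)}{\theta_{(n)}}, \qquad (4)$$
where either $\alpha \in [0, 1]$ and $\theta \ge -\alpha$, or $\alpha < 0$ and $\theta = m|\alpha|$ for some integer $m$. Here and in the following sections, $a_{(x)}$ will denote the generalized increasing factorial, i.e. $a_{(x)} = \Gamma(a + x)/\Gamma(a)$, where $\Gamma(\cdot)$ is the Gamma function.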
Pitman's family is characterized as the unique class of EGPs with $V$-coefficients of the form
$$V_{n,k} = \frac{V^*_k}{c_n}$$
for some sequence of constants $(c_n)$ ([14], Corollary 4). If we let $\alpha = 0$ in (4), we obtain the well-known Ewens' partition, for which
$$V_{n,k} = \frac{\theta^k}{\theta_{(n)}}. \qquad (5)$$
Ewens' family arose in the context of Population Genetics to describe the properties of a population of genes under the so-called infinitely-many-alleles model with parent-independent mutation (see e.g. [32], [21]) and became a paradigm for the modern developments of a theory of exchangeable random partitions ([1], [25], [11]).
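As a quick numerical illustration of the two-parameter coefficients, the following Python sketch evaluates $V_{n,k}$ in the product form $\prod_{j=1}^{k}(\theta + \alpha(j-1))/\theta_{(n)}$, an assumed standard parametrization that requires $\theta > 0$ here, and checks it against the backward recursion $V_{n,k} = (n - k\alpha)V_{n+1,k} + V_{n+1,k+1}$ satisfied by Gibbs coefficients in [14]. The function names are illustrative only and do not come from the paper.

```python
def rising(a: float, m: int) -> float:
    """Increasing factorial a_(m) = Gamma(a + m) / Gamma(a), for integer m >= 0."""
    out = 1.0
    for i in range(m):
        out *= a + i
    return out


def v_two_parameter(n: int, k: int, alpha: float, theta: float) -> float:
    """Assumed form of the two-parameter V_{n,k}; requires theta > 0."""
    num = 1.0
    for j in range(1, k + 1):
        num *= theta + alpha * (j - 1)
    return num / rising(theta, n)


def recursion_gap(n: int, k: int, alpha: float, theta: float) -> float:
    """|V_{n,k} - ((n - k*alpha) V_{n+1,k} + V_{n+1,k+1})|, expected to be ~0."""
    lhs = v_two_parameter(n, k, alpha, theta)
    rhs = ((n - k * alpha) * v_two_parameter(n + 1, k, alpha, theta)
           + v_two_parameter(n + 1, k + 1, alpha, theta))
    return abs(lhs - rhs)


if __name__ == "__main__":
    alpha, theta = 0.3, 1.5
    worst = max(recursion_gap(n, k, alpha, theta)
                for n in range(1, 12) for k in range(1, n + 1))
    print(f"largest recursion gap: {worst:.2e}")
    # Ewens' case (alpha = 0) should reduce to theta^k / theta_(n).
    print(v_two_parameter(4, 2, 0.0, theta), theta ** 2 / rising(theta, 4))
```

Setting `alpha = 0.0` gives the Ewens coefficients, so the same helper covers both cases discussed above.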
For every fixed $\alpha$, the set of all EGP$(\alpha, V)$ is convex; Gnedin and Pitman proved this and gave a complete description of its extreme points ([14], Theorem 12). In particular, for every $\alpha \le 0$ the extreme set is given by partitions from Pitman's two-parameter family. For each $\alpha \in (0, 1)$, the extreme points are all partitions of the so-called Poisson-Kingman type with parameters $(\alpha, s)$, $s > 0$, whose $V$-coefficients $V_{n,k}(s)$ are given by (6), with $G_\alpha(q, t)$ defined in (7) in terms of an $\alpha$-stable density $f_\alpha$ ([29], Theorem 4.5). The partition induced by (6) has limit frequencies (ranked in decreasing order) equal in distribution to the jump sizes of the process $(S_t/S_1 : t \in [0, 1])$, conditioned on $S_1 = s$, where $(S_t : t > 0)$ is a stable subordinator with density $f_\alpha$. The parameter $s$ can be interpreted as the (a.s.) limit of the ratio $K_n/n^\alpha$ as $n \to \infty$, where $K_n$ is the number of classes in the partition $\Pi_n$ generated via $V_{n,k}(s)$. $K_n$ is shown in [14] to play a central role in determining the extreme set of $V_{n,k}$ for every $\alpha$; the distribution of $K_n$, for every $n$, turns out to be of the form
$$P(K_n = k) = V_{n,k} \left[{n \atop k}\right]_\alpha, \qquad (8)$$
where $\left[{n \atop k}\right]_\alpha$ are generalized Stirling numbers, defined as the coefficients of $x^n$ in
$$\frac{n!}{\alpha^k k!} \bigl(1 - (1 - x)^\alpha\bigr)^k$$
(see [14] and references therein). As $n \to \infty$, $K_n$ behaves differently for different choices of the parameter $\alpha$: almost surely it will be finite for $\alpha < 0$, $K_n \sim S \log n$ for $\alpha = 0$ and $K_n \sim S n^\alpha$ for positive $\alpha$, for some positive random variable $S$.
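To make the law of $K_n$ concrete, the sketch below tabulates generalized Stirling numbers through the triangular recursion $S(n+1, k) = (n - k\alpha)S(n, k) + S(n, k-1)$, $S(1,1) = 1$, and multiplies them by two-parameter $V$-coefficients. Both the recursion (which follows from the prediction rule of a Gibbs partition) and the product form used for $V_{n,k}$ are assumptions of this illustration rather than formulas quoted from the text; all names are hypothetical.

```python
def generalized_stirling(n_max: int, alpha: float) -> dict:
    """Table S[(n, k)] from S(n+1, k) = (n - k*alpha) S(n, k) + S(n, k-1)."""
    S = {(1, 1): 1.0}
    for n in range(1, n_max):
        for k in range(1, n + 2):
            stay = (n - k * alpha) * S.get((n, k), 0.0)  # no new class at step n+1
            grow = S.get((n, k - 1), 0.0)                # a new class is started
            S[(n + 1, k)] = stay + grow
    return S


def v_two_parameter(n: int, k: int, alpha: float, theta: float) -> float:
    """Assumed two-parameter V_{n,k} (theta > 0)."""
    num = 1.0
    for j in range(1, k + 1):
        num *= theta + alpha * (j - 1)
    den = 1.0
    for i in range(n):
        den *= theta + i
    return num / den


def law_of_K(n: int, alpha: float, theta: float) -> list:
    """P(K_n = k) = V_{n,k} * S(n, k) for k = 1, ..., n."""
    S = generalized_stirling(n, alpha)
    return [v_two_parameter(n, k, alpha, theta) * S[(n, k)]
            for k in range(1, n + 1)]


if __name__ == "__main__":
    probs = law_of_K(10, 0.3, 1.5)
    print(sum(probs))  # should be 1 up to rounding error
```

For $\alpha = 0$ the table reduces to unsigned Stirling numbers of the first kind, which gives a simple sanity check against Ewens' partition.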
In this paper we study how the distribution of the limit age-ordered frequencies $X_j = \lim_{n\to\infty} n_j/n$ ($j = 1, 2, \ldots$) in an Exchangeable Gibbs partition depends on its record indices $\mathbf{i} = (1 = i_1 < i_2 < \ldots)$. To this end, we adopt a combinatorial approach proposed by Griffiths and Lessard [15] to study the distribution of the age-ordered allele frequencies $X_1, X_2, \ldots$ in a population corresponding to the so-called coalescent process with mutation (see e.g. [32]), whose equilibrium distribution is given by Ewens' partition (5), for some mutation parameter $\theta > 0$. In such a context, the record index $i_j$ can be interpreted as the number of ancestral lineages surviving back in the past, just before the last gene of the $j$-th oldest type, observed in the current generation, is lost by mutation.
Following Griffiths and Lessard's steps we will (i) find, for every $n$, the distribution of the age-ordered frequencies $\mathbf{n} = (n_1, \ldots, n_k)$, conditional on the record indices $\mathbf{i}_n = (1 = i_1 < i_2 < \ldots < i_k)$ of $\Pi_n$, as well as the distribution of $\mathbf{i}_n$; (ii) take their limits as $n \to \infty$; (iii) for $m = 1, 2, \ldots$, describe the distribution of the $m$-th age-ordered frequency conditional on $i_m$ alone. We will follow such steps, respectively, in sections 3.1, 3.2, 4. In addition, we will derive in section 5 a representation for the distribution of the first $k$ age-ordered frequencies, conditional on their cumulative sum and on $i_k$.
In section 3.2 we stress the connection between the representation (9) and a wide class of random discrete distributions, known in the literature of Bayesian nonparametric statistics as Neutral to the Left (NTL) processes ([10], [5]), and use this connection to show that the structure (9) with independent $\{\xi_j\}$ actually characterizes EGPs among all exchangeable partitions of $\mathbb{N}$.
The representation (9) is useful to find the moments of both $X_j$ and $\sum_{i=1}^{j} X_i$, conditional on the $j$-th record index $i_j$ alone, as shown in section 4. In the same section a recursive formula is found for the Mellin transform of both random quantities, in terms of the Mellin transform of the size-biased pick $X_1$.
Finally, in section 5 we obtain an expression for the density of the first $k$ age-ordered frequencies $X_1, \ldots, X_k$, conditional on $\sum_{i=1}^{k} X_i$ and $i_k$, as a mixture of Dirichlet distributions on the $(k-1)$-dimensional simplex ($k = 1, 2, \ldots$). Such a result leads to a self-contained proof for the marginal distribution of $i_k$, whose formula is closely related to Gnedin and Pitman's result (8).
To complete the picture, note that a representation of the unconditional distribution of the age-ordered frequencies of an EGP can be derived as a mixture of the age-ordered distributions of its extreme points, which are known: for $\alpha \le 0$, the extreme age-ordered distribution is the celebrated two-parameter GEM distribution ([25], [26]), for which
$$X_j = B_j \prod_{i=1}^{j-1} (1 - B_i), \qquad j = 1, 2, \ldots,$$
for a sequence $(B_j : j = 1, 2, \ldots)$ of independent Beta random variables with parameters, respectively, $\{(1 - \alpha, \theta + j\alpha) : j = 1, 2, \ldots\}$. Such a representation reflects a property of right-neutrality, which in a sense is the inverse of (9), as will be clear in section 3.2. When $\alpha$ is strictly positive, the age-ordered frequencies of the extreme points lose such a simple structure. A description is available in [24].
We want to embed Griffiths and Lessard's method in the general setting of Pitman's theory of exchangeable and partially exchangeable random partitions, for which our main reference is [25]. Pitman's theory will be summarized in section 2. The key role played by record indices in the study of random partitions has been emphasized by several authors, among them Kerov [19], Kerov and Tsilevich [20], and more recently Gnedin [13] and Nacu [23], who showed that the law of a partially exchangeable random partition is completely determined by that of its record indices. We are indebted to an anonymous referee for pointing out the last two references, whose findings have an intrinsic connection with many formulae in our section 2.
2 Exchangeable and partially exchangeable random partitions.
If $q_\mu(n_1, \ldots, n_k)$ is symmetric with respect to permutations of its arguments, it is called an exchangeable partition probability function (EPPF), and the corresponding partition $\Pi_n$ an exchangeable random partition.
A minimal sufficient statistic for an exchangeable $\Pi_n$ is given, because of the symmetry of its EPPF, by its unordered frequencies (i.e. the counts of how many frequencies in $\Pi_n$ are equal to $1, \ldots, n$), whose distribution is given by their (unordered) sampling formula (14), where $\binom{|\mathbf{n}|}{\mathbf{n}} = |\mathbf{n}|! / \prod_{j=1}^{k} n_j!$ and $b_i$ is the number of $n_j$'s in $\mathbf{n}$ equal to $i$ ($i = 1, \ldots, n$).
It is easy to see that for a Ewens' partition (whose EPPF is (13) with α = 0 and V given by (5)), formula (14) returns the celebrated Ewens' sampling formula.
The distribution of the age-ordered frequencies differs from (14) only by a counting factor, where is the distribution of the size-biased permutation of n.
The notion of PEPF gives a generalized version of Hoppe's urn scheme, i.e. a predictive distribution for (the age-ordered frequencies of) $\Pi_{n+1}$, given (those of) $\Pi_n$. In an urn of Hoppe's type there are colored balls and a black ball. Every time we draw the black ball, we return it to the urn together with a new ball of a new distinct color. Otherwise, we add to the urn a ball of the same color as the ball just drawn.
Pitman's extended urn scheme works as follows. Let $q$ be a PEPF, and assume that initially in the urn there is only the black ball. Label with $j$ the $j$-th distinct color appearing in the sample. After $n \ge 1$ samples, suppose we have put in the urn $\mathbf{n} = (n_1, \ldots, n_k)$ balls of colors $1, \ldots, k$, respectively, with colors labeled by their order of appearance. The probability that the next ball is of color $j$ is given by (17), where $e_j = (\delta_{ij} : i = 1, \ldots, k)$ and $\delta_{xy}$ is the Kronecker delta. The event $(j = k + 1)$, in the last term of the right-hand side of (17), corresponds to a new distinct color being added to the urn.
The predictive distribution of a Gibbs partition is obtained from its EPPF by substituting (13) into (17), which gives back our definition (1)-(2) of an EGP.
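The urn scheme can be simulated directly. The sketch below assumes the usual Gibbs prediction rule, in which the integer $n+1$ starts a new class with probability $V_{n+1,k+1}/V_{n,k}$ and joins the $j$-th existing class with probability $(n_j - \alpha)V_{n+1,k}/V_{n,k}$; it returns the age-ordered frequencies and the record indices of $\Pi_n$. The two-parameter coefficients are used only as a convenient example, and every function name is hypothetical.

```python
import random


def v_two_parameter(n: int, k: int, alpha: float, theta: float) -> float:
    """Assumed two-parameter V_{n,k} (theta > 0); adequate for moderate n only."""
    num = 1.0
    for j in range(1, k + 1):
        num *= theta + alpha * (j - 1)
    den = 1.0
    for i in range(n):
        den *= theta + i
    return num / den


def gibbs_urn(n: int, alpha: float, V, rng: random.Random):
    """Return (counts, records): age-ordered class sizes and record indices of Pi_n."""
    counts, records = [1], [1]          # integer 1 always starts the first class
    for m in range(1, n):
        k = len(counts)
        p_new = V(m + 1, k + 1) / V(m, k)
        if rng.random() < p_new:
            counts.append(1)            # integer m + 1 is a new record index
            records.append(m + 1)
        else:
            # Given that no new class starts, class j is chosen with
            # probability proportional to n_j - alpha.
            weights = [c - alpha for c in counts]
            j = rng.choices(range(k), weights=weights)[0]
            counts[j] += 1
    return counts, records


if __name__ == "__main__":
    alpha, theta = 0.3, 1.5
    V = lambda n, k: v_two_parameter(n, k, alpha, theta)
    counts, records = gibbs_urn(50, alpha, V, random.Random(0))
    print(counts, records)
```

Passing a different callable `V` gives other members of the Gibbs family, provided the coefficients satisfy the consistency recursion so that the two kinds of probabilities add up to one.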
The use of an urn scheme of the form (1)-(2) in Population Genetics is due to Hoppe [16] in the context of Ewens' partitions (infinitely-many-alleles model), for which the connection between order of appearance in a sample and age-order of alleles is shown by [8]. In [10] an extended version of Hoppe's approach is suggested for more complicated, still exchangeable population models (where e.g. mutation can be recurrent). Outside Population Genetics, the use of (1)-(2) for generating trees leading to Pitman's two-parameter GEM frequencies can be found in the literature of random recursive trees (see e.g. [7]). Urn schemes of the form (17) are a natural tool to express one's a priori opinions in a Bayesian statistical context, as pointed out by [33] and [27]. Examples of recent applications of Exchangeable Gibbs partitions in Bayesian nonparametric statistics are in [22], [17]. The connection between (not necessarily infinite) Gibbs partitions and coagulation-fragmentation processes is explored by [3] (see also [2] and references therein).

2.1 Distribution of record indices in partially exchangeable partitions.
Let $\Pi$ be a partially exchangeable random partition. Since, for every $n$, its collection of age-ordered frequencies $\mathbf{n} = (n_1, \ldots, n_k)$ is a sufficient statistic for $\Pi_n$, all realizations $\pi_n$ with the same $\mathbf{n}$ and the same record indices $\mathbf{i}_n = (1 = i_1 < i_2 < \ldots < i_k \le n)$ must have equal probability.
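To evaluate the joint probability of the pair $(\mathbf{n}, \mathbf{i}_n)$, we only need to replace $a(\mathbf{n})$ in (15) by an appropriate counting factor. This is equal to the number of arrangements of $n$ balls, labelled from 1 to $n$, in $k$ boxes with the constraint that exactly $n_j$ balls fall in the same box as the ball $i_j$. Such a number was shown by [15] to be equal to
$$a(\mathbf{n}, \mathbf{i}_n) = \prod_{j=1}^{k} \binom{S_j - i_j}{n_j - 1}, \qquad S_j := \sum_{i=1}^{j} n_i.$$
Hence, if $\Pi$ is a partially exchangeable random partition with PEPF $q$, then the joint probability of age-ordered frequencies and record indices is
$$\bar\mu(\mathbf{n}, \mathbf{i}_n) = a(\mathbf{n}, \mathbf{i}_n)\, q(\mathbf{n}). \qquad (19)$$
The distribution of the record indices can be easily derived by marginalizing:
$$P(\mathbf{i}_n) = \sum_{\mathbf{n} \in B_n(\mathbf{i}_n)} a(\mathbf{n}, \mathbf{i}_n)\, q(\mathbf{n}),$$
where $B_n(\mathbf{i}_n) = \{(n_1, \ldots, n_k) : \sum_{j=1}^{k} n_j = n,\ S_{j-1} \ge i_j - 1,\ j = 1, \ldots, k\}$ (with $S_0 = 0$) is the set of all possible $\mathbf{n}$ compatible with $\mathbf{i}_n$. In [15] such a formula is derived for the particular case of Ewens' partitions. For general random partitions see also [23], section 2.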
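The counting factor can be checked by brute force for small $n$. The sketch below assumes it takes the product form $\prod_{j=1}^{k}\binom{S_j - i_j}{n_j - 1}$ with $S_j = n_1 + \cdots + n_j$ (a reconstruction for illustration, not quoted from [15]) and compares it with a direct enumeration of all set partitions of $[n]$, grouped by age-ordered frequencies and record indices. Function names are hypothetical.

```python
from collections import Counter
from math import comb


def set_partitions(n: int):
    """Yield all partitions of {1, ..., n}, blocks ordered by least element."""
    def extend(m, blocks):
        if m > n:
            yield [list(b) for b in blocks]
            return
        for b in blocks:                 # put m into an existing block
            b.append(m)
            yield from extend(m + 1, blocks)
            b.pop()
        blocks.append([m])               # or let m start a new block
        yield from extend(m + 1, blocks)
        blocks.pop()
    yield from extend(1, [])


def counting_factor(sizes, records) -> int:
    """Assumed form: prod_j C(S_j - i_j, n_j - 1); zero if incompatible."""
    out, S = 1, 0
    for n_j, i_j in zip(sizes, records):
        S += n_j
        out *= comb(S - i_j, n_j - 1) if S >= i_j else 0
    return out


def brute_force_counts(n: int) -> Counter:
    """Count partitions of [n] by (age-ordered sizes, record indices)."""
    tally = Counter()
    for blocks in set_partitions(n):
        sizes = tuple(len(b) for b in blocks)
        records = tuple(b[0] for b in blocks)   # least element of each block
        tally[(sizes, records)] += 1
    return tally


if __name__ == "__main__":
    tally = brute_force_counts(6)
    ok = all(counting_factor(s, r) == c for (s, r), c in tally.items())
    print("formula matches enumeration:", ok)
```

Because the enumeration grows like the Bell numbers, this check is only practical for small $n$.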
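Notice that, for every $\mathbf{n}$ such that $|\mathbf{n}| = n$, $C(\mathbf{n}) = \{(1 = i_1 < i_2 < \ldots < i_k) : i_j \le S_{j-1} + 1,\ j = 2, \ldots, k\}$, with $S_j = \sum_{i=1}^{j} n_i$, is the set of all possible $\mathbf{i}_n$ compatible with $\mathbf{n}$. Then the marginal distribution of the age-ordered frequencies (15) is recovered by summing (19) over $C(\mathbf{n})$.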
This observation incidentally links a classical combinatorial result to partially exchangeable random partitions.
Proposition 2.1. (i) Given the frequencies $\mathbf{n} = (n_1, \ldots, n_k)$ in age-order, the probability that the least elements of the classes of $\Pi_n$ are $\mathbf{i}_n = (i_1, \ldots, i_k)$ does not depend on $q_\mu$ and is given by (21).
(ii) Let $W_j = \lim_{n\to\infty} S_j/n$. Conditional on $\{W_j : j = 1, 2, \ldots\}$, the waiting times between the $(j-1)$-th and the $j$-th record index ($j = 2, 3, \ldots$) are independent geometric random variables, each with parameter $(1 - W_{j-1})$, respectively.
Proof. Part (i) can be obtained by a manipulation of a standard result on uniform random permutations of [n]. Part two can be proved by using a representation theorem due to Pitman ([25], Theorem 8). We prefer to give a direct proof of both parts to make clear their connection. Simply notice that, for every n and i, the right-hand side of (21) is equal to a(n, i n )/a(n). Then, for every n, whereμ(n) is as in (15), hence P(i n |n) is a regular conditional probability and (i) is proved. Now, consider the set Also define, for j = 1, . . . , k − l + 1, n * j := S * j − S * j−1 with S * j := S j+l−1 − i l + 1, and i * j := i j+l−1 − i l + 1. Then C [i l ,l] (n) = C(n * ) so that, for a fixed l ≤ k, the conditional probability of i 2 , . . . , i l , given n = (n 1 , . . . , n k ), is where n * = n − i l + 1. The sum in (22) is 1; multiply and divide the remaining part by The probability can therefore be rewritten as which is the distribution of k − 1 independent geometric random variables, each with parameter (1 − W j ), and the proof is complete.
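Part (ii) suggests a simple way to generate record indices once the cumulative frequencies are given. The sketch below assumes that the waiting time $i_j - i_{j-1}$ takes values in $\{1, 2, \ldots\}$ and is geometric with success probability $1 - W_{j-1}$, which is one reading of the statement; the function name and the example values of $W$ are hypothetical.

```python
import random


def record_indices_given_W(W, rng: random.Random):
    """Sample i_1 < i_2 < ... < i_k given W = [W_1, ..., W_{k-1}], each in [0, 1)."""
    records = [1]                        # i_1 = 1 by definition
    for w in W:
        wait = 1
        # Each new integer falls in an already discovered class with
        # probability w, so a new class needs a Geometric(1 - w) number of draws.
        while rng.random() < w:
            wait += 1
        records.append(records[-1] + wait)
    return records


if __name__ == "__main__":
    W = [0.4, 0.7, 0.85, 0.95]           # illustrative cumulative frequencies
    print(record_indices_given_W(W, random.Random(1)))
```

Larger values of $W_{j-1}$ make new classes rarer, so the gaps between successive record indices tend to grow with $j$.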
3 Age-ordered frequencies conditional on the record indices in Exchangeable Gibbs partitions.

3.1 Conditional distribution of sample frequencies.
From now on we will focus only on EGP(α, V ). We have seen that the conditional distribution of the record indices, given the age-ordered frequencies of a partially exchangeable random partition, is purely combinatorial as it does not depend on its PEPF. We will now find the conditional distribution of the age-ordered frequencies n given the record indices, i.e. the step (i) of the plan outlined in the introduction. We show that such a distribution does not depend on the parameter V , which in fact affects only the marginal distribution of i n , as explained in the following Lemma.
Lemma 3.1. For each $n$, the probability that the record indices in $\Pi_n$ are $\mathbf{i}_n = (i_1, \ldots, i_k)$ is given by (24)-(25).
The sequence i 1 , i 2 , . . . forms a non-homogeneous Markov chain starting at i 1 = 1 and with transition probabilities given by Proof. The proof can be carried out by using the urn scheme (18). For every n, let K n be the number of distinct colors which appeared before the n + 1-th ball was picked. From (18), the sequence (K n : n ≥ 1) starts from K 1 = 1 and obeys, for every n, the prediction rule: By definition, K n jumps at points 1 < i 2 < . . ., due to the equivalence Thus, for every n, k ≤ n every sequence i n = (1 = i 1 < . . . < i k ≤ n) corresponds to a sequence (k 1 , . . . , k n ) such that k i j = j, j = 1, . . . , k and k m = k m−1 ∀m ∈ [n] : m / ∈ i n . From (27), The last product in (28) is equal to and this proves (24). The second part of the Lemma (i.e. the transition probabilities (26)) follow immediately just by replacing, in (24), n with i k , for every k, to show that for P j satisfying (26) for every j.
The distribution of the age-ordered frequencies in an EGP Π n , conditional on the record indices, can be easily obtained from Lemma 3.1 and (19).
Proposition 3.1. Let $\Pi = (\Pi_n)$ be an EGP$(\alpha, V)$, for some $\alpha \in (-\infty, 1)$ and $V = (V_{n,k} : k \le n = 1, 2, \ldots)$. For each $n$, the conditional distribution $\bar\mu_\alpha(\mathbf{n} \mid \mathbf{i}_n)$ of the sample frequencies $\mathbf{n}$ in age-order, given the vector $\mathbf{i}_n$ of indices, is independent of $V$ and is given by (30).
Remark 3.1. Notice that, as $\alpha \to 0$, formula (30) reduces to the law of cycle partitions of a permutation given the minimal elements of cycles, derived in the context of Ewens' partitions in [15].
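Proof. Recall that the probability of a pair $(\mathbf{n}, \mathbf{i}_n)$ is given by (19), with $q$ the Gibbs EPPF (13). The conditional distribution of a configuration given a sequence $\mathbf{i}_n$ then follows on dividing by the marginal probability (24) of $\mathbf{i}_n$, and the proof is complete.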

3.2 The distribution of the limit frequencies given the record indices.
We now have all elements to derive a representation for the limit relative frequencies in age-order, conditional on the limit sequence of record indices $\mathbf{i} = (i_1 < i_2 < \ldots)$ generated by an EGP$(\alpha, V)$.
Remark 3.2. Proposition 3.2 is a statement about a regular conditional distribution. The question of the existence of a limit conditional distribution of $X \mid \mathbf{i}$ as a function of $\mathbf{i} = \lim_n \mathbf{i}_n$ has a different answer according to the choice of $\alpha$, as a consequence of the limit behavior of $K_n$, the number of blocks of an EGP $\Pi_n$, as recalled in the introduction. For $\alpha < 0$, $\mathbf{i}$ is almost surely a finite sequence; for nonnegative $\alpha$, the length $k$ of $\mathbf{i}$ will be a.s. either $k \sim s \log n$ (for $\alpha = 0$) or $k \sim s n^\alpha$ (for $\alpha > 0$), for some $s \in [0, \infty]$. The infinite product representation (33) still holds in any case if we adopt the convention $i_k \equiv \infty$ for every $k > K_\infty$, where $K_\infty := \lim_{n\to\infty} K_n$.
Proof. The form (30) of the conditional density µ α (n|i n ) implies For some r < k, let a 2 , . . . , a r be positive integers and set a 1 = 0 and a r+1 = . . . = a k = 0.
Now take the sum (34) with i n replaced by i ′ n , and multiply it by ψ α,n (i n ). We obtain where the expectation is taken with respect toμ α (·|i n ). The left hand side of (35) is where ξ 1 , . . . , ξ k−1 are independent Beta random variables, each with parameters (1 − α, i j+1 − jα − 1).
Let b j = j i=1 a i . On the right hand side of (35), S 0 = 0, S k = n, so the product is equal to Since a j = 0 for j = 1 and j > r, as k, n → ∞ the product inside square brackets converges to 1 so the limit of (37) is where W j = lim n→∞ S j /n. Hence from (35) it follows that in the limit which gives the limit distribution of the cumulative sums: and the proof is complete.
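The conditional representation lends itself to direct simulation. The sketch below assumes, as in the proof above, independent $\xi_j \sim \mathrm{Beta}(1 - \alpha,\ i_{j+1} - j\alpha - 1)$ for $j = 1, \ldots, k-1$, together with the relations $X_{j+1} = \xi_j W_{j+1}$ and $W_j = (1 - \xi_j) W_{j+1}$; it returns the first $k$ age-ordered frequencies normalized by $W_k$. The function name and the example record indices are hypothetical.

```python
import random


def conditional_frequencies(records, alpha: float, rng: random.Random):
    """Sample (X_1/W_k, ..., X_k/W_k) given record indices i_1 < ... < i_k."""
    k = len(records)
    # records is 0-based, so records[j] holds i_{j+1}; xi[j-1] plays the role of xi_j.
    xi = [rng.betavariate(1.0 - alpha, records[j] - j * alpha - 1.0)
          for j in range(1, k)]
    W = [0.0] * (k + 1)
    X = [0.0] * (k + 1)
    W[k] = 1.0                           # work on the normalized scale W_k = 1
    for j in range(k - 1, 0, -1):
        X[j + 1] = xi[j - 1] * W[j + 1]
        W[j] = (1.0 - xi[j - 1]) * W[j + 1]
    X[1] = W[1]
    return X[1:]


if __name__ == "__main__":
    x = conditional_frequencies([1, 2, 5, 9], 0.3, random.Random(1))
    print(x, sum(x))
```

Since $i_{j+1} \ge j + 1$ and $\alpha < 1$, the second Beta parameter is always positive, so the sampler is well defined for any admissible sequence of record indices.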

3.3 Conditional Gibbs frequencies, Neutral distributions and invariance under size-biased permutation.
Proposition 3.2 says that, conditional on all the record indices $i_1, i_2, \ldots$, the sequence of relative increments of an EGP$(\alpha, V)$ is a sequence of independent coordinates. In fact, such a process can be interpreted as the negative, time-reversed version of a so-called Beta-Stacy process, a particular class of random discrete distributions, introduced in the context of Bayesian nonparametric statistics as a useful tool to make inference for right-censored data (see [30], [31] for a modern account). It is possible to show that such an independence property of the $\xi$ sequence (conditional on the indices) actually characterizes the family of EGP partitions. To make this statement precise we recall a concept of neutrality for random $[0, 1]$-valued sequences, introduced by Connor and Mosimann [4] in 1962 and refined in 1974 by Doksum [6] in the context of nonparametric inference and, more recently, by Walker and Muliere [31].
(i) Let $P = (P_1, P_2, \ldots, P_k)$ be a random point in $[0, 1]^k$ such that $|P| = \sum_{i=1}^{k} P_i \le 1$ and, for every $j = 1, \ldots, k - 1$, denote $F_j = \sum_{i=1}^{j} P_i$. $P$ is called a Neutral to the Right (NTR) sequence if the vector $(B_j : 1 \le j < k)$ of relative increments $B_j = P_j/(1 - F_{j-1})$ (with $F_0 = 0$) is a sequence of independent random variables in $[0, 1]$. An NTR vector $P$ such that $|P| = 1$ almost surely and, for every $j < k$, every increment $B_j$ is a Beta$(\alpha_j, \beta_j)$, is called a Beta-Stacy distribution with parameter $(\alpha, \beta)$.
(ii) A Neutral to the Left (NTL) vector $P = (P_1, P_2, \ldots, P_k)$ is a vector such that $P^* := (P_k, P_{k-1}, \ldots, P_1)$ is NTR. A Left-Beta-Stacy distribution is an NTL vector $P$ such that $P^*$ is Beta-Stacy.
A known result due to [26] is that the only class of exchangeable random partitions whose limit age-ordered frequencies are (unconditionally) a NTR distribution, is Pitman's two-parameter family, i.e. the EGP(α, V ) with V -coefficients given by (4). In this case, the age-ordered frequencies follow the so-called two-parameter GEM distribution, a special case of Beta-Stacy distribution with each B j being a Beta(1 − α, θ + jα) random variable. The age-ordered frequencies of all other Gibbs partitions are not NTR; on the other side, Proposition 3.2 shows that, conditional on the record indices i 1 , i 2 , . . ., and on W k they are all NTL distributions. For a fixed k set By construction, the sequence Y 1 , . . . , Y k is a Beta-Stacy sequence with parameters α k,j = 1 − α and β k, . The property of (conditional) left-neutrality is maintained as k → ∞ (just condition on W K∞ = 1 where K ∞ = lim n→∞ K n ).
The following proposition is a converse of Proposition 3.2.
Proposition 3.3. Let $X = (X_1, X_2, \ldots) \in \Delta$ be the age-ordered frequencies of an infinite exchangeable random partition $\Pi$ of $\mathbb{N}$. Assume that, conditionally on the record indices of $\Pi$, $X$ is an NTL sequence. Then $\Pi$ is an exchangeable Gibbs partition for some parameters $(\alpha, V)$.
Proof. The frequencies of an exchangeable random partition of N are in age-order if and only if their distribution is invariant under size-biased permutation (ISBP, see [9], [26], [12]). To prove the proposition, we combine two known results: the first is a characterization of ISBP distributions; the second is a characterization of the Dirichlet distribution in terms of NTR processes. We recall such results in two lemmas.

Lemma 3.2.
Invariance under size-biased permutation ( [26], Theorem 4). Let X be a random point of [0, 1] ∞ such that |X| = 1 almost surely with respect to a probability measure dµ. For every k, let µ k denote the distribution of X 1 , . . . , X k , and G k the measure on [0, 1] k , absolutely continuous with respect to µ k with density where w j = j i=1 x i , j = 1, 2, . . . X is invariant under size-biased permutation if and only if G k is symmetric with respect to permutations of the coordinates in R k .
Let X be the frequencies of an exchangeable partition Π, and denote with P µ the marginal law of the record indices of Π. Consider the measure G k of Lemma 3.2. By Proposition 2.1 (ii), for every k An equivalent characterization of ISBP measure is: The law of X is invariant under size-biased permutation if and only if, for every k, there is a version of the conditional distribution The other result we recall is about Dirichlet distributions. . Let P be a random k-dimensional vector with positive components such that their sum equals 1. If P is NTR and P k does not depend on (1 − P k ) −1 (P 1 , . . . , P k−1 ). Then P has the Dirichlet distribution.
Now we have all elements to prove Proposition 3.3. Let µ(·|i 1 , i 2 . . .) be the distribution of a NTL vector X such that the distribution of ξ j := X j+1 /W j+1 , (with W j = j i=1 X i ) has marginal law γ j for j = 1, 2, . . .. For every k, given i 1 , . . . , i k , the vector (X 2 /W 2 , . . . , X k /W k ) is conditionally independent of W k and where ζ k is the conditional law of W k given i k = k. For X to be ISBP, corollary 3.1 implies that the product must be a symmetric function of x 1 , . . . , x k . Then, for every k, the vector (X 1 /W k , . . . , X k /W k ) is both NTL and NTR, which implies in particular that X k /W k is independent of W −1 k−1 (X 1 , . . . , X k−1 ). Therefore, by Lemma 3.3 and symmetry, ( X 1 W k , . . . , X k W k ) is, conditionally on W k and {i k = k}, a symmetric Dirichlet distribution, with parameter, say, 1 − α > 0. By (40), the EPPF corresponding to dµ is equal to By the NTL assumption, we can write where S j = j i=1 n i (j = 1, . . . , k). The last equality is due to .
which completes the proof.

4 Age-ordered frequencies conditional on a single record index.
A representation for the Mellin transform of the $m$-th age-ordered cumulative frequency $W_m$, conditional on $i_m$ alone ($m = 1, 2, \ldots$), can be derived by using Proposition 3.2. We first point out a characterization of the moments of $W_m$, stated in the following Lemma.
Lemma 4.1. Let X 1 , X 2 , . . . be the limit age-ordered frequencies generated by a Gibbs partition with parameters α, V . For every m = 1, 2, . . . and nonnegative integer n and Proof. Let Π be an EGP (α, V ) and denote Y j : j = 1, 2, . . . the sequence of indicators {0, 1} such that Y j = 1 if j is a record index of Π. Then Y 1 = 1 and, for every l, m ≤ l, By proposition 2.1 and formula (13), given the cumulative frequencies W = W 1 , W 2 , . . ., (see also [25], Theorem 6). Obviously this also implies that, conditional on W , the random sequence K l := l i=1 Y i , (l = 1, 2, . . .) is Markov, so we can write, for every l, m which proves the proposition for m = 1. The Markov property of K n and (45) also lead, for every m, to where the last equality is obtained as an n-fold iteration of (46).
The second part of the Lemma (formula (44)) follows from Proposition 3.2: which combined with (43) completes the proof.
Given the coefficients {V n,k }, analogous formulas to (43) and (44) can be obtained to describe the conditional Mellin transforms of W m and X m (respectively), in terms of the Mellin transform of the size-biased pick X 1 .
When α > 0, we saw in the introduction that every extreme point in the Gibbs family is a Poisson-Kingman (α, s) partition for some s > 0; in this case the density of X 1 is for an α-stable density f α ( [28], (57)), leading to where G α is as in (7). Thus, for every α, the structural distribution of a Gibbs(α, V ) partition, which defines V 1,1 [φ] in (49), can be obtained as mixture of the corresponding extreme structural distributions.
Proof. Note that, for φ = 0, 1, 2, . . ., the proposition holds by Lemma 4.1 with V n,m [φ] ≡ V n+φ,m . For general φ ≥ 0 observe that, for every m, n ∈ N, To see this, consider the random sequences Y n , K n defined in the proof of Lemma 4.1. By (45), where the last two equalities follow from (45), the Markov property of K n and the exchangeability of the Y ′ s.

5 A representation for normalized age-ordered frequencies in an exchangeable Gibbs partition.
In this section we provide a characterization of the density of the first $k$ (normalized) age-ordered frequencies, given $i_k$ and $W_k$, and an explicit formula for the marginal distribution of $i_k$. We give a direct proof, obtained by comparison of the unconditional distribution of $X_1, \ldots, X_k$ ($k = 1, 2, \ldots$) in a general Gibbs partition with its analogue in Pitman's two-parameter model. Such a comparison is naturally induced by Proposition 3.2, which says that, conditional on the record indices, the distribution of the age-ordered frequencies is the same for every Gibbs partition. Recall that the limit (unconditional) age-ordered frequencies in the two-parameter family are described by the two-parameter GEM distribution, for which
$$X_j = B_j \prod_{i=1}^{j-1} (1 - B_i), \qquad j = 1, 2, \ldots,$$
for a sequence $B_1, B_2, \ldots$ of independent Beta random variables with parameters, respectively, $(1 - \alpha, \theta + j\alpha)$ (see e.g. [27]).
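For comparison with the conditional representation, the unconditional two-parameter GEM frequencies can be generated by the usual stick-breaking construction $X_j = B_j \prod_{i<j}(1 - B_i)$ with independent $B_j \sim \mathrm{Beta}(1 - \alpha, \theta + j\alpha)$. The sketch below assumes $0 \le \alpha < 1$ and $\theta > -\alpha$ so that both Beta parameters are positive; the function name is hypothetical.

```python
import random


def gem_frequencies(k: int, alpha: float, theta: float, rng: random.Random):
    """First k age-ordered frequencies of the two-parameter GEM(alpha, theta)."""
    X, remaining = [], 1.0
    for j in range(1, k + 1):
        b = rng.betavariate(1.0 - alpha, theta + j * alpha)
        X.append(b * remaining)          # X_j = B_j * prod_{i<j} (1 - B_i)
        remaining *= 1.0 - b             # mass left for the younger classes
    return X


if __name__ == "__main__":
    x = gem_frequencies(10, 0.3, 1.5, random.Random(0))
    print(x, sum(x))
```

Here the independent factors are read from the oldest class downwards (right-neutrality), whereas the conditional representation given the record indices builds the frequencies from the youngest class backwards.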
Remark 5.1. The density of i k can be expressed in terms of generalized Stirling numbers n k α , defined as the coefficients of x n in [14]). Formula (60) can be re-expressed as: which makes clear the connection between the distribution of i k and the distribution of K n recalled in the introduction (formula (8)). In fact, (60) can be deduced simply from (8) by the Markov property of the sequence K n as However here we give a self-contained proof in order to show how (60) is implied by proposition 3.2 through (59).
Proof. From (10), we know that By proposition 3.2 and Lemma 4.1, where the last equality follows after multiplying and dividing all terms by (1 − α) n 1 . Consequently, a moment formula for general Gibbs partitions is of the form: where, for 1 < j ≤ k − 1, .
Then (63) reads For Pitman's two-parameter family, this becomes The two-parameter GEM distribution implies that .
To prove part (ii), we only have to notice that, from (73) it must follow that hence a version of the marginal probability of i k is, for every k This can be also argued directly, simply by noting that, in an EGP(α, V ), and that V k+i,k V k+i−1,k−1 = P(i k = k + i|i k−1 ≤ k + i − 1, i k > k + i − 1).
We want to find an expression for the inner sum of (74). If we reconsider the term c(m) as in (70) (for m ∈ N k−1 0 : |m| = i), we see that