The Annals of Applied Probability

Ordered and size-biased frequencies in GEM and Gibbs’ models for species sampling

Jim Pitman and Yuri Yakubovich

We describe the distribution of frequencies ordered by sample values in a random sample of size $n$ from the two parameter $\mathsf{GEM}(\alpha,\theta)$ random discrete distribution on the positive integers. These frequencies are a (size-$\alpha$)-biased random permutation of the sample frequencies in either ranked order, or in the order of appearance of values in the sampling process. This generalizes a well-known identity in distribution due to Donnelly and Tavaré [Adv. in Appl. Probab. 18 (1986) 1–19] for $\alpha=0$ to the case $0\le\alpha<1$. This description extends to sampling from $\operatorname{Gibbs}(\alpha)$ frequencies obtained by suitable conditioning of the $\mathsf{GEM}(\alpha,\theta)$ model, and yields a value-ordered version of the Chinese restaurant construction of $\mathsf{GEM}(\alpha,\theta)$ and $\operatorname{Gibbs}(\alpha)$ frequencies in the more usual size-biased order of their appearance. The proofs are based on a general construction of a finite sample $(X_{1},\dots,X_{n})$ from any random frequencies in size-biased order from the associated exchangeable random partition $\Pi_{\infty}$ of $\mathbb{N}$ which they generate.
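The three orderings of sample frequencies compared in the abstract can be illustrated with a minimal sketch. It assumes the standard stick-breaking representation of $\mathsf{GEM}(\alpha,\theta)$, namely $P_i = V_i\prod_{j<i}(1-V_j)$ with independent $V_i\sim\operatorname{Beta}(1-\alpha,\theta+i\alpha)$, which is not spelled out in the abstract itself. The function names are hypothetical and chosen only for illustration. The code shows how the orderings are formed from one sample; it does not check the identity in distribution.

```python
import random
from collections import Counter


def sample_gem_values(n, alpha, theta, rng):
    """Draw n i.i.d. values from one realisation of GEM(alpha, theta) frequencies.

    Assumption: sticks are built as P_i = V_i * prod_{j<i} (1 - V_j),
    with V_i ~ Beta(1 - alpha, theta + i * alpha), i = 1, 2, ...
    """
    if not (0 <= alpha < 1 and theta > -alpha):
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    sticks = []       # P_1, P_2, ... generated lazily as needed
    remaining = 1.0   # mass not yet assigned to any stick
    values = []
    for _ in range(n):
        u = rng.random()
        cumulative = 0.0
        i = 0
        while True:
            if i == len(sticks):
                v = rng.betavariate(1 - alpha, theta + (i + 1) * alpha)
                sticks.append(remaining * v)
                remaining *= 1 - v
            cumulative += sticks[i]
            # The second condition guards against floating-point underflow.
            if u < cumulative or remaining == 0.0:
                values.append(i + 1)  # sample values are positive integers
                break
            i += 1
    return values


def frequencies_by_value(values):
    """Counts listed in increasing order of the sample values."""
    counts = Counter(values)
    return [counts[v] for v in sorted(counts)]


def frequencies_by_appearance(values):
    """Counts listed in order of first appearance in the sample."""
    # Counter keeps keys in first-insertion order (Python 3.7+).
    return list(Counter(values).values())


def frequencies_ranked(values):
    """Counts listed in decreasing (ranked) order."""
    return sorted(Counter(values).values(), reverse=True)


if __name__ == "__main__":
    rng = random.Random(2018)
    sample = sample_gem_values(20, alpha=0.3, theta=1.0, rng=rng)
    print("sample:            ", sample)
    print("ordered by value:  ", frequencies_by_value(sample))
    print("order of appearance:", frequencies_by_appearance(sample))
    print("ranked:            ", frequencies_ranked(sample))
```

All three lists are rearrangements of the same multiset of counts; the result described in the abstract concerns the distribution of the value-ordered list relative to the other two.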

Article information

Ann. Appl. Probab., Volume 28, Number 3 (2018), 1793-1820.

Received: April 2017
Revised: August 2017
First available in Project Euclid: 1 June 2018

Primary: 60C05: Combinatorial probability
Secondary: 60G09: Exchangeability

Species sampling; random exchangeable partition; size-biased order; GEM distribution; Gibbs’ partitions; Chinese restaurant construction


Pitman, Jim; Yakubovich, Yuri. Ordered and size-biased frequencies in GEM and Gibbs’ models for species sampling. Ann. Appl. Probab. 28 (2018), no. 3, 1793–1820. doi:10.1214/17-AAP1343.

