The Annals of Applied Probability

Conditional formulae for Gibbs-type exchangeable random partitions

Stefano Favaro, Antonio Lijoi, and Igor Prünster


Gibbs-type random probability measures and the exchangeable random partitions they induce represent an important framework from both a theoretical and an applied point of view. In the present paper, motivated by species sampling problems, we investigate some properties of the conditional distribution of the number of blocks with a certain frequency generated by Gibbs-type random partitions. The general results are then specialized to three noteworthy examples, yielding completely explicit expressions for their distributions, moments and asymptotic behaviors. Such expressions can be interpreted as Bayesian nonparametric estimators of the rare species variety, and their performance is tested on some real genomic data.
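
As a rough illustration of the quantity studied, the sketch below simulates the number of blocks with a given frequency after an observed sample is enlarged. It assumes the two-parameter Poisson–Dirichlet (Pitman–Yor) sequential sampling scheme, a well-known Gibbs-type model chosen here only for illustration; the abstract does not name its three examples. The function names, the parameter values, and the Monte Carlo approach are assumptions of this sketch and stand in for, rather than reproduce, the paper's explicit formulae.

```python
import random
from collections import Counter


def extend_partition(sizes, m, sigma, theta, rng):
    """Add m items to a partition with the given block sizes.

    Assumed scheme (two-parameter Chinese restaurant process): with i items
    in k blocks, a new block opens with probability
    (theta + k*sigma) / (theta + i), and an existing block of size s is
    chosen with probability (s - sigma) / (theta + i).
    """
    if not (0 <= sigma < 1 and theta > -sigma):
        raise ValueError("need 0 <= sigma < 1 and theta > -sigma")
    sizes = list(sizes)  # do not mutate the caller's observed sample
    n = sum(sizes)
    for i in range(n, n + m):
        k = len(sizes)
        u = rng.random() * (theta + i)
        if k == 0 or u < theta + k * sigma:
            sizes.append(1)  # the new item starts a new block
            continue
        u -= theta + k * sigma
        for j, s in enumerate(sizes):
            u -= s - sigma
            if u < 0:
                sizes[j] += 1
                break
        else:
            sizes[-1] += 1  # guard against floating-point round-off
    return sizes


def frequency_counts(sizes):
    """Map each frequency l to the number of blocks containing exactly l items."""
    return Counter(sizes)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical parameters and sample sizes, for illustration only.
    observed = extend_partition([], 100, sigma=0.5, theta=1.0, rng=rng)
    # Conditional on the observed sample, simulate 50 further items many times
    # and average the number of blocks of frequency 1 in the enlarged sample.
    draws = [
        frequency_counts(extend_partition(observed, 50, 0.5, 1.0, rng))[1]
        for _ in range(1000)
    ]
    print("Monte Carlo mean number of singleton blocks:", sum(draws) / len(draws))
```

Averaging over repeated extensions of the same observed sample gives a simulation-based stand-in for a conditional expectation of the kind the paper derives in closed form.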

Article information

Ann. Appl. Probab., Volume 23, Number 5 (2013), 1721-1754.

First available in Project Euclid: 28 August 2013

Primary: 60G57: Random measures; 62G05: Estimation; 62F15: Bayesian inference

Keywords: Bayesian nonparametrics; exchangeable random partitions; Gibbs-type random partitions; sampling formulae; small blocks; species sampling problems; $\sigma$-diversity


Favaro, Stefano; Lijoi, Antonio; Prünster, Igor. Conditional formulae for Gibbs-type exchangeable random partitions. Ann. Appl. Probab. 23 (2013), no. 5, 1721–1754. doi:10.1214/12-AAP843.

