Electronic Journal of Statistics

Posterior sampling from $\varepsilon$-approximation of normalized completely random measure mixtures

Raffaele Argiento, Ilaria Bianchini, and Alessandra Guglielmi

Full-text: Open access

Abstract

This paper adopts a Bayesian nonparametric mixture model in which the mixing distribution belongs to the wide class of normalized homogeneous completely random measures. We propose a truncation method for the mixing distribution that discards the weights of the unnormalized measure smaller than a threshold $\varepsilon$. We prove convergence in law of our approximation, provide some of its theoretical properties, and characterize its posterior distribution, which allows us to devise a blocked Gibbs sampler.

The versatility of the approximation is illustrated by two different applications. In the first, we introduce the normalized Bessel random measure, which encompasses the Dirichlet process; goodness-of-fit indices show that it performs well as a mixing measure for density estimation. The second describes how to incorporate covariates in the support of the normalized measure, leading to a linear dependent model for regression and clustering.
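
The truncation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions made here for concreteness: the unnormalized measure is a gamma completely random measure with Lévy intensity $\kappa s^{-1} e^{-s}$ (whose normalization is a Dirichlet process), the number of retained jumps is drawn as a Poisson count plus one so that at least one atom always exists, and all function names (`exp1`, `sample_poisson`, `sample_jump`, `sample_eps_approximation`) are hypothetical. Jumps larger than $\varepsilon$ are drawn, normalized to sum to one, and paired with atoms from a base distribution.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def exp1(x, terms=60):
    """Exponential integral E1(x) via its power series (suited to small x > 0)."""
    total = -EULER_GAMMA - math.log(x)
    term = 1.0
    for k in range(1, terms + 1):
        term *= -x / k           # term now equals (-x)^k / k!
        total -= term / k
    return total


def sample_poisson(rate, rng):
    """Poisson draw: count unit-rate exponential arrivals falling in [0, rate]."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed <= rate:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def sample_jump(eps, rng):
    """Rejection sampler for the density proportional to exp(-s)/s on (eps, inf).

    Proposal: eps + Exp(1); acceptance probability: eps / s.
    The acceptance rate drops as eps shrinks, so this is only a toy sampler.
    """
    while True:
        s = eps + rng.expovariate(1.0)
        if rng.random() < eps / s:
            return s


def sample_eps_approximation(eps, kappa, base_sampler, rng):
    """Draw weights and atoms of a truncated, normalized gamma measure.

    Assumption of this sketch: the number of jumps is Poisson(kappa * E1(eps)) + 1.
    """
    n_jumps = sample_poisson(kappa * exp1(eps), rng) + 1
    jumps = [sample_jump(eps, rng) for _ in range(n_jumps)]
    total = sum(jumps)
    weights = [j / total for j in jumps]          # normalize the retained jumps
    atoms = [base_sampler(rng) for _ in range(n_jumps)]
    return weights, atoms


rng = random.Random(0)
weights, atoms = sample_eps_approximation(
    eps=0.01, kappa=1.0, base_sampler=lambda r: r.gauss(0.0, 1.0), rng=rng
)
```

Smaller values of `eps` retain more jumps, so the finite mixing measure should get closer to the infinite-dimensional one at the cost of more components to update in a sampler.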

Article information

Source
Electron. J. Statist., Volume 10, Number 2 (2016), 3516-3547.

Dates
Received: September 2015
First available in Project Euclid: 16 November 2016

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1479287230

Digital Object Identifier
doi:10.1214/16-EJS1168

Mathematical Reviews number (MathSciNet)
MR3572858

Zentralblatt MATH identifier
1358.62034

Keywords
Bayesian nonparametric mixture models; normalized completely random measures; blocked Gibbs sampler; finite dimensional approximation

Citation

Argiento, Raffaele; Bianchini, Ilaria; Guglielmi, Alessandra. Posterior sampling from $\varepsilon$-approximation of normalized completely random measure mixtures. Electron. J. Statist. 10 (2016), no. 2, 3516–3547. doi:10.1214/16-EJS1168. https://projecteuclid.org/euclid.ejs/1479287230


