Statistical Science

Modeling with Normalized Random Measure Mixture Models

Ernesto Barrios, Antonio Lijoi, Luis E. Nieto-Barajas, and Igor Prünster

Abstract

The Dirichlet process mixture model and more general mixtures based on discrete random probability measures have been shown to be flexible and accurate models for density estimation and clustering. The goal of this paper is to illustrate the use of normalized random measures as mixing measures in nonparametric hierarchical mixture models and to point out how possible computational issues can be successfully addressed. To this end, we first provide a concise and accessible introduction to normalized random measures with independent increments. Then, we explain in detail a particular way of sampling from the posterior using the Ferguson–Klass representation. We develop a thorough comparative analysis for location-scale mixtures that considers a set of alternatives for the mixture kernel and for the nonparametric component. Simulation results indicate that normalized random measure mixtures potentially represent a valid default choice for density estimation problems. As a byproduct of this study, an R package to fit these models was produced and is available on the Comprehensive R Archive Network (CRAN).
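
The abstract points to posterior sampling based on the Ferguson–Klass representation. As a rough prior-simulation illustration of that device (a sketch, not the authors' posterior sampler), the R code below generates the decreasing jumps of a σ-stable completely random measure from the Ferguson–Klass series J_i = N^{-1}(ξ_i), where ξ_1 < ξ_2 < ... are the arrival times of a unit-rate Poisson process and N is the tail of the Lévy intensity, and then normalizes the jumps to obtain a truncated draw from a normalized stable process. The Gaussian kernel, its bandwidth, and the standard normal base measure are illustrative assumptions, not choices taken from the paper.

    ## Ferguson-Klass sketch for a sigma-stable CRM: the Levy intensity is
    ## rho(s) = sigma * s^(-1 - sigma) / gamma(1 - sigma), so the tail mass
    ## N(v) = v^(-sigma) / gamma(1 - sigma) inverts in closed form.  For the
    ## gamma or generalized gamma CRM, N has no closed-form inverse and the
    ## equation N(J_i) = xi_i is solved numerically instead.
    set.seed(1)
    sigma <- 0.4                                   # stability index in (0, 1)
    M     <- 100                                   # truncation level: number of jumps kept

    xi    <- cumsum(rexp(M))                       # Poisson arrival times xi_1 < xi_2 < ...
    jumps <- (gamma(1 - sigma) * xi)^(-1 / sigma)  # J_i = N^{-1}(xi_i), automatically decreasing

    weights <- jumps / sum(jumps)                  # normalization yields the NRMI weights
    atoms   <- rnorm(M)                            # i.i.d. atom locations from a N(0, 1) base measure

    ## Density of the corresponding (truncated) location mixture of Gaussians,
    ## with an illustrative kernel standard deviation of 0.5, on a grid.
    grid <- seq(-4, 4, length.out = 200)
    dens <- vapply(grid,
                   function(y) sum(weights * dnorm(y, mean = atoms, sd = 0.5)),
                   numeric(1))
    ## plot(grid, dens, type = "l")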

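The abstract also mentions an R package released on CRAN but does not name it on this page. Assuming it is the BNPdensity package that accompanies the paper, and that its location-scale fitting function MixNRMI2 keeps its documented interface, a minimal call with default settings might look as follows; the synthetic data and the reliance on defaults are purely illustrative.

    ## Hypothetical usage sketch: assumes the CRAN package referred to in the
    ## abstract is BNPdensity and that MixNRMI2 fits the location-scale NRMI
    ## mixture with its default prior settings and MCMC length.
    # install.packages("BNPdensity")
    library(BNPdensity)

    set.seed(2)
    x <- c(rnorm(100, mean = -1, sd = 0.5), rnorm(100, mean = 2, sd = 1))  # synthetic bimodal data

    fit <- MixNRMI2(x)   # location-scale NRMI mixture with default settings
    str(fit)             # inspect the returned posterior summaries
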
Article information

Source
Statist. Sci., Volume 28, Number 3 (2013), 313–334.

Dates
First available in Project Euclid: 28 August 2013

Permanent link to this document
https://projecteuclid.org/euclid.ss/1377696939

Digital Object Identifier
doi:10.1214/13-STS416

Mathematical Reviews number (MathSciNet)
MR3135535

Zentralblatt MATH identifier
1331.62120

Keywords
Bayesian nonparametrics; completely random measure; clustering; density estimation; Dirichlet process; increasing additive process; latent variables; mixture model; normalized generalized gamma process; normalized inverse Gaussian process; normalized random measure; normalized stable process

Citation

Barrios, Ernesto; Lijoi, Antonio; Nieto-Barajas, Luis E.; Prünster, Igor. Modeling with Normalized Random Measure Mixture Models. Statist. Sci. 28 (2013), no. 3, 313–334. doi:10.1214/13-STS416. https://projecteuclid.org/euclid.ss/1377696939


References

  • Argiento, R., Guglielmi, A. and Pievatolo, A. (2010). Bayesian density estimation and model selection using nonparametric hierarchical mixtures. Comput. Statist. Data Anal. 54 816–832.
  • Berry, D. A. and Christensen, R. (1979). Empirical Bayes estimation of a binomial parameter via mixtures of Dirichlet processes. Ann. Statist. 7 558–568.
  • Blackwell, D. (1973). Discreteness of Ferguson selections. Ann. Statist. 1 356–358.
  • Brix, A. (1999). Generalized gamma measures and shot-noise Cox processes. Adv. in Appl. Probab. 31 929–953.
  • Burden, R. L. and Faires, J. D. (1993). Numerical Analysis. PWS Publishing Company, Boston.
  • Bush, C. A. and MacEachern, S. N. (1996). A semiparametric Bayesian model for randomised block designs. Biometrika 83 275–285.
  • Daley, D. J. and Vere-Jones, D. (2008). An Introduction to the Theory of Point Processes. Vol. II, General Theory and Structure, 2nd ed. Springer, New York.
  • Damien, P., Wakefield, J. and Walker, S. (1999). Gibbs sampling for Bayesian non-conjugate and hierarchical models by using auxiliary variables. J. R. Stat. Soc. Ser. B Stat. Methodol. 61 331–344.
  • De Iorio, M., Müller, P., Rosner, G. L. and MacEachern, S. N. (2004). An ANOVA model for dependent random measures. J. Amer. Statist. Assoc. 99 205–215.
  • Escobar, M. D. and West, M. (1995). Bayesian density estimation and inference using mixtures. J. Amer. Statist. Assoc. 90 577–588.
  • Favaro, S., Lijoi, A. and Prünster, I. (2012). On the stick-breaking representation of normalized inverse Gaussian priors. Biometrika 99 663–674.
  • Favaro, S. and Teh, Y. W. (2013). MCMC for normalized random measure mixture models. Statist. Sci. 28 335–359.
  • Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Statist. 1 209–230.
  • Ferguson, T. S. (1974). Prior distributions on spaces of probability measures. Ann. Statist. 2 615–629.
  • Ferguson, T. S. (1983). Bayesian density estimation by mixtures of normal distributions. In Recent Advances in Statistics 287–302. Academic Press, New York.
  • Ferguson, T. S. and Klass, M. J. (1972). A representation of independent increment processes without Gaussian components. Ann. Math. Statist. 43 1634–1643.
  • Gelfand, A. E., Dey, D. K. and Chang, H. (1992). Model determination using predictive distributions with implementation via sampling-based methods. In Bayesian Statistics 4 147–167. Oxford Univ. Press, New York.
  • Ishwaran, H. and James, L. F. (2001). Gibbs sampling methods for stick-breaking priors. J. Amer. Statist. Assoc. 96 161–173.
  • James, L. F., Lijoi, A. and Prünster, I. (2006). Conjugacy as a distinctive feature of the Dirichlet process. Scand. J. Stat. 33 105–120.
  • James, L. F., Lijoi, A. and Prünster, I. (2009). Posterior analysis for normalized random measures with independent increments. Scand. J. Stat. 36 76–97.
  • Kingman, J. F. C. (1967). Completely random measures. Pacific J. Math. 21 59–78.
  • Kingman, J. F. C. (1975). Random discrete distributions. J. Roy. Statist. Soc. Ser. B 37 1–22.
  • Kingman, J. F. C. (1993). Poisson Processes. Oxford Studies in Probability 3. Oxford Univ. Press, New York.
  • Lijoi, A., Mena, R. H. and Prünster, I. (2005). Hierarchical mixture modeling with normalized inverse-Gaussian priors. J. Amer. Statist. Assoc. 100 1278–1291.
  • Lijoi, A., Mena, R. H. and Prünster, I. (2007). Controlling the reinforcement in Bayesian non-parametric mixture models. J. R. Stat. Soc. Ser. B Stat. Methodol. 69 715–740.
  • Lijoi, A. and Prünster, I. (2010). Models beyond the Dirichlet process. In Bayesian Nonparametrics (N. L. Hjort, C. C. Holmes, P. Müller and S. G. Walker, eds.) 80–136. Cambridge Univ. Press, Cambridge.
  • Lijoi, A., Nipoti, B. and Prünster, I. (2013). Bayesian inference with dependent normalized completely random measures. Bernoulli. To appear. DOI:10.3150/13-BEJ521.
  • Lo, A. Y. (1984). On a class of Bayesian nonparametric estimates. I. Density estimates. Ann. Statist. 12 351–357.
  • MacEachern, S. N. and Müller, P. (1998). Estimating mixtures of Dirichlet process models. J. Comput. Graph. Statist. 7 223–238.
  • MacEachern, S. N. and Müller, P. (2000). Efficient MCMC schemes for robust model extensions using encompassing Dirichlet process mixture models. In Robust Bayesian Analysis. Lecture Notes in Statist. 152 295–315. Springer, New York.
  • Marron, J. S. and Wand, M. P. (1992). Exact mean integrated squared error. Ann. Statist. 20 712–736.
  • Muliere, P. and Tardella, L. (1998). Approximating distributions of random functionals of Ferguson–Dirichlet priors. Canad. J. Statist. 26 283–297.
  • Müller, P. and Quintana, F. A. (2004). Nonparametric Bayesian data analysis. Statist. Sci. 19 95–110.
  • Müller, P. and Vidakovic, B. (1998). Bayesian inference with wavelets: Density estimation. J. Comput. Graph. Statist. 7 456–468.
  • Nieto-Barajas, L. E., Prünster, I. and Walker, S. G. (2004). Normalized random measures driven by increasing additive processes. Ann. Statist. 32 2343–2360.
  • Nieto-Barajas, L. E. and Prünster, I. (2009). A sensitivity analysis for Bayesian nonparametric density estimators. Statist. Sinica 19 685–705.
  • Orbanz, P. and Williamson, S. (2011). Unit-rate Poisson representations of completely random measures. Technical report.
  • Papaspiliopoulos, O. and Roberts, G. O. (2008). Retrospective Markov chain Monte Carlo methods for Dirichlet process hierarchical models. Biometrika 95 169–186.
  • Pitman, J. (2003). Poisson–Kingman partitions. In Statistics and Science: A Festschrift for Terry Speed (D. R. Goldstein, ed.). Institute of Mathematical Statistics Lecture Notes—Monograph Series 40 1–34. IMS, Beachwood, OH.
  • Regazzini, E., Lijoi, A. and Prünster, I. (2003). Distributional results for means of normalized random measures with independent increments. Ann. Statist. 31 560–585.
  • Richardson, S. and Green, P. J. (1997). On Bayesian analysis of mixtures with an unknown number of components. J. Roy. Statist. Soc. Ser. B 59 731–792.
  • Roeder, K. and Wasserman, L. (1997). Practical Bayesian density estimation using mixtures of normals. J. Amer. Statist. Assoc. 92 894–902.
  • Sato, K. (1999). Lévy Processes and Infinitely Divisible Distributions. Cambridge Univ. Press, Cambridge.
  • Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Statist. Sinica 4 639–650.
  • Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. Chapman & Hall, London.
  • Tierney, L. (1994). Markov chains for exploring posterior distributions. Ann. Statist. 22 1701–1762.
  • Walker, S. G. (2007). Sampling the Dirichlet mixture model with slices. Comm. Statist. Simulation Comput. 36 45–54.
  • Walker, S. G. and Damien, P. (2000). Representations of Lévy processes without Gaussian components. Biometrika 87 477–483.
  • Wolpert, R. L. and Ickstadt, K. (1998). Poisson/gamma random field models for spatial statistics. Biometrika 85 251–267.