Bayesian Analysis

Latent Nested Nonparametric Priors

Federico Camerlenghi, David B. Dunson, Antonio Lijoi, Igor Prünster, and Abel Rodríguez

Advance publication

This article is in its final form and can be cited using the date of online publication and the DOI.

Full-text: Open access

Abstract

Discrete random structures are important tools in Bayesian nonparametrics and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among other tasks. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and then normalizing to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop a Markov chain Monte Carlo sampler for Bayesian inference. A test for distributional homogeneity across groups is obtained as a by-product. The results and their inferential implications are showcased on synthetic and real data.
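The "add, then normalize" step described in the abstract can be illustrated with a minimal sketch. It is not the authors' construction or sampler: it assumes a finite-dimensional approximation of a gamma completely random measure, a standard normal base measure, and independent group-specific components, and it omits the nesting layer that lets the group-specific components coincide. The names `gamma_crm` and `add_and_normalize` are hypothetical.

```python
import numpy as np

def gamma_crm(rng, total_mass, n_atoms=50):
    """Finite approximation of a gamma CRM (an assumption of this sketch):
    n_atoms jumps with Gamma(total_mass / n_atoms, 1) weights and
    atom locations drawn from a standard normal base measure."""
    weights = rng.gamma(shape=total_mass / n_atoms, scale=1.0, size=n_atoms)
    atoms = rng.normal(size=n_atoms)
    return atoms, weights

def add_and_normalize(group, shared):
    """Add a group-specific and a shared discrete measure, then divide
    by the total mass to obtain a random probability measure."""
    atoms = np.concatenate([group[0], shared[0]])
    weights = np.concatenate([group[1], shared[1]])
    return atoms, weights / weights.sum()

rng = np.random.default_rng(0)
shared = gamma_crm(rng, total_mass=1.0)   # component common to both groups
p1 = add_and_normalize(gamma_crm(rng, total_mass=1.0), shared)
p2 = add_and_normalize(gamma_crm(rng, total_mass=1.0), shared)
```

Here `p1` and `p2` are dependent because they share the atoms of `shared`, while their group-specific atoms differ. In this simplified version the two measures never coincide exactly; the nesting layer omitted here is what allows that.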

Article information

Source
Bayesian Anal., Advance publication (2018), 23 pages.

Dates
First available in Project Euclid: 27 June 2019

Permanent link to this document
https://projecteuclid.org/euclid.ba/1561601089

Digital Object Identifier
doi:10.1214/19-BA1169

Subjects
Primary: 60G57: Random measures; 62G05: Estimation; 62F15: Bayesian inference

Keywords
Bayesian nonparametrics; completely random measures; dependent nonparametric priors; heterogeneity; mixture models; nested processes

Rights
Creative Commons Attribution 4.0 International License.

Citation

Camerlenghi, Federico; Dunson, David B.; Lijoi, Antonio; Prünster, Igor; Rodríguez, Abel. Latent Nested Nonparametric Priors. Bayesian Anal., advance publication, 27 June 2019. doi:10.1214/19-BA1169. https://projecteuclid.org/euclid.ba/1561601089



