Open Access
December 2021
The Semi-Hierarchical Dirichlet Process and Its Application to Clustering Homogeneous Distributions
Mario Beraha, Alessandra Guglielmi, Fernando A. Quintana
Bayesian Anal. 16(4): 1187-1219 (December 2021). DOI: 10.1214/21-BA1278

Abstract

Assessing homogeneity of distributions is an old problem that has received considerable attention, especially in the nonparametric Bayesian literature. To this effect, we propose the semi-hierarchical Dirichlet process, a novel hierarchical prior that extends the hierarchical Dirichlet process of Teh et al. (2006) and that avoids the degeneracy issues of nested processes recently described by Camerlenghi et al. (2019a). We go beyond the simple yes/no answer to the homogeneity question and embed the proposed prior in a random partition model; this procedure allows us to give a more comprehensive response to the above question and in fact find groups of populations that are internally homogeneous when I ≥ 2 such populations are considered. We study theoretical properties of the semi-hierarchical Dirichlet process and of the Bayes factor for the homogeneity test when I = 2. Extensive simulation studies and applications to educational data are also discussed.

Funding Statement

Fernando A. Quintana was supported by Fondecyt Grant 1180034. This work was supported by ANID – Millennium Science Initiative Program – NCN17_059.

Acknowledgments

We are thankful to the Editor, Associate Editor and two anonymous referees for their constructive comments that helped us to significantly improve and clarify this manuscript.

Citation

Mario Beraha, Alessandra Guglielmi, Fernando A. Quintana. "The Semi-Hierarchical Dirichlet Process and Its Application to Clustering Homogeneous Distributions." Bayesian Anal. 16(4): 1187-1219, December 2021. https://doi.org/10.1214/21-BA1278

Information

Published: December 2021
First available in Project Euclid: 28 July 2021

MathSciNet: MR4381132
Digital Object Identifier: 10.1214/21-BA1278

Subjects:
Primary: 60G57, 62F15, 62G05
Secondary: 62H30

Keywords: Bayes factors, Bayesian nonparametrics, homogeneity test, partial exchangeability, posterior consistency
