Bayesian Analysis

Cluster Analysis, Model Selection, and Prior Distributions on Models

George Casella, Elías Moreno, and F. Javier Girón

Full-text: Open access

Abstract

Clustering is an important and challenging statistical problem for which there is an extensive literature. Modeling approaches include mixture models and product partition models. Here we develop a product partition model and a Bayesian model selection procedure based on Bayes factors from intrinsic priors. We also find that the choice of the prior on model space is of utmost importance, almost overshadowing the other parts of the clustering problem, and we examine the behavior of the model posterior probabilities based on different model space priors. We find, somewhat surprisingly, that procedures based on the often-used uniform prior (in which all models are given the same prior probability) lead to inconsistent model selection procedures. We examine other priors, and find that the Ewens-Pitman prior and a new prior, the hierarchical uniform prior, lead to consistent model selection procedures and have other desirable properties. Lastly, we compare the procedures on a range of examples.

Article information

Source
Bayesian Anal., Volume 9, Number 3 (2014), 613-658.

Dates
First available in Project Euclid: 5 September 2014

Permanent link to this document
https://projecteuclid.org/euclid.ba/1409921108

Digital Object Identifier
doi:10.1214/14-BA869

Mathematical Reviews number (MathSciNet)
MR3256058

Zentralblatt MATH identifier
1327.62374

Keywords
Bayesian model selection; Consistency; Hierarchical models; Intrinsic priors; Product partition models; Stochastic search

Citation

Casella, George; Moreno, Elías; Girón, F. Javier. Cluster Analysis, Model Selection, and Prior Distributions on Models. Bayesian Anal. 9 (2014), no. 3, 613-658. doi:10.1214/14-BA869. https://projecteuclid.org/euclid.ba/1409921108

