Modal clustering in a class of product partition models

David B. Dahl

This paper defines a class of univariate product partition models for which a novel deterministic search algorithm is guaranteed to find the maximum a posteriori (MAP) clustering or the maximum likelihood (ML) clustering. While the number of possible clusterings of $n$ items is given by the Bell number, which grows faster than exponentially in $n$, the proposed mode-finding algorithm exploits properties of the model to provide a search requiring only $n(n+1)/2$ computations. No Monte Carlo is involved. Thus, the algorithm finds the exact MAP or ML clustering for potentially tens of thousands of items, whereas a stochastic search can only approximate it. Integrating over the model parameters in a Dirichlet process mixture (DPM) model leads to a product partition model. A simulation study explores the quality of the clustering estimates when the model assumptions are violated. Finally, applications to three specific models (clustering means, probabilities, and variances) illustrate the variety of applicable models and the mode-finding algorithm.
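To give a sense of the gap between exhaustive enumeration and a quadratic search, the following minimal sketch computes Bell numbers with the Bell triangle and prints them next to $n^2$. It is an illustration only and not the paper's algorithm: the function name `bell_numbers` is hypothetical, and $n^2$ is used merely as a rough stand-in for a computation count of quadratic order.

```python
def bell_numbers(n_max):
    """Return [B_0, ..., B_{n_max}], where B_n counts the clusterings
    (set partitions) of n items, using the Bell triangle recurrence."""
    bells = [1]          # B_0 = 1: the empty set has one partition
    row = [1]            # first row of the Bell triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row;
        # every later entry adds the entry above-left to its left neighbour.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])  # the leading entry of each row is the next Bell number
    return bells


if __name__ == "__main__":
    bells = bell_numbers(50)
    for n in (5, 10, 20, 50):
        # n * n is only an order-of-magnitude proxy for a quadratic search
        print(f"n={n:>2}  quadratic order n^2={n * n:>5}  Bell number={bells[n]}")
```

Already at $n = 10$ there are 115975 possible clusterings, so enumerating them quickly becomes infeasible, while a quadratic count stays small for thousands of items.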

Bayesian Anal. Volume 4, Number 2 (2009), 243-264.

First available in Project Euclid: 22 June 2012

Bayesian nonparametrics; Dirichlet process mixture model; model-based clustering; maximum a posteriori clustering; maximum likelihood clustering; product partition models


Dahl, David B. Modal clustering in a class of product partition models. Bayesian Anal. 4 (2009), no. 2, 243-264. doi:10.1214/09-BA409.

