Bayesian Analysis

Modal clustering in a class of product partition models

David B. Dahl

Abstract

This paper defines a class of univariate product partition models for which a novel deterministic search algorithm is guaranteed to find the maximum a posteriori (MAP) clustering or the maximum likelihood (ML) clustering. While the number of possible clusterings of $n$ items grows exponentially according to the Bell number, the proposed mode-finding algorithm exploits properties of the model to provide a search requiring only $n(n+1)$ computations. No Monte Carlo is involved. Thus, the algorithm finds the MAP or ML clustering for potentially tens of thousands of items, whereas a stochastic search can only approximate it. Integrating over the model parameters in a Dirichlet process mixture (DPM) model leads to a product partition model. A simulation study explores the quality of the clustering estimates despite departures from the assumptions. Finally, applications to three specific models (clustering means, probabilities, and variances) are used to illustrate the variety of applicable models and the mode-finding algorithm.
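To give a sense of the gap the abstract describes, the following minimal sketch compares the Bell number $B_n$ (the number of possible clusterings of $n$ items) with the $n(n+1)$ computations quoted for the search. It is an illustration only and does not implement the paper's algorithm; the helper name `bell_numbers` and the use of the Bell triangle recurrence are choices made here, not taken from the paper.

```python
def bell_numbers(n_max):
    """Return [B_0, B_1, ..., B_{n_max}] computed with the Bell triangle.

    Hypothetical helper for illustration; not part of the paper.
    """
    bells = [1]  # B_0 = 1
    row = [1]    # first row of the Bell triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        # Each further entry adds the entry to its left and the entry above-left.
        for value in row:
            new_row.append(new_row[-1] + value)
        # The first entry of the new row is the next Bell number.
        bells.append(new_row[0])
        row = new_row
    return bells


if __name__ == "__main__":
    bells = bell_numbers(20)
    for n in (5, 10, 20):
        # Number of clusterings versus the n(n+1) computations quoted in the abstract.
        print(f"n={n}: clusterings={bells[n]}, n(n+1)={n * (n + 1)}")
```

Even for modest $n$, exhaustive enumeration is out of reach while $n(n+1)$ stays small, which is why a guaranteed search at that cost is notable.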

Article information

Source
Bayesian Anal. Volume 4, Number 2 (2009), 243-264.

Dates
First available in Project Euclid: 22 June 2012

https://projecteuclid.org/euclid.ba/1340370277

Digital Object Identifier
doi:10.1214/09-BA409

Mathematical Reviews number (MathSciNet)
MR2507363

Zentralblatt MATH identifier
1330.62248

Citation

Dahl, David B. Modal clustering in a class of product partition models. Bayesian Anal. 4 (2009), no. 2, 243--264. doi:10.1214/09-BA409. https://projecteuclid.org/euclid.ba/1340370277
