- Bayesian Anal.
- Volume 6, Number 2 (2011), 329-351.
Hierarchical Bayesian nonparametric mixture models for clustering with variable relevance determination
We propose a hierarchical Bayesian nonparametric mixture model for clustering when some of the covariates are assumed to be of varying relevance to the clustering problem. This can be thought of as a variable selection problem for unsupervised learning. We demonstrate that, by defining a hierarchical, population-based nonparametric prior on the cluster locations scaled by the inverse covariance matrices of the likelihood, we arrive at a 'sparsity prior' representation that admits a conditionally conjugate prior. This allows us to perform full Gibbs sampling to obtain posterior distributions over parameters of interest, including an explicit measure of each covariate's relevance and a distribution over the number of potential clusters present in the data. It also allows for individual cluster-specific variable selection. We demonstrate improved inference on a number of canonical problems.
First available in Project Euclid: 13 June 2012
Primary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]
Secondary: 60G57: Random measures; 62B10: Information-theoretic topics [See also 94A17]; 62F15: Bayesian inference; 62G99: None of the above, but in this section; 62H99: None of the above, but in this section; 62P10: Applications to biology and medical sciences
Yau, Christopher; Holmes, Chris. Hierarchical Bayesian nonparametric mixture models for clustering with variable relevance determination. Bayesian Anal. 6 (2011), no. 2, 329--351. doi:10.1214/11-BA612. https://projecteuclid.org/euclid.ba/1339612049