Bayesian Analysis

Bayesian Clustering of Functional Data Using Local Features

Abstract

Exploratory methods are an important step in understanding data. When clustering functional data, most methods apply traditional clustering techniques to a vector of estimated basis coefficients, assuming that the underlying signal functions lie in the $L_{2}$ space. Bayesian methods use models implying that some observations are realizations of signal-plus-noise models with identical underlying signal functions. The method we propose differs in this respect: our model does not assume that any of the signal functions are truly identical, but allows them to share many of their local features, represented by coefficients in a multiresolution wavelet basis expansion. We cluster each wavelet coefficient of the signal functions using conditionally independent Dirichlet process priors, thus focusing on exact matching of local features. We demonstrate the method on two datasets from different fields to show its broad application potential.
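The idea of clustering each wavelet coefficient separately can be illustrated with a small prior simulation. The sketch below is not the authors' implementation: it assumes a standard normal base measure, a single concentration parameter `alpha` shared across coefficients, and no observation noise, and the function names are hypothetical. For each coefficient index it draws values for all curves from a Dirichlet process via the Blackwell-MacQueen Pólya urn, independently across indices, so two curves can match exactly on some coefficients without being identical overall.

```python
import random


def polya_urn_draws(n, alpha, base_draw, rng):
    """Draw n values from a Dirichlet process prior using the Polya urn scheme."""
    values = []
    for i in range(n):
        # With probability alpha / (alpha + i), draw a fresh value from the base measure;
        # otherwise copy one of the i previous values, chosen uniformly at random.
        if rng.random() < alpha / (alpha + i):
            values.append(base_draw(rng))
        else:
            values.append(rng.choice(values))
    return values


def simulate_coefficients(n_curves, n_coefs, alpha=1.0, seed=0):
    """Return an n_curves x n_coefs table of coefficients, one independent urn per column."""
    rng = random.Random(seed)
    base_draw = lambda r: r.gauss(0.0, 1.0)  # assumed base measure: N(0, 1)
    per_coef = [polya_urn_draws(n_curves, alpha, base_draw, rng) for _ in range(n_coefs)]
    # Transpose so that each row holds the coefficient vector of one curve.
    return [[per_coef[k][i] for k in range(n_coefs)] for i in range(n_curves)]


def shared_fraction(a, b):
    """Fraction of coefficient positions at which two curves match exactly."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


coefs = simulate_coefficients(n_curves=5, n_coefs=64)
print("fraction of coefficients shared by curves 0 and 1:", shared_fraction(coefs[0], coefs[1]))
print("curves 0 and 1 identical:", coefs[0] == coefs[1])
```

Under these assumptions, pairs of curves typically share a sizeable fraction of coefficients while their full coefficient vectors differ, which is the contrast with whole-curve clustering that the abstract draws.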

Article information

Source
Bayesian Anal. Volume 11, Number 1 (2016), 71-98.

Dates
First available in Project Euclid: 4 February 2015

Permanent link to this document
https://projecteuclid.org/euclid.ba/1423083640

Digital Object Identifier
doi:10.1214/14-BA925

Mathematical Reviews number (MathSciNet)
MR3447092

Zentralblatt MATH identifier
1359.62264

Citation

Suarez, Adam Justin; Ghosal, Subhashis. Bayesian Clustering of Functional Data Using Local Features. Bayesian Anal. 11 (2016), no. 1, 71–98. doi:10.1214/14-BA925. https://projecteuclid.org/euclid.ba/1423083640
