We introduce a new clustering method that aims to fit clusters with different scatters and weights. To guarantee robustness, the method is designed to handle a proportion α of contaminating data. As a characteristic feature, restrictions are imposed on the ratio between the maximum and the minimum eigenvalues of the groups' scatter matrices. These restrictions make the problem well defined and guarantee the consistency of the sample solutions to the population ones.
The method covers a wide range of clustering approaches depending on the strength of the chosen restrictions. Our proposal includes an algorithm for approximately solving the sample problem.
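To make the eigenvalue restriction concrete, the following minimal sketch computes the ratio between the largest and smallest eigenvalues of a set of group scatter matrices and compares it with a bound. The bound `c`, the function names, the restriction to symmetric 2x2 matrices, and the choice to take the maximum and minimum across all groups jointly are assumptions made for illustration; they are not taken from the paper.

```python
import math

def sym2x2_eigenvalues(s):
    """Closed-form eigenvalues of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = s
    mid = (a + d) / 2.0
    rad = math.hypot((a - d) / 2.0, b)
    return mid - rad, mid + rad

def eigenvalue_ratio(scatters):
    """Largest eigenvalue divided by smallest, pooled over all groups (assumed reading)."""
    eigs = [lam for s in scatters for lam in sym2x2_eigenvalues(s)]
    return max(eigs) / min(eigs)

def satisfies_restriction(scatters, c):
    """True when the pooled eigenvalue ratio does not exceed the hypothetical bound c."""
    return eigenvalue_ratio(scatters) <= c

# Two illustrative group scatter matrices: one spherical, one elongated.
scatters = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[4.0, 0.0], [0.0, 0.5]],
]
ratio = eigenvalue_ratio(scatters)
print(ratio, satisfies_restriction(scatters, c=10.0))
```

In this sketch, a smaller `c` forces the groups toward similar, nearly spherical scatters, while a larger `c` allows more heterogeneous shapes, which is one way to read the "strength of the chosen restrictions".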