The Annals of Statistics

The topography of multivariate normal mixtures

Surajit Ray and Bruce G. Lindsay

Abstract

Multivariate normal mixtures provide a flexible method of fitting high-dimensional data. It is shown that their topography, in the sense of their key features as a density, can be analyzed rigorously in lower dimensions by use of a ridgeline manifold that contains all critical points, as well as the ridges of the density. A plot of the elevations on the ridgeline shows the key features of the mixed density. In addition, by use of the ridgeline, we uncover a function that determines the number of modes of the mixed density when there are two components being mixed. A follow-up analysis then gives a curvature function that can be used to prove a set of modality theorems.
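As an illustration of the ridgeline idea for two components, the following is a minimal Python sketch, not the authors' code. It assumes the two-component ridgeline takes the form x*(alpha) = [(1 - alpha) S1^{-1} + alpha S2^{-1}]^{-1} [(1 - alpha) S1^{-1} m1 + alpha S2^{-1} m2] for 0 <= alpha <= 1, where m1, m2 are the component means and S1, S2 the covariance matrices; this formula is not stated in the abstract and should be checked against the full paper. The function names (ridgeline, elevation, count_local_maxima) and the grid-based counting of local maxima are hypothetical choices made only for this example.

```python
import numpy as np

def ridgeline(alpha, mu1, cov1, mu2, cov2):
    """Point x*(alpha) on the assumed two-component ridgeline, 0 <= alpha <= 1."""
    p1 = np.linalg.inv(cov1)  # precision matrix of component 1
    p2 = np.linalg.inv(cov2)  # precision matrix of component 2
    a = (1.0 - alpha) * p1 + alpha * p2
    b = (1.0 - alpha) * (p1 @ mu1) + alpha * (p2 @ mu2)
    return np.linalg.solve(a, b)

def normal_pdf(x, mu, cov):
    """Multivariate normal density at a single point x."""
    d = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.solve(cov, diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

def elevation(alphas, weight1, mu1, cov1, mu2, cov2):
    """Mixture density evaluated along the ridgeline (the elevation plot)."""
    values = []
    for alpha in alphas:
        x = ridgeline(alpha, mu1, cov1, mu2, cov2)
        values.append(weight1 * normal_pdf(x, mu1, cov1)
                      + (1.0 - weight1) * normal_pdf(x, mu2, cov2))
    return np.array(values)

def count_local_maxima(h):
    """Count strict local maxima of a sampled curve, grid endpoints included."""
    padded = np.concatenate(([-np.inf], h, [-np.inf]))
    mid = padded[1:-1]
    return int(np.sum((mid > padded[:-2]) & (mid > padded[2:])))

# Example: two bivariate components with identity covariance and equal weights.
alphas = np.linspace(0.0, 1.0, 2001)
eye = np.eye(2)
h_far = elevation(alphas, 0.5, np.array([0.0, 0.0]), eye, np.array([4.0, 0.0]), eye)
h_near = elevation(alphas, 0.5, np.array([0.0, 0.0]), eye, np.array([1.0, 0.0]), eye)
print(count_local_maxima(h_far), count_local_maxima(h_near))
```

With equal identity covariances the ridgeline reduces to the line segment joining the two means, so the problem becomes one-dimensional regardless of the data dimension. Counting maxima on a finite grid is only a numerical proxy for the number of modes; a mode lying closer to an endpoint than the grid spacing is picked up here only because the endpoints are included in the count.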

Article information

Source
Ann. Statist. Volume 33, Number 5 (2005), 2042-2065.

Dates
First available: 25 November 2005

Permanent link to this document
http://projecteuclid.org/euclid.aos/1132936556

Digital Object Identifier
doi:10.1214/009053605000000417

Mathematical Reviews number (MathSciNet)
MR2211079

Zentralblatt MATH identifier
1086.62066

Subjects
Primary: 62E10: Characterization and structure theory; 62H05: Characterization and structure theory
Secondary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]

Keywords
Mixture; modal cluster; multivariate mode; clustering; dimension reduction; topography; manifold

Citation

Ray, Surajit; Lindsay, Bruce G. The topography of multivariate normal mixtures. The Annals of Statistics 33 (2005), no. 5, 2042--2065. doi:10.1214/009053605000000417. http://projecteuclid.org/euclid.aos/1132936556.
