Annals of Statistics

Principal Points and Self-Consistent Points of Elliptical Distributions

Thaddeus Tarpey, Luning Li, and Bernard D. Flury

Full-text: Open access

Abstract

The $k$ principal points of a $p$-variate random vector $\mathbf{X}$ are those points $\xi_1, \ldots, \xi_k \in \mathbb{R}^p$ which approximate the distribution of $\mathbf{X}$ by minimizing the expected squared distance of $\mathbf{X}$ from the nearest of the $\xi_j$. Any set of $k$ points $\mathbf{y}_1, \ldots, \mathbf{y}_k$ partitions $\mathbb{R}^p$ into "domains of attraction" $D_1, \ldots, D_k$ according to minimal distance; following Hastie and Stuetzle we call $\mathbf{y}_1, \ldots, \mathbf{y}_k$ self-consistent if $E\lbrack\mathbf{X}\mid\mathbf{X} \in D_j\rbrack = \mathbf{y}_j$ for $j = 1, \ldots, k$. Principal points are a special case of self-consistent points. In this paper we study principal points and self-consistent points of $p$-variate elliptical distributions. The main results are the following: (1) If $k$ self-consistent points of $\mathbf{X}$ span a subspace of dimension $q < p$, then this subspace is also spanned by $q$ principal components, that is, self-consistent points of elliptical distributions exist only in principal component subspaces. (2) The subspace spanned by $k$ principal points of $\mathbf{X}$ is identical with the subspace spanned by the principal components associated with the largest roots. This proves a conjecture of Flury. We also discuss implications of our results for the computation and estimation of principal points.
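The definitions above can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it runs Lloyd's $k$-means iteration on a simulated sample, which alternates between assigning points to domains of attraction and replacing each point $\mathbf{y}_j$ by the sample mean of $D_j$, so that a converged solution is self-consistent for the empirical distribution. The bivariate normal with covariance $\mathrm{diag}(4, 1)$, the sample size, the seed, the starting points and the function name `lloyd` are all assumptions chosen for illustration. For $k = 2$ and this distribution, result (2) together with the univariate normal case suggests principal points near $(\pm 2\sqrt{2/\pi}, 0)$, that is, on the first principal component axis.

```python
import math
import random


def lloyd(points, centers, max_iter=100):
    """Lloyd's k-means iteration (hypothetical helper, for illustration only)."""
    centers = [tuple(c) for c in centers]
    dim = len(centers[0])
    for _ in range(max_iter):
        sums = [[0.0] * dim for _ in centers]
        counts = [0] * len(centers)
        # Assign each point to its domain of attraction (nearest center).
        for x in points:
            j = min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))
            counts[j] += 1
            for d, value in enumerate(x):
                sums[j][d] += value
        # Replace each center by the mean of its domain; keep it if the domain is empty.
        new_centers = [
            tuple(v / n for v in s) if n else centers[j]
            for j, (s, n) in enumerate(zip(sums, counts))
        ]
        if new_centers == centers:
            # No change: every center equals the mean of its own domain.
            break
        centers = new_centers
    return centers


# Assumed example: bivariate normal with standard deviations 2 and 1 on the axes.
rng = random.Random(0)
sample = [(rng.gauss(0.0, 2.0), rng.gauss(0.0, 1.0)) for _ in range(10000)]

# Deliberately off-axis starting points, so any alignment comes from the iteration.
estimates = sorted(lloyd(sample, [(1.0, 1.0), (-1.0, -0.5)]))

# Theoretical first coordinate of the two principal points for this distribution.
target = 2.0 * math.sqrt(2.0 / math.pi)

print("estimated points:", estimates)
print("theoretical first coordinate: +/-", target)
```

The estimates are sample-based and so only approximate the population values. Starting the iteration off the first principal axis is a design choice: it shows the solution rotating into the principal component subspace rather than being placed there by construction.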

Article information

Source
Ann. Statist., Volume 23, Number 1 (1995), 103-112.

Dates
First available in Project Euclid: 11 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.aos/1176324457

Digital Object Identifier
doi:10.1214/aos/1176324457

Mathematical Reviews number (MathSciNet)
MR1331658

Zentralblatt MATH identifier
0822.62042


Subjects
Primary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]
Secondary: 62H05: Characterization and structure theory; 62H25: Factor analysis and principal components; correspondence analysis

Keywords
$k$-means cluster analysis; normal distribution; principal components; uniform distribution

Citation

Tarpey, Thaddeus; Li, Luning; Flury, Bernard D. Principal Points and Self-Consistent Points of Elliptical Distributions. Ann. Statist. 23 (1995), no. 1, 103--112. doi:10.1214/aos/1176324457. https://projecteuclid.org/euclid.aos/1176324457

