## Annals of Statistics

- Ann. Statist.
- Volume 23, Number 1 (1995), 103-112.

### Principal Points and Self-Consistent Points of Elliptical Distributions

Thaddeus Tarpey, Luning Li, and Bernard D. Flury

#### Abstract

The $k$ principal points of a $p$-variate random vector $\mathbf{X}$ are those points $\xi_1, \ldots, \xi_k \in \mathbb{R}^p$ which approximate the distribution of $\mathbf{X}$ by minimizing the expected squared distance of $\mathbf{X}$ from the nearest of the $\xi_j$. Any set of $k$ points $\mathbf{y}_1, \ldots, \mathbf{y}_k$ partitions $\mathbb{R}^p$ into "domains of attraction" $D_1, \ldots, D_k$ according to minimal distance; following Hastie and Stuetzle we call $\mathbf{y}_1, \ldots, \mathbf{y}_k$ self-consistent if $E\lbrack\mathbf{X}\mid\mathbf{X} \in D_j\rbrack = \mathbf{y}_j$ for $j = 1, \ldots, k$. Principal points are a special case of self-consistent points. In this paper we study principal points and self-consistent points of $p$-variate elliptical distributions. The main results are the following: (1) If $k$ self-consistent points of $\mathbf{X}$ span a subspace of dimension $q < p$, then this subspace is also spanned by $q$ principal components, that is, self-consistent points of elliptical distributions exist only in principal component subspaces. (2) The subspace spanned by $k$ principal points of $\mathbf{X}$ is identical with the subspace spanned by the principal components associated with the largest roots. This proves a conjecture of Flury. We also discuss implications of our results for the computation and estimation of principal points.
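As an illustration of the definitions above, the following minimal sketch runs Lloyd-style $k$-means updates (replace each point by the mean of its domain of attraction) on a simulated sample. It is not the authors' procedure. The bivariate normal with standard deviations 2 and 1 along the coordinate axes, the sample size, the starting points and every function name are assumptions made for this example. For $k = 2$, result (2) suggests the fitted points should fall near the first principal axis, close to $\pm 2\sqrt{2/\pi}$, the half-normal mean scaled by the larger standard deviation.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to `point` in squared Euclidean distance."""
    return min(range(len(centers)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centers[j])))


def lloyd_step(points, centers):
    """Replace each center by the sample mean of its domain of attraction."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for pt in points:
        j = nearest(pt, centers)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += pt[d]
    new = []
    for j, old in enumerate(centers):
        if counts[j] == 0:
            new.append(old)  # empty domain: keep the old center (sketch convention)
        else:
            new.append(tuple(s / counts[j] for s in sums[j]))
    return new


def distortion(points, centers):
    """Mean squared distance from each sample point to its nearest center."""
    total = 0.0
    for pt in points:
        c = centers[nearest(pt, centers)]
        total += sum((a - b) ** 2 for a, b in zip(pt, c))
    return total / len(points)


# Assumed example distribution: independent normals, so the first
# principal axis is the x-axis (standard deviation 2 versus 1).
SD_MAJOR, SD_MINOR = 2.0, 1.0
rng = random.Random(0)
sample = [(rng.gauss(0.0, SD_MAJOR), rng.gauss(0.0, SD_MINOR))
          for _ in range(10000)]

# Arbitrary starting points, deliberately off both axes.
centers = [(1.0, 0.5), (-0.5, -1.0)]
for _ in range(40):
    updated = lloyd_step(sample, centers)
    if updated == centers:  # fixed point: the centers are self-consistent
        break
    centers = updated
centers.sort()  # left-hand point first

# A competing pair on the minor axis, self-consistent for the population
# by symmetry but with larger expected squared distance.
half_normal_mean = math.sqrt(2.0 / math.pi)
minor_axis = [(0.0, -SD_MINOR * half_normal_mean),
              (0.0, SD_MINOR * half_normal_mean)]

print("fitted points:", centers)
print("distortion, fitted points:", distortion(sample, centers))
print("distortion, minor-axis pair:", distortion(sample, minor_axis))
```

A fixed point of `lloyd_step` is self-consistent for the empirical distribution of the sample, which is only an estimate of the population property. Comparing the two distortions shows why self-consistency alone does not identify principal points: the minor-axis pair satisfies the conditional-mean condition for the population but approximates the distribution worse.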

#### Article information

**Dates**

First available in Project Euclid: 11 April 2007

**Permanent link to this document**

https://projecteuclid.org/euclid.aos/1176324457

**Digital Object Identifier**

doi:10.1214/aos/1176324457

**Mathematical Reviews number (MathSciNet)**

MR1331658

**Zentralblatt MATH identifier**

0822.62042

**Subjects**

Primary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]

Secondary: 62H05: Characterization and structure theory

Secondary: 62H25: Factor analysis and principal components; correspondence analysis

**Keywords**

$k$-means cluster analysis, normal distribution, principal components, uniform distribution

#### Citation

Tarpey, Thaddeus; Li, Luning; Flury, Bernard D. Principal Points and Self-Consistent Points of Elliptical Distributions. Ann. Statist. 23 (1995), no. 1, 103-112. doi:10.1214/aos/1176324457. https://projecteuclid.org/euclid.aos/1176324457