The Annals of Mathematical Statistics

An Application of Information Theory to Multivariate Analysis

S. Kullback

Abstract

The problem considered is that of finding the "best" linear function for discriminating between two multivariate normal populations, $\pi_1$ and $\pi_2$, without limitation to the case of equal covariance matrices. The "best" linear function is found by maximizing the divergence, $J'(1, 2)$, between the distributions of the linear function. Comparison with the divergence, $J(1, 2)$, between $\pi_1$ and $\pi_2$ offers a measure of the discriminating efficiency of the linear function, since $J(1, 2) \geq J'(1, 2)$. The divergence, a special case of which is Mahalanobis's Generalized Distance, is defined in terms of a measure of information which is essentially that of Shannon and Wiener. Appropriate assumptions about $\pi_1$ and $\pi_2$ lead to discriminant analysis (Sections 4, 7), principal components (Section 5), and canonical correlations (Section 6).
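The inequality $J(1, 2) \geq J'(1, 2)$ and the link to Mahalanobis's distance can be checked numerically. The following is a minimal sketch, not the paper's own procedure. It assumes the divergence is the symmetric form $J(1, 2) = \int (f_1 - f_2) \log (f_1 / f_2)\, dx$ and uses the standard closed-form expressions for that quantity between normal distributions. The function names, the example means and covariance matrices, and the random search over directions are all illustrative assumptions.

```python
import numpy as np


def divergence(mu1, cov1, mu2, cov2):
    """Symmetric divergence J(1, 2) between N(mu1, cov1) and N(mu2, cov2).

    Assumed closed form: 0.5 * tr[(S1 - S2)(S2^-1 - S1^-1)]
                       + 0.5 * d' (S1^-1 + S2^-1) d,  with d = mu1 - mu2.
    """
    delta = mu1 - mu2
    inv1 = np.linalg.inv(cov1)
    inv2 = np.linalg.inv(cov2)
    cov_term = 0.5 * np.trace((cov1 - cov2) @ (inv2 - inv1))
    mean_term = 0.5 * delta @ (inv1 + inv2) @ delta
    return float(cov_term + mean_term)


def linear_divergence(alpha, mu1, cov1, mu2, cov2):
    """Divergence J'(1, 2) between the distributions of y = alpha @ x.

    Under each population y is univariate normal with mean alpha @ mu_i
    and variance alpha @ cov_i @ alpha, so the same formula applies in 1-D.
    """
    var1 = alpha @ cov1 @ alpha
    var2 = alpha @ cov2 @ alpha
    shift = alpha @ (mu1 - mu2)
    cov_term = 0.5 * (var1 / var2 + var2 / var1) - 1.0
    mean_term = 0.5 * (1.0 / var1 + 1.0 / var2) * shift**2
    return float(cov_term + mean_term)


# Hypothetical example populations (not taken from the paper).
mu1 = np.array([0.0, 0.0])
mu2 = np.array([1.0, 2.0])
cov1 = np.array([[2.0, 0.5], [0.5, 1.0]])
cov2 = np.array([[1.0, -0.3], [-0.3, 3.0]])

# Equal covariance matrices: J reduces to the Mahalanobis distance, and the
# direction cov^-1 (mu1 - mu2) attains it, so no information is lost.
delta = mu1 - mu2
fisher = np.linalg.solve(cov1, delta)
mahalanobis = float(delta @ fisher)
j_equal = divergence(mu1, cov1, mu2, cov1)
j_equal_linear = linear_divergence(fisher, mu1, cov1, mu2, cov1)

# Unequal covariance matrices: every linear function satisfies J' <= J.
# A crude random search stands in for the maximization described above.
rng = np.random.default_rng(0)
alphas = rng.normal(size=(1000, 2))
j_full = divergence(mu1, cov1, mu2, cov2)
j_linear = [linear_divergence(a, mu1, cov1, mu2, cov2) for a in alphas]
efficiency = max(j_linear) / j_full  # share of J retained by the best direction found
```

The ratio `efficiency` plays the role of the discriminating efficiency mentioned above: it equals 1 in the equal-covariance case and is at most 1 otherwise. The random search is only a stand-in and gives a lower bound on the true maximum of $J'(1, 2)$.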

Article information

Source
Ann. Math. Statist., Volume 23, Number 1 (1952), 88-102.

Dates
First available in Project Euclid: 28 April 2007

https://projecteuclid.org/euclid.aoms/1177729487

Digital Object Identifier
doi:10.1214/aoms/1177729487

Mathematical Reviews number (MathSciNet)
MR47297

Zentralblatt MATH identifier
0047.13503
