The Annals of Mathematical Statistics

An Application of Information Theory to Multivariate Analysis

S. Kullback


The problem considered is that of finding the "best" linear function for discriminating between two multivariate normal populations, $\pi_1$ and $\pi_2$, without limitation to the case of equal covariance matrices. The "best" linear function is found by maximizing the divergence, $J'(1, 2)$, between the distributions of the linear function. Comparison with the divergence, $J(1, 2)$, between $\pi_1$ and $\pi_2$ offers a measure of the discriminating efficiency of the linear function, since $J(1, 2) \geq J'(1, 2)$. The divergence, a special case of which is Mahalanobis's Generalized Distance, is defined in terms of a measure of information which is essentially that of Shannon and Wiener. Appropriate assumptions about $\pi_1$ and $\pi_2$ lead to discriminant analysis (Sections 4, 7), principal components (Section 5), and canonical correlations (Section 6).
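
As a rough numerical illustration of the inequality $J(1, 2) \geq J'(1, 2)$, the sketch below evaluates both divergences for two bivariate normal populations. It assumes the standard closed form of the divergence (the symmetrized Kullback-Leibler information) between normal distributions, written here for the 2x2 case only. The helper names (`inv2`, `quad`, `trace_prod`, `divergence`, `divergence_linear`), the example parameters, and the brute-force search over directions are illustrative assumptions, not the paper's notation or its method of maximization.

```python
import math

def inv2(S):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = S
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def quad(x, M, y):
    """Bilinear form x' M y for 2-vectors x, y and a 2x2 matrix M."""
    return sum(x[i] * M[i][j] * y[j] for i in range(2) for j in range(2))

def trace_prod(A, B):
    """tr(A B) for 2x2 matrices."""
    return sum(A[i][j] * B[j][i] for i in range(2) for j in range(2))

def divergence(mu1, S1, mu2, S2):
    """J(1, 2) between the bivariate normals N(mu1, S1) and N(mu2, S2)."""
    d = (mu1[0] - mu2[0], mu1[1] - mu2[1])
    S1i, S2i = inv2(S1), inv2(S2)
    # Covariance part: (1/2) tr[(S1 - S2)(S2^{-1} - S1^{-1})], with dimension k = 2.
    cov_term = 0.5 * (trace_prod(S1, S2i) + trace_prod(S2, S1i) - 4.0)
    # Mean part: (1/2) d' (S1^{-1} + S2^{-1}) d.
    mean_term = 0.5 * (quad(d, S1i, d) + quad(d, S2i, d))
    return cov_term + mean_term

def divergence_linear(a, mu1, S1, mu2, S2):
    """J'(1, 2) for y = a'x, which is univariate normal under each population."""
    m = a[0] * (mu1[0] - mu2[0]) + a[1] * (mu1[1] - mu2[1])
    v1, v2 = quad(a, S1, a), quad(a, S2, a)
    return 0.5 * (v1 / v2 + v2 / v1 - 2.0) + 0.5 * m * m * (1.0 / v1 + 1.0 / v2)

# Hypothetical example populations with unequal covariance matrices.
mu1, S1 = (0.0, 0.0), ((2.0, 0.5), (0.5, 1.0))
mu2, S2 = (1.0, 2.0), ((1.0, -0.3), (-0.3, 3.0))

J = divergence(mu1, S1, mu2, S2)

# J' does not depend on the length of a, so scanning unit vectors over a
# half-circle is enough. This grid search is a crude stand-in for an
# analytic maximization.
candidates = [(math.cos(t), math.sin(t))
              for t in (math.pi * k / 1800 for k in range(1800))]
best_a = max(candidates, key=lambda a: divergence_linear(a, mu1, S1, mu2, S2))
best_Jp = divergence_linear(best_a, mu1, S1, mu2, S2)

print("J(1,2)  =", J)
print("best J' =", best_Jp, "at a =", best_a)
print("ratio J'/J =", best_Jp / J)
```

When the two covariance matrices are equal, the covariance term vanishes and `divergence` reduces to the Mahalanobis quadratic form $\delta' \Sigma^{-1} \delta$, with $\delta$ the difference of the mean vectors; in that case the direction $a = \Sigma^{-1} \delta$ gives $J' = J$.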

Article information

Ann. Math. Statist., Volume 23, Number 1 (1952), 88-102.

First available in Project Euclid: 28 April 2007


Kullback, S. An Application of Information Theory to Multivariate Analysis. Ann. Math. Statist. 23 (1952), no. 1, 88--102. doi:10.1214/aoms/1177729487.

See also

  • Part II: S. Kullback. An Application of Information Theory to Multivariate Analysis, II. Ann. Math. Statist., Volume 27, Number 1 (1956), 122--146.