A High Dimensional Two Sample Significance Test
A. P. Dempster
Ann. Math. Statist. 29(4): 995-1010 (December, 1958). DOI: 10.1214/aoms/1177706437

Abstract

The classical multivariate 2 sample significance test based on Hotelling's $T^2$ is undefined when the number $k$ of variables exceeds the number of within sample degrees of freedom available for estimation of variances and covariances. Addition of an a priori Euclidean metric to the affine $k$-space assumed by the classical method leads to an alternative approach to the same problem. A test statistic $F$ which is the ratio of 2 mean square distances is proposed and 3 methods of attaching a significance level to $F$ are described. The third method is considered in detail and leads to a "non-exact" significance test where the null hypothesis distribution of $F$ depends, in approximation, on a single unknown parameter $r$ for which an estimate must be substituted. Approximate distribution theory leads to 2 independent estimates of $r$ based on nearly sufficient statistics and these may be combined to yield a single estimate. A test of $F$ nominally at the 5% level but based on an estimate of $r$ rather than $r$ itself has a true significance level which is a function of $r$. This function is investigated and shown to be quite near 5%. The sensitivity of the test to a parameter measuring statistical distance between population means is discussed and it is shown that arbitrarily small differences in each individual variable can result in a detectable overall difference provided the number of variables (or, more precisely, $r$) can be made sufficiently large. This sensitivity discussion has stated implications for the a priori choice of metric in $k$-space. Finally a geometrical description of the case of large $r$ is presented.
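As a rough illustration of a statistic that is "the ratio of 2 mean square distances", the following Python sketch computes one such ratio under stated assumptions. The function name `mean_square_distance_ratio`, the use of the identity (plain Euclidean) metric, and the particular scaling of the numerator and denominator are assumptions made for illustration; the abstract does not give the formula, so this should not be read as the paper's exact definition of $F$. No significance level is attached, since that would require the estimate of $r$, which the abstract does not specify.

```python
import numpy as np


def mean_square_distance_ratio(x, y):
    """Ratio of a between-sample to a within-sample mean square distance.

    x, y: arrays of shape (n1, k) and (n2, k); rows are observations.
    Assumption: distances use the identity (Euclidean) metric on k-space.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[0], y.shape[0]

    mean_x = x.mean(axis=0)
    mean_y = y.mean(axis=0)
    diff = mean_x - mean_y

    # Scaled squared distance between the two sample means (1 degree of freedom).
    between = (n1 * n2 / (n1 + n2)) * float(diff @ diff)

    # Pooled squared distance of observations from their own sample mean,
    # averaged over the n1 + n2 - 2 within-sample degrees of freedom.
    within_total = float(((x - mean_x) ** 2).sum() + ((y - mean_y) ** 2).sum())
    within = within_total / (n1 + n2 - 2)

    return between / within


# k = 3 variables but only n1 + n2 - 2 = 2 within-sample degrees of freedom,
# the situation in which Hotelling's T^2 is undefined.
x = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
y = [[1.0, 1.0, 1.0], [1.0, 3.0, 1.0]]
print(mean_square_distance_ratio(x, y))  # 2.5
```

The ratio only needs squared Euclidean lengths, so it stays well defined when $k$ exceeds the within-sample degrees of freedom, which is the case the abstract is concerned with.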

Citation

A. P. Dempster. "A High Dimensional Two Sample Significance Test." Ann. Math. Statist. 29 (4) 995 - 1010, December, 1958. https://doi.org/10.1214/aoms/1177706437

Information

Published: December, 1958
First available in Project Euclid: 27 April 2007

zbMATH: 0226.62014
MathSciNet: MR112207
Digital Object Identifier: 10.1214/aoms/1177706437

Rights: Copyright © 1958 Institute of Mathematical Statistics
