A Nonparametric Test for the Several Sample Problem

William H. Kruskal

doi:10.1214/aoms/1177729332

Ann. Math. Statist. 23(4): 525-540 (December, 1952). DOI: 10.1214/aoms/1177729332

Abstract

Suppose that $C$ independent random samples of sizes $n_1, \cdots, n_C$ are to be drawn from $C$ univariate populations with unknown cumulative distribution functions $F_1, \cdots, F_C$. This paper discusses a test of the null hypothesis $F_1 = F_2 = \cdots = F_C$ against alternatives of the form $F_i(x) = F(x - \theta_i)\quad (\text{all } x,\ i = 1, \cdots, C)$ with the $\theta_i$'s not all equal, or against alternatives of a much more general sort to be specified in Section 5. The test to be discussed has as its critical region large values of the ordinary $F$-ratio for one-way analysis of variance, computed after the observations have been replaced by their ranks in the $\sum n_i$-fold over-all sample. This use of ranks simplifies the distribution theory, and permits application of the test to cases where the ranks are available but the numerical values of the observations are difficult to obtain. Briefly, then, we shall consider a non-parametric analogue, based on ranks, of one-way analysis of variance. It is shown in Section 4 that, under quite general conditions, the proposed test statistic, $H$, is asymptotically chi-square with $C - 1$ degrees of freedom when the null hypothesis holds. Section 5 derives a necessary and sufficient condition that the natural family of sequences of tests based on large values of $H$ all be consistent against a given alternative. Section 6 derives the variance of $H$ under the null hypothesis, Section 7 derives the maximum value of $H$, and Section 8 gives a difference equation which may be used to obtain exact small-sample distributions under the null hypothesis. These derivations are made on the assumption of continuity for the cumulative distribution functions; Section 9 considers extensions to the possibly discontinuous case.
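As an illustration of the rank-based statistic described above, the following is a minimal sketch in Python. The abstract does not state the formula for $H$; the sketch assumes the form commonly given for the Kruskal-Wallis statistic, $H = \frac{12}{N(N+1)} \sum_{i=1}^{C} R_i^2 / n_i - 3(N+1)$, where $N = \sum n_i$ and $R_i$ is the sum of the over-all ranks in sample $i$. The function name `kruskal_h` is hypothetical, and the code assumes non-empty samples with no tied observations, matching the continuity assumption made for the main derivations.

```python
from itertools import chain


def kruskal_h(samples):
    """Rank-based H statistic for a list of samples (assumed form, no ties)."""
    # Pool all observations and sort them to obtain over-all ranks 1..N.
    pooled = sorted(chain.from_iterable(samples))
    if len(set(pooled)) != len(pooled):
        raise ValueError("tied observations; this sketch assumes continuity")
    total = len(pooled)  # N, the sum of the sample sizes n_i
    rank = {value: r for r, value in enumerate(pooled, start=1)}

    # Sum over samples of (rank sum R_i)^2 / n_i.
    weighted = sum(sum(rank[x] for x in s) ** 2 / len(s) for s in samples)
    return 12.0 * weighted / (total * (total + 1)) - 3.0 * (total + 1)


if __name__ == "__main__":
    # Three fully separated samples of size 3 give H close to 7.2.
    print(kruskal_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Under the null hypothesis, the value returned would be compared with a chi-square distribution with $C - 1$ degrees of freedom for large samples, as the abstract states; handling of ties is deliberately left out of this sketch.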

Citation

William H. Kruskal. "A Nonparametric Test for the Several Sample Problem." Ann. Math. Statist. 23 (4) 525-540, December, 1952. https://doi.org/10.1214/aoms/1177729332