Abstract
The hypothesis that the two independent random samples $u_1, \cdots, u_m$ and $v_1, \cdots, v_n$ come from the same (unknown) continuous distribution may be tested by the two-sample Cramer-von Mises criterion $W^2 = mn(m + n)^{-2} \sum d^2_i,$ or by $U^2 = mn(m + n)^{-2} \sum (d_i - \bar{d})^2,$ as proposed by Watson [4]. Here $d_i$ is the difference between the sample distribution functions at the $i$th point in the pooled sample; more precisely, if there are $m_i$ members of the first sample and $n_i$ members of the second sample contained in the first $i$ members of the pooled set of $m + n$ members arranged in order of magnitude, then $d_i = n_i/n - m_i/m,\quad i = 1, 2, \cdots, m + n;$ and $\bar{d} = \sum d_i/(m + n)$. Watson's $U^2$ is particularly appropriate when the sample members lie on a circle with no preferred initial point. The limiting distributions of $W^2$ and $U^2$ for large $m, n$ are known [1], [4]. Comparisons between the upper tails of the limiting distributions and small-sample distributions have been made by Anderson [2] for $W^2$ with $m, n \leqq 7$ and $m = n = 8$; by the writer [3] for $W^2$ with $m = n \leqq 12$; and by Watson [5] for $U^2$ and $W^2$ with $m = n = 10$. In this paper similar comparisons are made, for both $U^2$ and $W^2$, for all sample pairs with $m, n \geqq 4$ and $m + n \leqq 17$. For each such $m, n$, the true probability $P$ of attaining or exceeding $U^2$ is tabulated for every attainable $U^2$ with $P \leqq 0.1$. The correction factor $R = P/P_\infty$ is also given, where $P_\infty$ is the approximation to $P$ obtained by using the limiting distribution. For $W^2$, the values of $P$ and $R$ are tabulated for the smallest $W^2$ significant at each of the levels $(10, 8, 6, 5, 4, 3, 2.5, 2, 1.5) \times 10^{-a},\quad a = 2, 3, 4,$ except for the largest and second largest values of $W^2$, for which special formulae are supplied. 
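The definitions of $d_i$, $W^2$ and $U^2$ given above can be illustrated with a minimal Python sketch. The function name `cvm_two_sample` and its arguments are hypothetical and do not come from the paper. The sketch assumes there are no ties, in line with the continuity assumption, and uses exact rational arithmetic so that attainable values are not blurred by rounding.

```python
from fractions import Fraction


def cvm_two_sample(u, v):
    """Return (W^2, U^2) for samples u (size m) and v (size n).

    Illustrative helper, not taken from the paper; assumes no tied values.
    """
    m, n = len(u), len(v)
    total = m + n

    # Pool the samples, tagging each value with its sample of origin
    # (0 = first sample, 1 = second sample), and sort by magnitude.
    pooled = sorted([(x, 0) for x in u] + [(x, 1) for x in v])

    m_i = n_i = 0  # counts of each sample among the first i pooled members
    d = []
    for _, label in pooled:
        if label == 0:
            m_i += 1
        else:
            n_i += 1
        # d_i = n_i/n - m_i/m, as defined in the abstract
        d.append(Fraction(n_i, n) - Fraction(m_i, m))

    scale = Fraction(m * n, total * total)  # mn (m + n)^{-2}
    d_bar = sum(d) / total
    w2 = scale * sum(x * x for x in d)
    u2 = scale * sum((x - d_bar) ** 2 for x in d)
    return w2, u2


if __name__ == "__main__":
    w2, u2 = cvm_two_sample([1, 3], [2, 4])
    print(w2, u2)  # 1/8 1/16
```

For the toy samples in the usage lines, the pooled order alternates between the two samples, so $d = (-1/2, 0, -1/2, 0)$, which gives $W^2 = 1/8$ and $U^2 = 1/16$.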
It is found that for all $U^2$ and $W^2$ other than the largest and second largest values of each, when $P$ lies between 0.1 and 0.01, $R$ lies between 0.65 and 1.35 (except when $m = n = 5$), and when $P$ lies between 0.01 and 0.001, $R$ lies between 0.47 and 1.02. Less precisely, we may sum up by saying that the limiting distribution yields upper-tail areas correct within $\pm 35$ per cent between the 10 per cent and 1 per cent levels, but it overestimates the true $P$ by a factor between 1 and 2 between the 1 per cent and 0.1 per cent levels.
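For small samples, the true probability $P$ of attaining or exceeding a given value can be found by brute-force enumeration, since under the null hypothesis every assignment of the $m + n$ pooled ranks to the two samples is equally likely. The sketch below illustrates that idea only and is not necessarily the computational method used in the paper. The names `stats_from_positions` and `exact_upper_tail` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def stats_from_positions(first_positions, m, n):
    """(W^2, U^2) when the first sample occupies the given 0-based pooled ranks."""
    total = m + n
    in_first = set(first_positions)
    m_i = n_i = 0
    d = []
    for i in range(total):
        if i in in_first:
            m_i += 1
        else:
            n_i += 1
        d.append(Fraction(n_i, n) - Fraction(m_i, m))  # d_i = n_i/n - m_i/m
    scale = Fraction(m * n, total * total)
    d_bar = sum(d) / total
    w2 = scale * sum(x * x for x in d)
    u2 = scale * sum((x - d_bar) ** 2 for x in d)
    return w2, u2


def exact_upper_tail(m, n, observed, which="U2"):
    """Exact P(statistic >= observed) over all C(m+n, m) equally likely arrangements."""
    index = 1 if which == "U2" else 0  # any other value selects W^2
    total = m + n
    hits = 0
    for positions in combinations(range(total), m):
        if stats_from_positions(positions, m, n)[index] >= observed:
            hits += 1
    return Fraction(hits, comb(total, m))


if __name__ == "__main__":
    # Toy case m = n = 2: six arrangements in all.
    print(exact_upper_tail(2, 2, Fraction(3, 8), which="W2"))  # 1/3
    print(exact_upper_tail(2, 2, Fraction(1, 8), which="U2"))  # 2/3
```

In the toy case, the four arrangements that place the two members of a sample next to each other on the circle all give $U^2 = 1/8$, which reflects the independence of $U^2$ from the choice of initial point. The correction factor $R = P/P_\infty$ would follow by dividing such a $P$ by the limiting-distribution approximation, which this sketch does not compute. The enumeration grows as $\binom{m+n}{m}$, so it is practical only for small samples such as those with $m + n \leqq 17$ considered here.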
Citation
E. J. Burr. "Small-Sample Distributions of the Two-sample Cramer-Von Mises' $W^2$ and Watson's $U^2$." Ann. Math. Statist. 35 (3) 1091 - 1098, September, 1964. https://doi.org/10.1214/aoms/1177703267