## The Annals of Statistics

### On the multivariate runs test

#### Abstract

For independent $d$-variate random variables $X_1,\dots,X_m$ with common density $f$ and $Y_1,\dots,Y_n$ with common density $g$, let $R_{m,n}$ be the number of edges in the minimal spanning tree with vertices $X_1,\dots,X_m$, $Y_1,\dots,Y_n$ that connect points from different samples. Friedman and Rafsky conjectured that a test of $H_0: f = g$ that rejects $H_0$ for small values of $R_{m,n}$ should have power against general alternatives. We prove that $R_{m,n}$ is asymptotically distribution-free under $H_0$, and that the multivariate two-sample test based on $R_{m,n}$ is universally consistent.
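The statistic $R_{m,n}$ defined above can be computed directly from the pooled sample. The following is a minimal sketch, not the authors' code: it builds the Euclidean minimal spanning tree with Prim's algorithm and counts the edges whose endpoints come from different samples. The function name `cross_edge_count` and the toy data are assumptions made for illustration, and the sketch assumes all pairwise distances are distinct enough that the tree is unique (which holds almost surely for samples with densities). It does not compute a critical value or p-value.

```python
import math


def cross_edge_count(xs, ys):
    """Return the number of MST edges joining a point of xs to a point of ys.

    xs, ys: sequences of equal-length coordinate tuples (the two samples).
    Uses the O(N^2) form of Prim's algorithm on the pooled sample.
    """
    points = list(xs) + list(ys)
    labels = [0] * len(xs) + [1] * len(ys)  # sample membership of each point
    n = len(points)
    if n < 2:
        return 0

    in_tree = [False] * n
    best_dist = [math.inf] * n  # distance from each vertex to the current tree
    best_from = [-1] * n        # tree vertex that realises that distance
    best_dist[0] = 0.0          # start the tree at the first point
    cross = 0

    for _ in range(n):
        # Add the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]),
                key=best_dist.__getitem__)
        in_tree[u] = True
        # The edge (best_from[u], u) is an MST edge; count it if it is mixed.
        if best_from[u] >= 0 and labels[u] != labels[best_from[u]]:
            cross += 1
        # Update the distances of the remaining vertices to the grown tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return cross


# Toy data (hypothetical): two well-separated samples give a single mixed edge,
# while interleaved samples make every tree edge a mixed edge.
separated = cross_edge_count([(0, 0), (1, 0)], [(10, 0), (11, 0)])
interleaved = cross_edge_count([(0, 0), (2, 0)], [(1, 0), (3, 0)])
```

In the toy data, the well-separated samples yield a small count and the interleaved samples a large one, which matches the direction of the test: small values of $R_{m,n}$ are evidence against $H_0$.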

#### Article information

Source
Ann. Statist., Volume 27, Number 1 (1999), 290-298.

Dates
First available in Project Euclid: 5 April 2002

Permanent link to this document
https://projecteuclid.org/euclid.aos/1018031112

Digital Object Identifier
doi:10.1214/aos/1018031112

Mathematical Reviews number (MathSciNet)
MR1701112

Zentralblatt MATH identifier
0944.62057

#### Citation

Henze, Norbert; Penrose, Mathew D. On the multivariate runs test. Ann. Statist. 27 (1999), no. 1, 290--298. doi:10.1214/aos/1018031112. https://projecteuclid.org/euclid.aos/1018031112

#### References

• [1] Aldous, D. (1990). A random tree model associated with random graphs. Random Structures Algorithms 1 383-401.
• [2] Aldous, D. and Steele, J. M. (1992). Asymptotics for Euclidean minimal spanning trees on random points. Probab. Theory Related Fields 92 247-258.
• [3] Anderson, N. H., Hall, P. and Titterington, D. M. (1994). Two-sample test statistics for measuring discrepancies between two multivariate probability density functions using kernel-based density estimates. J. Multivariate Anal. 50 41-54.
• [4] Bahr, R. (1996). A new test for the multivariate two-sample problem with general alternatives (in German). Doctoral thesis, Univ. Hannover.
• [5] Bloemena, A. R. (1964). Sampling from a graph. Mathematical Centre Tracts 2. Math. Centrum, Amsterdam.
• [6] Einmahl, J. H. J. and Khmaladze, E. V. (1998). The two-sample problem in $\mathbb{R}^m$ and measure-valued martingales. Report S98-2, Dept. Statistics, Univ. New South Wales, Sydney.
• [7] Ferger, D. (1997). Optimal tests for the general two-sample problem. Dresdener Schriften zur Mathematischen Stochastik, Technische Univ. Dresden.
• [8] Friedman, J. H. and Rafsky, L. C. (1979). Multivariate generalizations of the Wolfowitz and Smirnov two-sample tests. Ann. Statist. 7 697-717.
• [9] Györfi, L. and Nemetz, T. (1975). f-dissimilarity: A general class of separation measures of several probability measures. In Topics in Information Theory. Colloq. Math. Soc. János Bolyai 16 309-321.
• [10] Györfi, L. and Nemetz, T. (1977). On the dissimilarity of probability measures. Problems Control Inform. Theory 6 263-267.
• [11] Györfi, L. and Nemetz, T. (1978). f-dissimilarity: A generalization of affinity of several distributions. Ann. Inst. Statist. Math. 30 105-113.
• [12] Henze, N. (1986). On the probability that a random point is the jth nearest neighbour to its own kth nearest neighbour. J. Appl. Probab. 23 221-226.
• [13] Henze, N. (1988). A multivariate two-sample test based on the number of nearest-neighbor type coincidences. Ann. Statist. 16 772-783.
• [14] Henze, N. and Voigt, B. (1992). Almost sure convergence of certain slowly changing symmetric one- and multi-sample statistics. Ann. Probab. 20 1086-1098.
• [15] Kingman, J. F. C. (1993). Poisson Processes. Oxford Univ. Press.
• [16] Lee, S. (1997). The central limit theorem for Euclidean minimal spanning trees I. Ann. Appl. Probab. 7 996-1020.
• [17] Penrose, M. D. (1996). The random minimal spanning tree in high dimensions. Ann. Probab. 24 1903-1925.
• [18] Pickard, D. K. (1982). Isolated nearest neighbours. J. Appl. Probab. 19 444-449.
• [19] Resnick, S. I. (1987). Extreme Values, Regular Variation, and Point Processes. Springer, New York.
• [20] Rudin, W. (1987). Real and Complex Analysis, 3rd ed. McGraw-Hill, New York.
• [21] Schilling, M. F. (1986). Multivariate two-sample tests based on nearest neighbors. J. Amer. Statist. Assoc. 81 799-806.
• [22] Steele, J. M., Shepp, L. A. and Eddy, W. F. (1987). On the number of leaves of a Euclidean minimal spanning tree. J. Appl. Probab. 24 809-826.