On the multivariate runs test

Norbert Henze and Mathew D. Penrose

For independent $d$-variate random variables $X_1,\dots,X_m$ with common density $f$ and $Y_1,\dots,Y_n$ with common density $g$, let $R_{m,n}$ be the number of edges in the minimal spanning tree with vertices $X_1,\dots,X_m$, $Y_1,\dots,Y_n$ that connect points from different samples. Friedman and Rafsky conjectured that a test of $H_0: f = g$ that rejects $H_0$ for small values of $R_{m,n}$ should have power against general alternatives. We prove that $R_{m,n}$ is asymptotically distribution-free under $H_0$, and that the multivariate two-sample test based on $R_{m,n}$ is universally consistent.
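As an illustration of the statistic defined above, the following minimal Python sketch computes $R_{m,n}$ for two small samples. It is not taken from the paper: the function names `mst_edges` and `cross_sample_edges` are hypothetical, Euclidean distance is assumed, and the tree is built with a plain $O(N^2)$ version of Prim's algorithm. It also assumes that all interpoint distances are distinct (which holds with probability one when the samples have densities), so that the minimal spanning tree is unique.

```python
import math


def mst_edges(points):
    """Return the edges (i, j) of a Euclidean minimal spanning tree.

    Plain O(N^2) Prim's algorithm; `points` is a list of coordinate tuples.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # distance from each vertex to the current tree
    best_from = [-1] * n        # tree vertex that realises that distance
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0
    edges = []
    for _ in range(n - 1):
        # Attach the vertex outside the tree that is closest to it.
        k = min((j for j in range(n) if not in_tree[j]),
                key=lambda j: best_dist[j])
        edges.append((best_from[k], k))
        in_tree[k] = True
        # Update the distances of the remaining vertices to the grown tree.
        for j in range(n):
            if not in_tree[j]:
                d = math.dist(points[k], points[j])
                if d < best_dist[j]:
                    best_dist[j] = d
                    best_from[j] = k
    return edges


def cross_sample_edges(xs, ys):
    """R_{m,n}: number of MST edges joining an X point to a Y point."""
    pooled = list(xs) + list(ys)
    m = len(xs)  # pooled indices below m belong to the X sample
    return sum((i < m) != (j < m) for i, j in mst_edges(pooled))


if __name__ == "__main__":
    # Well-separated samples: only one tree edge bridges the two clusters.
    print(cross_sample_edges([(0.0, 0.0), (1.0, 0.0)],
                             [(10.0, 0.0), (11.0, 0.0)]))  # prints 1
    # Interleaved samples on a line: every tree edge joins X to Y.
    print(cross_sample_edges([(0.0,), (2.0,)], [(1.0,), (3.0,)]))  # prints 3
```

The two toy inputs show why small values of $R_{m,n}$ count as evidence against $H_0$: well-separated samples give few cross-sample edges, while thoroughly mixed samples give many.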

Ann. Statist., Volume 27, Number 1 (1999), 290-298.

Primary: 62H15: Hypothesis testing
Secondary: 62G10: Hypothesis testing; 60F05: Central limit and other weak theorems; 60F15: Strong theorems

Multivariate two-sample problem; minimal spanning tree; multivariate runs test; homogeneous Poisson process


