Abstract
For independent $d$-variate random variables $X_1,\dots,X_m$ with common density $f$ and $Y_1,\dots,Y_n$ with common density $g$, let $R_{m,n}$ be the number of edges in the minimal spanning tree with vertices $X_1,\dots,X_m$, $Y_1,\dots,Y_n$ that connect points from different samples. Friedman and Rafsky conjectured that a test of $H_0: f = g$ that rejects $H_0$ for small values of $R_{m,n}$ should have power against general alternatives. We prove that $R_{m,n}$ is asymptotically distribution-free under $H_0$, and that the multivariate two-sample test based on $R_{m,n}$ is universally consistent.
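To make the statistic $R_{m,n}$ concrete, the following is a minimal sketch of how it could be computed for two small samples. It is not taken from the paper: the function name `cross_sample_mst_edges`, the use of Euclidean distance, and the choice of Prim's algorithm with an $O(N^2)$ scan are all assumptions made for illustration. It also assumes the pooled points have a unique minimal spanning tree, which holds almost surely when the samples come from densities.

```python
import math
import random


def cross_sample_mst_edges(xs, ys):
    """Count edges of the Euclidean minimal spanning tree on the pooled
    sample that join a point of xs to a point of ys (hypothetical helper)."""
    points = list(xs) + list(ys)
    labels = [0] * len(xs) + [1] * len(ys)  # 0 = first sample, 1 = second
    n = len(points)
    if n < 2:
        return 0

    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known connection to the tree
    best_from = [-1] * n        # tree vertex realising that connection
    best_dist[0] = 0.0          # start Prim's algorithm from point 0
    count = 0

    for _ in range(n):
        # Pick the vertex outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]),
                key=best_dist.__getitem__)
        in_tree[u] = True
        # The edge (best_from[u], u) joins the tree; count it if it is
        # a between-sample edge.
        if best_from[u] >= 0 and labels[u] != labels[best_from[u]]:
            count += 1
        # Relax the connection costs of the remaining vertices.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return count


if __name__ == "__main__":
    random.seed(0)
    # Two bivariate samples; the second is shifted, so f differs from g.
    xs = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)]
    ys = [(random.gauss(3, 1), random.gauss(3, 1)) for _ in range(50)]
    print(cross_sample_mst_edges(xs, ys))
```

Well-separated samples tend to produce few between-sample edges, which is why the test rejects $H_0$ for small values of $R_{m,n}$. The sketch only computes the statistic; it does not supply critical values or the limiting null distribution.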
Citation
Norbert Henze, Mathew D. Penrose. "On the multivariate runs test." Ann. Statist. 27 (1) 290-298, February 1999. https://doi.org/10.1214/aos/1018031112