## The Annals of Statistics

- Ann. Statist.
- Volume 16, Number 2 (1988), 772-783.

### A Multivariate Two-Sample Test Based on the Number of Nearest Neighbor Type Coincidences

#### Abstract

Let $X_1, \ldots, X_{n_1}$ be i.i.d. with density $f$ and $Y_1, \ldots, Y_{n_2}$ be i.i.d. with density $g$, where the two $d$-variate random samples are independent and the densities $f$ and $g$ are assumed to be continuous a.e. Consider the number $T$ of all $k$ nearest neighbor comparisons in which observations and their neighbors belong to the same sample. We show that, if $f = g$ a.e., the limiting (normal) distribution of $T$, as $\min(n_1, n_2) \rightarrow \infty$ and $n_1/(n_1 + n_2) \rightarrow \tau$ with $0 < \tau < 1$, does not depend on $f$. An omnibus procedure for testing the hypothesis $H_0: f = g$ a.e. is obtained by rejecting $H_0$ for large values of $T$. The result applies to a general distance (generated by a norm on $\mathbb{R}^d$) for determining nearest neighbors, and it generalizes to the multisample situation.
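The counting statistic described in the abstract can be illustrated with a minimal sketch. It pools the two samples, finds each point's $k$ nearest neighbors, and counts the comparisons in which a point and its neighbor carry the same sample label. The function names (`knn_indices`, `count_coincidences`, `nn_coincidence_test`) are hypothetical, and the p-value here comes from a label permutation scheme, which is a stand-in chosen for illustration and not the asymptotic normal calibration derived in the paper. The `dist` argument defaults to the Euclidean distance; any distance generated by a norm could be passed instead.

```python
import math
import random


def knn_indices(points, k, dist=math.dist):
    """For each pooled point, return the indices of its k nearest neighbors.

    The point itself is excluded. Ties are broken by index order, which is
    an arbitrary choice; under continuous densities ties occur with
    probability zero.
    """
    n = len(points)
    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: dist(points[i], points[j]))
        result.append(others[:k])
    return result


def count_coincidences(neighbors, labels):
    """Count (point, neighbor) comparisons with matching sample labels."""
    return sum(
        labels[i] == labels[j]
        for i, nbrs in enumerate(neighbors)
        for j in nbrs
    )


def nn_coincidence_test(x, y, k=1, n_perm=199, seed=0, dist=math.dist):
    """Return (T, permutation p-value), rejecting H0 for large T.

    Assumption: the p-value is obtained by permuting sample labels,
    not from the limiting normal distribution studied in the paper.
    """
    pooled = list(x) + list(y)
    labels = [0] * len(x) + [1] * len(y)
    neighbors = knn_indices(pooled, k, dist)  # computed once, reused below
    t_obs = count_coincidences(neighbors, labels)

    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        if count_coincidences(neighbors, shuffled) >= t_obs:
            exceed += 1
    return t_obs, (1 + exceed) / (n_perm + 1)
```

Because the neighbor structure depends only on the pooled points, it is computed once and reused for every relabeling. Calling `nn_coincidence_test(x, y, k=3)` on two lists of coordinate tuples returns the observed count together with the permutation p-value.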

#### Article information

**Dates**

First available in Project Euclid: 12 April 2007

**Permanent link to this document**

https://projecteuclid.org/euclid.aos/1176350835

**Digital Object Identifier**

doi:10.1214/aos/1176350835

**Mathematical Reviews number (MathSciNet)**

MR947577

**Zentralblatt MATH identifier**

0645.62062

**Subjects**

Primary: 62H15: Hypothesis testing in multivariate analysis

Secondary: 62G10: Nonparametric hypothesis testing

**Keywords**

Multivariate two-sample test; nearest neighbor-type coincidences

#### Citation

Henze, Norbert. A Multivariate Two-Sample Test Based on the Number of Nearest Neighbor Type Coincidences. Ann. Statist. 16 (1988), no. 2, 772--783. doi:10.1214/aos/1176350835. https://projecteuclid.org/euclid.aos/1176350835