Abstract
Two-sample tests that use a similarity graph on the observations are useful for high-dimensional and non-Euclidean data because they are flexible and perform well under a wide range of alternatives. Existing work has mainly focused on sparse graphs, such as graphs whose number of edges is of the order of the number of observations, and the associated asymptotic results impose strong conditions on the graph that can easily be violated by the commonly constructed graphs those works suggest. Moreover, graph-based tests perform better with denser graphs in many settings. In this work, we establish the theoretical foundation for graph-based tests with graphs ranging from those recommended in the current literature to much denser ones.
Funding Statement
The authors were partly supported by NSF DMS-1848579.
Citation
Yejiong Zhu, Hao Chen. "Limiting distributions of graph-based test statistics on sparse and dense graphs." Bernoulli 30 (1) 770-796, February 2024. https://doi.org/10.3150/23-BEJ1616