Open Access
2024 Manifold energy two-sample test
Lynna Chu, Xiongtao Dai
Electron. J. Statist. 18(1): 145-166 (2024). DOI: 10.1214/23-EJS2203

Abstract

We consider the problem of two-sample testing for data generated under the manifold setting, in which potentially high-dimensional observations are made of underlying objects concentrated near a low-dimensional manifold. Existing two-sample tests typically lose power in high dimensions; in the manifold setting, they also largely ignore the underlying geometric structure of the data, which results in misleading representations of similarity. To avoid these issues, we propose a non-parametric two-sample test for general data objects that takes into account the intrinsic geometry of the data. A data-driven metric characterizes the distance between points while respecting the manifold structure. The test statistic behaves like a distance metric between distributions and is shown to be consistent against all alternatives under which the two distributions have a positive energy distance. Empirical studies and an analysis of speech recordings demonstrate the test’s superior performance for manifold data.
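
The general idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes that the data-driven metric can be approximated by shortest-path distances on a k-nearest-neighbour graph (an Isomap-style estimate of geodesic distance), plugs those distances into the usual energy statistic 2 E d(X, Y) - E d(X, X') - E d(Y, Y'), and calibrates it with a permutation test, as the keywords suggest. The function names, the neighbourhood size k, the fallback for disconnected graphs, and the number of permutations are all illustrative assumptions.

```python
import numpy as np


def geodesic_distances(Z, k=8):
    """Approximate geodesic distances by shortest paths on a symmetric k-NN graph.

    Assumption: an Isomap-style estimate stands in for the paper's data-driven metric.
    """
    n = Z.shape[0]
    # Pairwise Euclidean distances in the ambient space.
    E = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1))
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    # Keep edges to the k nearest neighbours (column 0 is normally the point itself).
    nn = np.argsort(E, axis=1)[:, 1:k + 1]
    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()
    G[rows, cols] = E[rows, cols]
    G[cols, rows] = E[cols, rows]
    # Floyd-Warshall all-pairs shortest paths, vectorised over the inner two loops.
    for m in range(n):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    # Illustrative fallback: cap distances between disconnected components.
    finite = np.isfinite(G)
    if not finite.all():
        G[~finite] = G[finite].max()
    return G


def energy_statistic(D, idx_x, idx_y):
    """Energy statistic (V-statistic form) computed from a distance matrix D."""
    d_xy = D[np.ix_(idx_x, idx_y)].mean()
    d_xx = D[np.ix_(idx_x, idx_x)].mean()
    d_yy = D[np.ix_(idx_y, idx_y)].mean()
    return 2.0 * d_xy - d_xx - d_yy


def manifold_energy_test(X, Y, k=8, n_perm=199, seed=0):
    """Permutation test of H0: X and Y share a distribution, using graph geodesics."""
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    n, N = len(X), len(X) + len(Y)
    D = geodesic_distances(Z, k=k)  # computed once on the pooled sample
    observed = energy_statistic(D, np.arange(n), np.arange(n, N))
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(N)  # relabel the pooled points at random
        if energy_statistic(D, perm[:n], perm[n:]) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    # Two samples on adjacent arcs of a circle embedded in three dimensions.
    tx = np.linspace(0.0, 1.4, 30)
    ty = np.linspace(1.6, 3.0, 30)
    X = np.column_stack([np.cos(tx), np.sin(tx), np.zeros_like(tx)])
    Y = np.column_stack([np.cos(ty), np.sin(ty), np.zeros_like(ty)])
    stat, p = manifold_energy_test(X, Y)
    print(f"statistic={stat:.3f}, p-value={p:.3f}")
```

Computing the distance matrix once on the pooled sample means that each permutation only re-indexes it, which keeps the permutation loop cheap. The cubic-time shortest-path step is acceptable for small samples; a sparse-graph routine would be the natural replacement at larger scale.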

Citation

Lynna Chu, Xiongtao Dai. "Manifold energy two-sample test." Electron. J. Statist. 18(1): 145-166, 2024. https://doi.org/10.1214/23-EJS2203

Information

Received: 1 August 2022; Published: 2024
First available in Project Euclid: 30 January 2024

Digital Object Identifier: 10.1214/23-EJS2203

Keywords: geodesic distance, high-dimensional data, manifold data, permutation test

Vol.18 • No. 1 • 2024