Open Access
2024 Conditional independence testing for discrete distributions: Beyond χ²- and G-tests
Ilmun Kim, Matey Neykov, Sivaraman Balakrishnan, Larry Wasserman
Electron. J. Statist. 18(2): 4767-4794 (2024). DOI: 10.1214/24-EJS2315

Abstract

This paper is concerned with the problem of conditional independence testing for discrete data. In recent years, researchers have shed new light on this fundamental problem, emphasizing finite-sample optimality. The non-asymptotic viewpoint adopted in these works has led to novel conditional independence tests that enjoy certain optimality under various regimes. Despite their attractive theoretical properties, the considered tests are not necessarily practical, relying on a Poissonization trick and unspecified constants in their critical values. In this work, we attempt to bridge the gap between theory and practice by reproving optimality without Poissonization and calibrating tests using Monte Carlo permutations. Along the way, we also prove that classical asymptotic χ²- and G-tests are notably sub-optimal in a high-dimensional regime, which justifies the demand for new tools. Our theoretical results are complemented by experiments on both simulated and real-world datasets. Accompanying this paper is an R package UCI that implements the proposed tests.
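
To make the idea of calibrating a test with Monte Carlo permutations concrete, the following is a minimal Python sketch, not the authors' implementation (which is distributed as the R package UCI). It assumes that X, Y, and Z are integer-coded discrete variables, that the permutation shuffles Y separately within each level of Z, and that a stratified Pearson statistic serves as a stand-in test statistic. The function names `stratified_chi2` and `local_permutation_pvalue` are hypothetical and chosen only for illustration.

```python
import numpy as np


def stratified_chi2(x, y, z, dx, dy):
    """Sum over levels of Z of the Pearson chi-square statistic for X vs. Y.

    Assumption: x takes values in {0, ..., dx-1} and y in {0, ..., dy-1}.
    This statistic is a stand-in, not the statistic proposed in the paper.
    """
    total = 0.0
    for level in np.unique(z):
        mask = z == level
        # Contingency table of (X, Y) restricted to this level of Z.
        table = np.zeros((dx, dy))
        np.add.at(table, (x[mask], y[mask]), 1)
        # Expected counts under independence of X and Y within the level.
        expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
        nonzero = expected > 0
        total += ((table[nonzero] - expected[nonzero]) ** 2
                  / expected[nonzero]).sum()
    return total


def local_permutation_pvalue(x, y, z, dx, dy, n_perm=199, seed=0):
    """Monte Carlo permutation p-value, permuting Y within each level of Z."""
    rng = np.random.default_rng(seed)
    observed = stratified_chi2(x, y, z, dx, dy)
    levels = np.unique(z)
    exceed = 0
    for _ in range(n_perm):
        y_perm = y.copy()
        for level in levels:
            idx = np.flatnonzero(z == level)
            # Shuffling Y inside a level keeps the joint law of (Y, Z) intact
            # while breaking any dependence between X and Y given Z.
            y_perm[idx] = rng.permutation(y[idx])
        if stratified_chi2(x, y_perm, z, dx, dy) >= observed:
            exceed += 1
    # The "+1" terms count the observed statistic among the permutations.
    return (1 + exceed) / (1 + n_perm)


# Usage on synthetic data where X and Y are conditionally independent given Z.
rng = np.random.default_rng(1)
z = rng.integers(0, 3, size=300)
x = rng.integers(0, 2, size=300)
y = rng.integers(0, 2, size=300)
p_value = local_permutation_pvalue(x, y, z, dx=2, dy=2)
```

Because the critical value comes from the permutation distribution itself, no unspecified constant has to be chosen by hand; the number of permutations `n_perm` only sets the resolution of the p-value.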

Funding Statement

We would like to thank the reviewers for their thoughtful comments that significantly improved our paper. This work was partially supported by funding from the NSF grants DMS-2113684 and DMS-2310632, as well as an Amazon AI and a Google Research Scholar Award to SB. MN acknowledges support from the NSF grant DMS-2113684. IK acknowledges support from the Yonsei University Research Fund of 2023-22-0419 as well as support from the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2022R1A4A1033384), and the Korea government (MSIT) RS-2023-00211073.

Citation

Ilmun Kim, Matey Neykov, Sivaraman Balakrishnan, Larry Wasserman. "Conditional independence testing for discrete distributions: Beyond χ²- and G-tests." Electron. J. Statist. 18 (2) 4767-4794, 2024. https://doi.org/10.1214/24-EJS2315

Information

Received: 1 October 2023; Published: 2024
First available in Project Euclid: 22 November 2024

arXiv: 2308.05373
Digital Object Identifier: 10.1214/24-EJS2315

Subjects:
Primary: 62C20, 62G99, 62H17

Keywords: conditional independence, depoissonization, negative association, permutation tests, sample complexity

Vol. 18 • No. 2 • 2024