Abstract
We study the problem of independence testing given independent and identically distributed pairs taking values in a σ-finite, separable measure space. Defining a natural measure of dependence $D(f)$ as the squared $L_2$-distance between a joint density $f$ and the product of its marginals, we first show that there is no valid test of independence that is uniformly consistent against alternatives of the form $\{f : D(f) \geq \rho^2\}$. We therefore restrict attention to alternatives that impose additional Sobolev-type smoothness constraints, and define a permutation test based on a basis expansion and a U-statistic estimator of $D(f)$ that we prove is minimax optimal in terms of its separation rates in many instances. Finally, for the case of a Fourier basis on $[0,1]^2$, we provide an approximation to the power function that offers several additional insights. Our methodology is implemented in the R package USP.
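To make the idea of a permutation test built on a squared-distance dependence measure concrete, the following is a minimal sketch under simplifying assumptions. It treats finite discrete data with counting measure, where the dependence measure reduces to the sum over cells of the squared difference between the joint probability and the product of the marginal probabilities. It uses a simple plug-in estimate rather than the U-statistic estimator and basis expansion studied in the paper. The function names `plugin_dependence` and `permutation_pvalue` are hypothetical and are not part of the USP package.

```python
import random
from collections import Counter


def plugin_dependence(xs, ys):
    """Plug-in estimate of sum_{a,b} (p_ab - p_a * p_b)^2 for discrete pairs.

    Assumption: this is a naive plug-in estimate, not the paper's U-statistic.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))   # empirical joint counts
    px = Counter(xs)               # empirical marginal counts of X
    py = Counter(ys)               # empirical marginal counts of Y
    total = 0.0
    for a in px:
        for b in py:
            diff = joint.get((a, b), 0) / n - (px[a] / n) * (py[b] / n)
            total += diff * diff
    return total


def permutation_pvalue(xs, ys, n_perm=199, seed=0):
    """Permutation p-value: shuffle ys to mimic independence, recompute the statistic."""
    rng = random.Random(seed)
    observed = plugin_dependence(xs, ys)
    ys_perm = list(ys)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(ys_perm)       # breaks any pairing between xs and ys
        if plugin_dependence(xs, ys_perm) >= observed:
            exceed += 1
    # Adding 1 to numerator and denominator counts the observed statistic itself,
    # which keeps the test valid at any sample size.
    return (1 + exceed) / (1 + n_perm)
```

Calling `permutation_pvalue(xs, ys)` on two equal-length sequences of labels returns a value in (0, 1]; rejecting independence when it falls at or below a chosen level gives a test whose size is controlled by construction, whatever statistic is plugged in.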
Funding Statement
The first author was supported by the French National Research Agency (ANR) under the grants Labex Ecodec (ANR-11-LABEX-0047) and ANR-17-CE40-0003.
The third author was supported by the Engineering and Physical Sciences Research Council (EPSRC) Programme grant EP/N031938/1 and EPSRC Fellowship EP/P031447/1.
Acknowledgments
We are very grateful to the anonymous reviewers, whose constructive feedback helped to improve the paper. We would also like to thank Ilmun Kim for bringing the work of Song et al. (2012) to our attention; this inspired the computational improvements discussed in Section 7.1.
Citation
Thomas B. Berrett, Ioannis Kontoyiannis, Richard J. Samworth. "Optimal rates for independence testing via U-statistic permutation tests." Ann. Statist. 49 (5) 2457-2490, October 2021. https://doi.org/10.1214/20-AOS2041