Abstract
For testing conditional independence (CI) of a response Y and a predictor X given covariates Z, the model-X (MX) framework has been the subject of active methodological research, especially in the context of MX knockoffs and their application to genome-wide association studies. In this paper, we study the power of MX CI tests, yielding quantitative insights into the role of machine learning and providing evidence in favor of using likelihood-based statistics in practice. Focusing on the conditional randomization test (CRT), we find that its conditional mode of inference allows us to reformulate it as testing a point null hypothesis involving the conditional distribution of X. The Neyman-Pearson lemma implies that a likelihood-based statistic yields the most powerful CRT against a point alternative. We obtain a related optimality result for MX knockoffs. Switching to an asymptotic framework with arbitrarily growing covariate dimension, we derive an expression for the power of the CRT against local semiparametric alternatives in terms of the prediction error of the machine learning algorithm on which its test statistic is based. Finally, we exhibit a resampling-free test with uniform asymptotic Type-I error control under the assumption that only the first two moments of X given Z are known.
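The conditional randomization test described above can be sketched in a few lines: under the model-X assumption that the distribution of X given Z is known, one resamples X from that conditional distribution to build a null reference for any test statistic. The helper names below (`crt_pvalue`, `sample_x_given_z`, `stat`) are illustrative, not from the paper; the statistic used here is a simple residual correlation, whereas the paper advocates likelihood-based statistics.

```python
import numpy as np

def crt_pvalue(y, x, Z, sample_x_given_z, stat, M=500, seed=0):
    """Finite-sample valid CRT p-value via resampling X | Z (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    t_obs = stat(y, x, Z)
    # Resample X from its (assumed known) conditional distribution given Z
    t_null = np.array([stat(y, sample_x_given_z(Z, rng), Z) for _ in range(M)])
    # The +1 correction guarantees exact Type-I error control
    return (1 + np.sum(t_null >= t_obs)) / (M + 1)

# Toy example under the null: X ~ N(Z beta, 1) and Y depends on Z only
rng = np.random.default_rng(1)
n, p = 200, 3
Z = rng.normal(size=(n, p))
beta = np.array([1.0, -0.5, 0.25])
x = Z @ beta + rng.normal(size=n)
y = Z @ np.ones(p) + rng.normal(size=n)  # Y independent of X given Z

sample = lambda Z, r: Z @ beta + r.normal(size=Z.shape[0])
# Statistic: |<residual of Y on Z, x>|, a crude stand-in for a fitted model
stat = lambda y, x, Z: abs(np.dot(y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0], x))
pval = crt_pvalue(y, x, Z, sample, stat)
print(pval)
```

Because the data are generated under the null here, repeated runs would yield p-values that are approximately uniform on (0, 1); any valid statistic may be plugged in, which is what makes the choice of a powerful (likelihood-based) statistic the central question.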
Funding Statement
EK was partially supported by NSF DMS-2113072.
Acknowledgments
We thank Asaf Weinstein, Timothy Barry, and Stephen Bates for detailed comments on earlier versions of the manuscript, as well as Ed Kennedy and Larry Wasserman for discussions of the connections to causal inference. We also thank two anonymous referees for constructive feedback that greatly helped us improve the manuscript.
Citation
Eugene Katsevich, Aaditya Ramdas. "On the power of conditional independence testing under model-X." Electron. J. Statist. 16(2): 6348–6394, 2022. https://doi.org/10.1214/22-EJS2085