Open Access
2024 Generating knockoffs via conditional independence
Emanuela Dreassi, Fabrizio Leisen, Luca Pratelli, Pietro Rigo
Electron. J. Statist. 18(1): 119-144 (2024). DOI: 10.1214/23-EJS2198

Abstract

Let $X$ be a $p$-variate random vector and $\tilde{X}$ a knockoff copy of $X$ (in the sense of [9]). A new approach for constructing $\tilde{X}$ (henceforth, NA) has been introduced in [8]. NA has essentially three advantages: (i) To build $\tilde{X}$ is straightforward; (ii) The joint distribution of $(X,\tilde{X})$ can be written in closed form; (iii) $\tilde{X}$ is often optimal under various criteria. However, for NA to apply, $X_1,\ldots,X_p$ should be conditionally independent given some random element $Z$. Our first result is that any probability measure $\mu$ on $\mathbb{R}^p$ can be approximated by a probability measure $\mu_0$ of the form

$$\mu_0(A_1\times\cdots\times A_p)=E\Bigl\{\prod_{i=1}^{p}P(X_i\in A_i\mid Z)\Bigr\}.$$

The approximation is in total variation distance when $\mu$ is absolutely continuous, and an explicit formula for $\mu_0$ is provided. If $X\sim\mu_0$, then $X_1,\ldots,X_p$ are conditionally independent. Hence, with a negligible error, one can assume $X\sim\mu_0$ and build $\tilde{X}$ through NA. Our second result is a characterization of the knockoffs $\tilde{X}$ obtained via NA. It is shown that $\tilde{X}$ is of this type if and only if the pair $(X,\tilde{X})$ can be extended to an infinite sequence so as to satisfy certain invariance conditions. The basic tool for proving this fact is de Finetti’s theorem for partially exchangeable sequences. In addition to the quoted results, an explicit formula for the conditional distribution of $\tilde{X}$ given $X$ is obtained in a few cases. In one of such cases, it is assumed $X_i\in\{0,1\}$ for all $i$.
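
As an illustration only, the following minimal Python sketch works through a toy binary case with $X_i\in\{0,1\}$. It assumes, as a reading of the abstract rather than a statement of the paper's construction, that the knockoff is a conditionally independent copy of $X$ given $Z$, so that the joint law of the pair is $E\{\prod_i P(X_i=x_i\mid Z)\,P(X_i=\tilde{x}_i\mid Z)\}$. The latent variable with two values, the weights `w`, the Bernoulli parameters `theta`, and all function names are hypothetical choices made for this example. The sketch checks only the swap-invariance (exchangeability) property of the pair and computes the conditional distribution of the knockoff given $X=x$ by enumeration.

```python
from itertools import combinations, product

# Hypothetical toy model (not taken from the paper):
# Z takes two values with weights w; given Z = k, the binary coordinates
# X_1, ..., X_p are independent Bernoulli(theta[k][i]).
w = [0.3, 0.7]
theta = [[0.2, 0.5, 0.9], [0.6, 0.1, 0.4]]
p = 3
points = list(product([0, 1], repeat=p))


def bern(t, x):
    """Bernoulli(t) probability mass at x in {0, 1}."""
    return t if x == 1 else 1.0 - t


def joint_pmf(x, x_tilde):
    """Assumed joint law: E[ prod_i P(X_i = x_i | Z) * P(X_i = x_tilde_i | Z) ]."""
    total = 0.0
    for wk, th in zip(w, theta):
        term = wk
        for i in range(p):
            term *= bern(th[i], x[i]) * bern(th[i], x_tilde[i])
        total += term
    return total


def swap(x, x_tilde, subset):
    """Exchange coordinates x_i and x_tilde_i for every index i in subset."""
    x, x_tilde = list(x), list(x_tilde)
    for i in subset:
        x[i], x_tilde[i] = x_tilde[i], x[i]
    return tuple(x), tuple(x_tilde)


def max_swap_violation():
    """Largest change of the joint pmf over all points and all coordinate swaps."""
    subsets = [s for r in range(p + 1) for s in combinations(range(p), r)]
    worst = 0.0
    for x in points:
        for xt in points:
            for s in subsets:
                sx, sxt = swap(x, xt, s)
                worst = max(worst, abs(joint_pmf(x, xt) - joint_pmf(sx, sxt)))
    return worst


def conditional_pmf(x):
    """P(X_tilde = . | X = x), obtained by normalizing the joint pmf at x."""
    marginal = sum(joint_pmf(x, xt) for xt in points)
    return {xt: joint_pmf(x, xt) / marginal for xt in points}


if __name__ == "__main__":
    print("max swap violation:", max_swap_violation())
    for xt, prob in conditional_pmf((1, 0, 1)).items():
        print(xt, round(prob, 4))
```

Enumeration is feasible here only because $p$ is tiny; the sketch is meant to make the closed-form joint distribution concrete, not to suggest a practical sampler. The second knockoff requirement, that $\tilde{X}$ be conditionally independent of the response given $X$, is not checked, since the toy model has no response variable.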

Acknowledgments

We are grateful to Guido Consonni for a very useful conversation.

Citation

Emanuela Dreassi. Fabrizio Leisen. Luca Pratelli. Pietro Rigo. "Generating knockoffs via conditional independence." Electron. J. Statist. 18 (1) 119 - 144, 2024. https://doi.org/10.1214/23-EJS2198

Information

Received: 1 June 2023; Published: 2024
First available in Project Euclid: 29 January 2024

Digital Object Identifier: 10.1214/23-EJS2198

Subjects:
Primary: 60E05, 62E10, 62H05, 62J02

Keywords: approximation, conditional independence, high-dimensional regression, knockoffs, multivariate dependence, partial exchangeability, variable selection

Vol.18 • No. 1 • 2024