Abstract
When the target of statistical inference is chosen in a data-driven manner, the guarantees provided by classical theories vanish. We propose a solution to the problem of inference after selection by building on the framework of algorithmic stability, in particular its branch with origins in the field of differential privacy. Stability is achieved via randomization of selection and it serves as a quantitative measure that is sufficient to obtain nontrivial post-selection corrections for classical confidence intervals. Importantly, the underpinnings of algorithmic stability translate directly into computational efficiency—our method computes simple corrections for selective inference without recourse to Markov chain Monte Carlo sampling.
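The recipe the abstract describes (randomize the selection step, then pay for selection with a simple additive widening of the classical interval) can be illustrated with a toy sketch. This is not the paper's actual procedure: the Laplace noise scale `b`, the `b * log(m)` widening, and the winner-take-all selection rule below are all hypothetical choices made purely for illustration.

```python
import math
import random
import statistics

random.seed(0)

# Simulate m candidate effects: n noisy observations per candidate.
m, n = 5, 200
true_means = [0.0, 0.0, 0.0, 0.0, 0.3]
data = [[random.gauss(true_means[j], 1.0) for _ in range(n)] for j in range(m)]
sample_means = [statistics.fmean(d) for d in data]

def laplace(scale):
    """Draw Laplace(0, scale) noise via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

# Randomized selection: pick the index maximizing (sample mean + noise).
# A larger noise scale b makes the selection more stable.
b = 0.1
noisy = [mu + laplace(b) for mu in sample_means]
winner = max(range(m), key=lambda j: noisy[j])

# Classical 95% interval for the selected mean, widened by an additive
# correction tied to the noise scale (an illustrative stand-in for the
# paper's stability-based correction; no MCMC sampling is involved).
se = 1.0 / math.sqrt(n)
correction = b * math.log(m)
half = 1.96 * se + correction
ci = (sample_means[winner] - half, sample_means[winner] + half)
```

The point of the sketch is the trade-off it makes visible: more selection noise means a more stable (less data-driven) choice and hence a smaller price paid in interval width, with the correction computed in closed form rather than by sampling.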
Funding Statement
This work was supported by the Army Research Office (ARO) under contract W911NF-17-1-0304 as part of the collaboration between US DOD, UK MOD and the UK Engineering and Physical Sciences Research Council (EPSRC) under the Multidisciplinary University Research Initiative (MURI).
Acknowledgments
We are grateful to Vitaly Feldman, Will Fithian, Moritz Hardt, Arun Kumar Kuchibhotla, and Adam Sealfon for many helpful discussions and feedback which have led to improvements of this work. In particular, we thank Will Fithian for pointing out the advantages of the oracle definition of stability.
Citation
Tijana Zrnic, Michael I. Jordan. "Post-selection inference via algorithmic stability." Ann. Statist. 51 (4): 1666–1691, August 2023. https://doi.org/10.1214/23-AOS2303