Abstract
Many areas of science rely on simulators that implicitly encode intractable likelihood functions of complex systems. Classical statistical methods are poorly suited for these so-called likelihood-free inference (LFI) settings, especially outside asymptotic and low-dimensional regimes. At the same time, popular LFI methods — such as Approximate Bayesian Computation or more recent machine learning techniques — do not necessarily lead to valid scientific inference because they do not guarantee confidence sets with nominal coverage in general settings. In addition, LFI currently lacks practical diagnostic tools to check the actual coverage of computed confidence sets across the entire parameter space. In this work, we propose a modular inference framework that bridges classical statistics and modern machine learning to provide (i) a practical approach for constructing confidence sets with near finite-sample validity at any value of the unknown parameters, and (ii) interpretable diagnostics for estimating empirical coverage across the entire parameter space. We refer to this framework as likelihood-free frequentist inference (LF2I). Any method that defines a test statistic can leverage LF2I to create valid confidence sets and diagnostics without costly Monte Carlo or bootstrap samples at fixed parameter settings. We study two likelihood-based test statistics (ACORE and BFF) and demonstrate their performance on high-dimensional complex data. Code is available at https://github.com/lee-group-cmu/lf2i.
Funding Statement
This work was supported in part by NSF DMS-2053804, NSF PHY-2020295, and the C3.ai Digital Transformation Institute. RI is grateful for the financial support of CNPq (422705/2021-7 and 305065/2023-8) and FAPESP (2019/11321-9 and 2023/07068-1).
Acknowledgments
The authors would like to thank Mikael Kuusela, Rafael Stern and Larry Wasserman for helpful discussions. We are also indebted to Tommaso Dorigo, Jan Kieseler and Giles C. Strong for providing the muon energy data and the neural network architecture used for the studies described in Section 6.3.
Citation
Niccolò Dalmasso, Luca Masserano, David Zhao, Rafael Izbicki, Ann B. Lee. "Likelihood-free frequentist inference: bridging classical statistics and machine learning for reliable simulator-based inference." Electron. J. Statist. 18 (2) 5045-5090, 2024. https://doi.org/10.1214/24-EJS2307