Abstract
We propose an empirical likelihood ratio (ELR) test for comparing any two supervised learning models, which may be nested, non-nested, overlapping, misspecified, or correctly specified. The test compares the models' prediction losses based on cross-validation. We derive the asymptotic null and alternative distributions of the ELR test for comparing two nonparametric learning models under a general framework of convex loss functions. However, computing the cross-validated prediction losses requires repeatedly refitting the models with one observation left out, which imposes a heavy computational burden. We therefore introduce an easy-to-implement ELR test that requires fitting the models only once and shares the same asymptotics as the original one. The proposed tests are applied to compare additive models with varying-coefficient models. Furthermore, a scalable distributed ELR test is proposed for testing the importance of a group of variables in possibly misspecified additive models with massive data. Simulations show that the proposed tests work well and have favorable finite-sample performance compared with some existing approaches. The methodology is validated in an empirical application.
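As a concrete illustration of the cross-validated loss comparison the abstract describes, below is a minimal Python sketch: the two models' per-observation prediction losses are differenced across held-out folds, and Owen-style empirical likelihood for the mean of those differences gives a likelihood ratio statistic that is asymptotically chi-squared with one degree of freedom under the null of equal predictive risk. The squared-error loss, the K-fold (rather than leave-one-out) splitting, the helper names el_log_ratio and cv_loss_diffs, and the chi-squared calibration are illustrative assumptions, not the paper's exact construction.

import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2
from sklearn.model_selection import KFold

def el_log_ratio(d):
    # Empirical log-likelihood ratio statistic for H0: E[d_i] = 0
    # (Owen-style EL for a mean). Solves the score equation
    #     sum_i d_i / (1 + lam * d_i) = 0
    # for the Lagrange multiplier lam, then returns
    #     -2 * sum_i log(n * p_i) = 2 * sum_i log(1 + lam * d_i),
    # which is asymptotically chi-squared(1) under H0.
    d = np.asarray(d, dtype=float)
    if np.all(d == 0):
        return 0.0
    if d.min() >= 0 or d.max() <= 0:
        return np.inf  # 0 lies outside the convex hull of the d_i
    # Bracket lam so that all EL weights stay positive: 1 + lam * d_i > 0.
    lo, hi = -1.0 / d.max() + 1e-10, -1.0 / d.min() - 1e-10
    lam = brentq(lambda t: np.sum(d / (1.0 + t * d)), lo, hi)
    return 2.0 * np.sum(np.log1p(lam * d))

def cv_loss_diffs(model_a, model_b, X, y, k=10, seed=0):
    # Per-observation differences of squared-error prediction losses
    # between two models, computed on held-out folds.
    diffs = np.empty(len(y), dtype=float)
    for train, test in KFold(n_splits=k, shuffle=True, random_state=seed).split(X):
        pred_a = model_a.fit(X[train], y[train]).predict(X[test])
        pred_b = model_b.fit(X[train], y[train]).predict(X[test])
        diffs[test] = (y[test] - pred_a) ** 2 - (y[test] - pred_b) ** 2
    return diffs

# Illustrative use with any two scikit-learn regressors (hypothetical):
# d = cv_loss_diffs(model_a, model_b, X, y)
# stat = el_log_ratio(d)
# p_value = chi2.sf(stat, df=1)  # small p-value rejects equal predictive risk

Note that this sketch refits both models in every fold; the paper's easy-to-implement variant avoids the repeated refitting, but reproducing that construction requires details beyond the abstract.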
Acknowledgments
All authors contributed equally to this work. This research was supported by NSFC grants 11871263 and 12271238, Guangdong NSF grant 2017A030313012, and Shenzhen Sci-Tech Fund JCYJ20210324104803010 awarded to Xuejun Jiang. Xuejun Jiang is the corresponding author.
Citation
Jiancheng Jiang, Xuejun Jiang, Haofeng Wang. "Empirical likelihood ratio tests for non-nested model selection based on predictive losses." Bernoulli 30(2), 1458–1481, May 2024. https://doi.org/10.3150/23-BEJ1640