Abstract
The continuous ranked probability score (CRPS) is the most commonly used scoring rule in the evaluation of probabilistic forecasts for real-valued outcomes. To assess and rank forecasting methods, researchers compute the mean CRPS over given sets of forecast situations, based on the respective predictive distributions and outcomes. We propose a new, isotonicity-based decomposition of the mean CRPS into interpretable components that quantify miscalibration (MCB), discrimination ability (DSC), and uncertainty (UNC), respectively. In a detailed theoretical analysis, we compare the new approach to empirical decompositions proposed earlier, generalize to population versions, analyse their properties and relationships, and relate to a hierarchy of notions of calibration. The isotonicity-based decomposition guarantees the nonnegativity of the components and quantifies calibration in a sense that is stronger than for other types of decompositions, subject to the nondegeneracy of empirical decompositions. We illustrate the usage of the isotonicity-based decomposition and miscalibration–discrimination (MCB–DSC) plots in case studies from weather prediction and machine learning.
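As a rough illustration of the quantity being decomposed, the following Python sketch computes the mean CRPS for forecasts given as finite ensembles, using the standard kernel representation CRPS(F, y) = E|X - y| - 0.5 E|X - X'|. The function names and the ensemble representation are assumptions made for this example and are not taken from the paper. The sketch does not implement the isotonicity-based recalibration step. It only evaluates the mean score and the score of the unconditional empirical distribution of the outcomes, which is how an uncertainty component is commonly defined.

```python
import numpy as np


def crps_ensemble(members, y):
    """CRPS of the empirical distribution of `members` at outcome `y`.

    Uses the kernel form E|X - y| - 0.5 * E|X - X'|, where X and X' are
    independent draws from the ensemble's empirical distribution.
    """
    x = np.asarray(members, dtype=float)
    term_obs = np.mean(np.abs(x - y))
    # All pairwise absolute differences between ensemble members.
    term_spread = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))
    return float(term_obs - term_spread)


def mean_crps(ensembles, outcomes):
    """Average CRPS over a set of forecast situations (hypothetical helper)."""
    scores = [crps_ensemble(e, y) for e, y in zip(ensembles, outcomes)]
    return float(np.mean(scores))


def marginal_crps(outcomes):
    """Mean CRPS when every forecast is the empirical distribution of all outcomes.

    Assumption: this plays the role of the uncertainty (UNC) term.
    """
    y = np.asarray(outcomes, dtype=float)
    return mean_crps([y] * len(y), y)
```

Given a mean score for recalibrated forecasts, the three components would then be related to the mean CRPS through the identity mean CRPS = MCB - DSC + UNC. Producing the recalibrated forecasts is the part this sketch leaves out.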
Funding Statement
Tilmann Gneiting is grateful for support by the Klaus Tschira Foundation. The work of Eva-Maria Walz was funded by the German Research Foundation (DFG) through grant number 257899354. Sebastian Arnold and Johanna Ziegel gratefully acknowledge financial support from the Swiss National Science Foundation.
Acknowledgments
We thank two anonymous referees, Tim Hewson, Kai Polsterer, and Johannes Resin for comments and discussion. Computations for the weather case study have been performed on UBELIX (https://ubelix.unibe.ch/), the HPC cluster of the University of Bern.
Citation
Sebastian Arnold, Eva-Maria Walz, Johanna Ziegel, Tilmann Gneiting. "Decompositions of the mean continuous ranked probability score." Electron. J. Statist. 18 (2), 4992–5044, 2024. https://doi.org/10.1214/24-EJS2316