The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 11, Number 4 (2017), 2404-2431.
Focusing on regions of interest in forecast evaluation
Often, interest in forecast evaluation focuses on certain regions of the whole potential range of the outcome, and forecasts should mainly be ranked according to their performance within these regions. A prime example is risk management, which relies on forecasts of risk measures such as the value-at-risk or the expected shortfall, and hence requires appropriate loss distribution forecasts in the tails. Further examples include weather forecasts with a focus on extreme conditions, or forecasts of environmental variables such as ozone with a focus on concentration levels with adverse health effects.
In this paper, we show how weighted scoring rules can be used to this end, and in particular that they make it possible to rank several potentially misspecified forecasts objectively with the region of interest in mind. This is demonstrated in various simulation scenarios. We introduce desirable properties of weighted scoring rules and present general construction principles based on conditional densities or distributions and on scoring rules for probability forecasts. In our empirical application to log-return time series, all forecasts seem to be slightly misspecified, as is often unavoidable in practice, and no method performs best overall. However, using weighted scoring functions, the best method for predicting losses can be identified, and this is hence the method of choice for the purpose of risk management.
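As a rough illustration of what a weighted scoring rule can look like, the sketch below computes a threshold-weighted continuous ranked probability score (CRPS), a well-known weighted score from the forecasting literature, with an indicator weight on a region of interest. It is not taken from the paper and is not necessarily one of its constructions. The function names, the normal forecast distributions, the truncation of the region at -10, and the midpoint-rule integration are all assumptions made for the example.

```python
import math


def normal_cdf(z, mu=0.0, sigma=1.0):
    """CDF of a normal forecast distribution N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))


def tw_crps(cdf, y, lower, upper, n=20000):
    """Threshold-weighted CRPS with weight w(z) = 1{lower <= z <= upper}.

    Approximates the integral of (F(z) - 1{y <= z})^2 over [lower, upper]
    with the midpoint rule; smaller values indicate a better forecast.
    """
    h = (upper - lower) / n
    total = 0.0
    for i in range(n):
        z = lower + (i + 0.5) * h
        indicator = 1.0 if y <= z else 0.0
        total += (cdf(z) - indicator) ** 2
    return total * h


# Hypothetical example: two forecasts for a log-return, scored on the
# loss region z <= -1 (truncated at -10 for numerical integration).
y_observed = -1.5
score_a = tw_crps(lambda z: normal_cdf(z, 0.0, 1.0), y_observed, -10.0, -1.0)
score_b = tw_crps(lambda z: normal_cdf(z, 0.0, 2.0), y_observed, -10.0, -1.0)
print(score_a, score_b)
```

Averaging such scores over many forecast and observation pairs, and comparing the averages across methods, is one way to rank forecasts with the region of interest in mind. With the integration range covering the whole real line, the same function approximates the unweighted CRPS.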
Received: March 2017
Revised: August 2017
First available in Project Euclid: 28 December 2017
Holzmann, Hajo; Klar, Bernhard. Focusing on regions of interest in forecast evaluation. Ann. Appl. Stat. 11 (2017), no. 4, 2404-2431. doi:10.1214/17-AOAS1088. https://projecteuclid.org/euclid.aoas/1514430291
- Supplement to “Focusing on regions of interest in forecast evaluation”. We discuss weighted versions of the multivariate Hyvärinen score and of multivariate energy scores and provide the proof of Theorem 3. Further, we present the remaining simulation results for scenario B as well as additional simulation results for the Wilcoxon signed-rank test in all scenarios.