Electronic Journal of Statistics

Order-sensitivity and equivariance of scoring functions

Tobias Fissler and Johanna F. Ziegel


The relative performance of competing point forecasts is usually measured in terms of loss or scoring functions. It is widely accepted that these scoring functions should be strictly consistent in the sense that the expected score is minimized by the correctly specified forecast for a certain statistical functional such as the mean, median, or a certain risk measure. Thus, strict consistency opens the way to meaningful forecast comparison, but is also important in regression and M-estimation. Usually, strictly consistent scoring functions for an elicitable functional are not unique. To give guidance on the choice of a scoring function, this paper introduces two additional quality criteria. Order-sensitivity opens the possibility to compare two deliberately misspecified forecasts given that the forecasts are ordered in a certain sense. On the other hand, equivariant scoring functions obey similar equivariance properties as the functional at hand – such as translation invariance or positive homogeneity. In our study, we consider scoring functions for popular functionals, putting special emphasis on vector-valued functionals, e.g. the pair (mean, variance) or (Value at Risk, Expected Shortfall).
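The three notions in the abstract can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes the squared error as a score for the mean and the absolute error as a score for the median (both standard choices), and uses a simulated exponential sample as stand-in data. The helper names (`squared_error`, `absolute_error`, `mean_score`) and the forecast grid are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def squared_error(x, y):
    """Score of forecast x against observation y; standard choice for the mean."""
    return (x - y) ** 2


def absolute_error(x, y):
    """Score of forecast x against observation y; standard choice for the median."""
    return abs(x - y)


def mean_score(score, forecast, observations):
    """Empirical counterpart of the expected score of a fixed point forecast."""
    return sum(score(forecast, y) for y in observations) / len(observations)


# Assumption: a skewed sample, so that mean and median clearly differ.
rng = random.Random(0)
observations = [rng.expovariate(1.0) for _ in range(2000)]
grid = [i / 100 for i in range(301)]  # candidate forecasts 0.00, 0.01, ..., 3.00

# Consistency: the average score is minimized near the targeted functional.
best_sq = min(grid, key=lambda x: mean_score(squared_error, x, observations))
best_abs = min(grid, key=lambda x: mean_score(absolute_error, x, observations))
sample_mean = statistics.fmean(observations)
sample_median = statistics.median(observations)
print("squared error minimizer:", best_sq, "sample mean:", sample_mean)
print("absolute error minimizer:", best_abs, "sample median:", sample_median)

# Order-sensitivity: of two misspecified forecasts on the same side of the
# mean, the one closer to the mean receives the smaller average score.
near = mean_score(squared_error, sample_mean + 0.5, observations)
far = mean_score(squared_error, sample_mean + 1.0, observations)
print("near:", near, "far:", far)

# Equivariance: the squared error is translation invariant and positively
# homogeneous of degree 2.
x, y, shift, scale = 1.5, 0.25, 3.0, 2.0
translation_invariant = math.isclose(
    squared_error(x + shift, y + shift), squared_error(x, y)
)
homogeneous_degree_2 = math.isclose(
    squared_error(scale * x, scale * y), scale ** 2 * squared_error(x, y)
)
print(translation_invariant, homogeneous_degree_2)
```

The grid search is deliberately naive so that the minimization is visible. The vector-valued functionals emphasized in the abstract, such as (mean, variance) or (Value at Risk, Expected Shortfall), require joint scoring functions that this sketch does not cover.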

Article information

Electron. J. Statist., Volume 13, Number 1 (2019), 1166-1211.

Received: November 2017
First available in Project Euclid: 5 April 2019

Primary: 62C99: None of the above, but in this section; 62F07: Ranking and selection
Secondary: 62G99: None of the above, but in this section; 91B06: Decision theory [See also 62Cxx, 90B50, 91A35]

Keywords: Consistency; decision theory; elicitability; equivariance; homogeneity; M-estimation; order-sensitivity; point forecasts; scoring functions; translation invariance

Creative Commons Attribution 4.0 International License.


Fissler, Tobias; Ziegel, Johanna F. Order-sensitivity and equivariance of scoring functions. Electron. J. Statist. 13 (2019), no. 1, 1166-1211. doi:10.1214/19-EJS1552. https://projecteuclid.org/euclid.ejs/1554429627

