The Annals of Statistics

Local proper scoring rules of order two

Werner Ehm and Tilmann Gneiting


Scoring rules assess the quality of probabilistic forecasts, by assigning a numerical score based on the predictive distribution and on the event or value that materializes. A scoring rule is proper if it encourages truthful reporting. It is local of order k if the score depends on the predictive density only through its value and the values of its derivatives of order up to k at the realizing event. Complementing fundamental recent work by Parry, Dawid and Lauritzen, we characterize the local proper scoring rules of order 2 relative to a broad class of Lebesgue densities on the real line, using a different approach. In a data example, we use local and nonlocal proper scoring rules to assess statistically postprocessed ensemble weather forecasts.
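To make the notions of propriety and locality concrete, the following minimal sketch compares two scores for a Gaussian predictive density: the logarithmic score, which depends on the density only through its value at the outcome (local of order 0), and the Hyvärinen score, which also involves the first and second derivatives (local of order 2). Several choices here are illustrative assumptions, not taken from the paper: the scores are written in negative orientation (smaller is better), the Hyvärinen score is taken in the common form 2 (log p)''(x) + ((log p)'(x))^2, the true distribution is standard normal, and all function names are hypothetical. The paper's own sign and scaling conventions may differ.

```python
import math


def log_score(x, mu, sigma):
    """Negatively oriented logarithmic score, -log p(x), for an N(mu, sigma^2) forecast."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (x - mu) ** 2 / (2.0 * sigma ** 2)


def hyvarinen_score(x, mu, sigma):
    """Hyvarinen score 2*(log p)''(x) + ((log p)'(x))^2 for an N(mu, sigma^2) forecast.

    Sign and scaling are an assumed convention (smaller is better).
    """
    d1 = -(x - mu) / sigma ** 2   # first derivative of log p at x
    d2 = -1.0 / sigma ** 2        # second derivative of log p at x
    return 2.0 * d2 + d1 ** 2


def expected_score(score, mu, sigma, n=20001, lim=10.0):
    """Expected score when outcomes follow N(0, 1), by the trapezoidal rule on [-lim, lim]."""
    h = 2.0 * lim / (n - 1)
    total = 0.0
    for i in range(n):
        x = -lim + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        truth_density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += weight * score(x, mu, sigma) * truth_density
    return total * h


if __name__ == "__main__":
    # Candidate forecasts (mu, sigma); the first one matches the assumed truth N(0, 1).
    candidates = [(0.0, 1.0), (0.5, 1.0), (0.0, 2.0), (0.0, 0.5)]
    for mu, sigma in candidates:
        print(
            (mu, sigma),
            expected_score(log_score, mu, sigma),
            expected_score(hyvarinen_score, mu, sigma),
        )
```

Under these assumptions the expected Hyvärinen score has the closed form -2/sigma^2 + (1 + mu^2)/sigma^4, which is minimized at mu = 0, sigma = 1. Both scores therefore favor the truthful forecast among the candidates listed, which is what propriety requires.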

Article information

Ann. Statist., Volume 40, Number 1 (2012), 609-637.

First available in Project Euclid: 7 May 2012

Primary: 62C99: None of the above, but in this section
Secondary: 62M20: Prediction [See also 60G25]; filtering [See also 60G35, 93E10, 93E11]
Secondary: 86A10: Meteorology and atmospheric physics [See also 76Bxx, 76E20, 76N15, 76Q05, 76Rxx, 76U05]

Keywords: Density forecast; Euler equation; Hyvärinen score; proper scoring rule; tangent construction


Ehm, Werner; Gneiting, Tilmann. Local proper scoring rules of order two. Ann. Statist. 40 (2012), no. 1, 609–637. doi:10.1214/12-AOS973.


References

  • Bauer, H. (2001). Measure and Integration Theory. de Gruyter Studies in Mathematics 26. de Gruyter, Berlin.
  • Bernardo, J.-M. (1979). Expected information as expected utility. Ann. Statist. 7 686–690.
  • Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review 78 1–3.
  • Bröcker, J. and Smith, L. A. (2008). From ensemble forecasts to predictive distribution functions. Tellus Ser. A 60 663–678.
  • DasGupta, A. (2008). Asymptotic Theory of Statistics and Probability. Springer, New York.
  • Dawid, A. P. (1984). Statistical theory. The prequential approach. J. Roy. Statist. Soc. Ser. A 147 278–292.
  • Dawid, A. P. (2007). The geometry of proper scoring rules. Ann. Inst. Statist. Math. 59 77–93.
  • Dawid, A. P. (2008). Comments on: Assessing probabilistic forecasts of multivariate quantities, with an application to ensemble predictions of surface winds [MR2434318]. TEST 17 243–244.
  • Dawid, A. P. and Lauritzen, S. L. (2005). The geometry of decision theory. In Proceedings of the Second International Symposium on Information Geometry and Its Applications 22–28. Univ. Tokyo, Tokyo, Japan.
  • Dawid, A. P., Lauritzen, S. and Parry, M. (2012). Proper local scoring rules on discrete sample spaces. Ann. Statist. 40 593–608.
  • Dawid, A. P., Parry, M. and Lauritzen, S. (2009). Personal communication.
  • Ehm, W. (2011). Unbiased risk estimation and scoring rules. C. R. Math. Acad. Sci. Paris 349 699–702.
  • Ehm, W. and Gneiting, T. (2009). Local proper scoring rules. Technical Report 551, Dept. Statistics, Univ. Washington. (Addendum 2010).
  • Gelfand, I. M. and Fomin, S. V. (1963). Calculus of Variations. Prentice Hall International, Englewood Cliffs, NJ.
  • Genton, M. G., ed. (2004). Skew-elliptical Distributions and Their Applications: A Journey Beyond Normality. Chapman & Hall/CRC, Boca Raton, FL.
  • Gneiting, T. (2008). Editorial: Probabilistic forecasting. J. Roy. Statist. Soc. Ser. A 171 319–321.
  • Gneiting, T., Balabdaoui, F. and Raftery, A. E. (2007). Probabilistic forecasts, calibration and sharpness. J. R. Stat. Soc. Ser. B Stat. Methodol. 69 243–268.
  • Gneiting, T. and Raftery, A. E. (2005). Atmospheric science. Weather forecasting with ensemble methods. Science 310 248–249.
  • Gneiting, T. and Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. J. Amer. Statist. Assoc. 102 359–378.
  • Gneiting, T., Raftery, A. E., Westveld, A. H. and Goldman, T. (2005). Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Monthly Weather Review 133 1098–1118.
  • Good, I. J. (1952). Rational decisions. J. Roy. Statist. Soc. Ser. B. 14 107–114.
  • Grimit, E. P. and Mass, C. F. (2002). Initial results of a mesoscale short-range ensemble system over the Pacific Northwest. Weather and Forecasting 17 192–205.
  • Hendrickson, A. D. and Buehler, R. J. (1971). Proper scores for probability forecasters. Ann. Math. Statist. 42 1916–1921.
  • Huber, P. J. (1974). Fisher information and spline interpolation. Ann. Statist. 2 1029–1033.
  • Hyvärinen, A. (2005). Estimation of non-normalized statistical models by score matching. J. Mach. Learn. Res. 6 695–709 (electronic).
  • Hyvärinen, A. (2007). Some extensions of score matching. Comput. Statist. Data Anal. 51 2499–2512.
  • Jose, V. R. R., Nau, R. F. and Winkler, R. L. (2009). Sensitivity to distance and baseline distributions in forecast evaluation. Management Science 55 582–590.
  • Mason, S. J. (2008). Understanding forecast verification statistics. Meteorological Applications 15 31–40.
  • Matheson, J. E. and Winkler, R. L. (1976). Scoring rules for continuous probability distributions. Management Science 22 1087–1096.
  • Palmer, T. N. (2002). The economic value of ensemble forecasts as a tool for risk assessment: From days to decades. Quarterly Journal of the Royal Meteorological Society 128 747–774.
  • Parry, M., Dawid, A. P. and Lauritzen, S. (2012). Proper local scoring rules. Ann. Statist. 40 561–592.
  • Raftery, A. E., Gneiting, T., Balabdaoui, F. and Polakowski, M. (2005). Using Bayesian model averaging to calibrate forecast ensembles. Monthly Weather Review 133 1155–1174.
  • Sloughter, J. M., Gneiting, T. and Raftery, A. E. (2010). Probabilistic wind speed forecasting using ensembles and Bayesian model averaging. J. Amer. Statist. Assoc. 105 25–35.
  • Sloughter, J. M., Raftery, A. E., Gneiting, T. and Fraley, C. (2007). Probabilistic quantitative precipitation forecasting using Bayesian model averaging. Monthly Weather Review 135 3209–3220.
  • Staël von Holstein, C. A. S. (1969). A family of strictly proper scoring rules which are sensitive to distance. Journal of Applied Meteorology 9 360–364.
  • Thorarinsdottir, T. L. and Gneiting, T. (2010). Probabilistic forecasts of wind speed: Ensemble model output statistics by using heteroscedastic censored regression. J. Roy. Statist. Soc. Ser. A 173 371–388.
  • Villani, C. (2009). Optimal Transport: Old and New. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences] 338. Springer, Berlin.
  • Wilks, D. S. and Hamill, T. M. (2007). Comparison of ensemble-MOS methods using GFS reforecasts. Monthly Weather Review 135 2379–2390.
  • Winkler, R. L. and Jose, V. R. R. (2008). Comments on: Assessing probabilistic forecasts of multivariate quantities, with an application to ensemble predictions of surface winds [MR2434318]. TEST 17 251–255.
  • Winkler, R. L. and Murphy, A. H. (1968). “Good” probability assessors. Journal of Applied Meteorology 7 751–758.