The Annals of Statistics

Evaluating probability forecasts

Tze Leung Lai, Shulamith T. Gross, and David Bo Shen

Abstract

Probability forecasts of events are routinely used in climate predictions, in forecasting default probabilities on bank loans or in estimating the probability of a patient’s positive response to treatment. Scoring rules have long been used to assess the efficacy of the forecast probabilities after observing the occurrence, or nonoccurrence, of the predicted events. We develop herein a statistical theory for scoring rules and propose an alternative approach to the evaluation of probability forecasts. This approach uses loss functions relating the predicted to the actual probabilities of the events and applies martingale theory to exploit the temporal structure between the forecast and the subsequent occurrence or nonoccurrence of the event.
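As a concrete illustration of what a scoring rule computes, the sketch below evaluates two classical rules, the Brier score (Brier, 1950) and the logarithmic score (Good, 1952), on a small set of forecasts. It is not the loss-function approach proposed in the article. The function names, the clipping constant `eps`, and the sample data are assumptions made for illustration only.

```python
import math


def brier_score(forecasts, outcomes):
    """Average squared difference between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")
    return sum((p - y) ** 2 for p, y in zip(forecasts, outcomes)) / len(forecasts)


def log_score(forecasts, outcomes, eps=1e-12):
    """Average negative log-likelihood of the observed 0/1 outcomes.

    Forecasts are clipped to [eps, 1 - eps] (an illustrative choice) so that
    a forecast of exactly 0 or 1 does not produce an infinite score.
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")
    total = 0.0
    for p, y in zip(forecasts, outcomes):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(forecasts)


# Hypothetical data: forecast probabilities and whether each event occurred.
forecasts = [0.9, 0.2, 0.7, 0.4]
outcomes = [1, 0, 1, 0]

print(f"Brier score: {brier_score(forecasts, outcomes):.4f}")
print(f"Log score:   {log_score(forecasts, outcomes):.4f}")
```

Both rules are negatively oriented here, so a lower value indicates better forecasts. Each compares the forecast with the observed outcome only, whereas the approach described in the abstract relates the forecast to the actual event probability.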

Article information

Source
Ann. Statist. Volume 39, Number 5 (2011), 2356-2382.

Dates
First available: 30 November 2011

Permanent link to this document
http://projecteuclid.org/euclid.aos/1322663461

Digital Object Identifier
doi:10.1214/11-AOS902

Zentralblatt MATH identifier
06008039

Mathematical Reviews number (MathSciNet)
MR2906871

Subjects
Primary: 60G42: Martingales with discrete parameter; 62P99: None of the above, but in this section
Secondary: 62P05: Applications to actuarial sciences and financial mathematics

Keywords
Forecasting; loss functions; martingales; scoring rules

Citation

Lai, Tze Leung; Gross, Shulamith T.; Shen, David Bo. Evaluating probability forecasts. The Annals of Statistics 39 (2011), no. 5, 2356-2382. doi:10.1214/11-AOS902. http://projecteuclid.org/euclid.aos/1322663461.

