The Annals of Applied Statistics

Elicitability and backtesting: Perspectives for banking regulation

Natalia Nolde and Johanna F. Ziegel

Full-text: Open access


Conditional forecasts of risk measures play an important role in internal risk management of financial institutions as well as in regulatory capital calculations. In order to assess forecasting performance of a risk measurement procedure, risk measure forecasts are compared to the realized financial losses over a period of time and a statistical test of correctness of the procedure is conducted. This process is known as backtesting. Such traditional backtests are concerned with assessing some optimality property of a set of risk measure estimates. However, they are not suited to compare different risk estimation procedures. We investigate the proposal of comparative backtests, which are better suited for method comparisons on the basis of forecasting accuracy, but necessitate an elicitable risk measure. We argue that supplementing traditional backtests with comparative backtests will enhance the existing trading book regulatory framework for banks by providing the correct incentive for accuracy of risk measure forecasts. In addition, the comparative backtesting framework could be used by banks internally as well as by researchers to guide selection of forecasting methods. The discussion focuses on three risk measures, Value at Risk, expected shortfall and expectiles, and is supported by a simulation study and data analysis.
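As a rough illustration of the comparative backtesting idea described above, the sketch below scores two competing Value at Risk forecast sequences with the quantile (pinball) scoring function, a standard consistent scoring function for quantiles, and forms a t-type statistic from the score differences. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names, the simulated standard normal losses, the constant forecasts and the use of a plain sample standard deviation (with no correction for serial dependence) are all hypothetical choices made for this example.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def quantile_score(forecast, loss, alpha):
    """Pinball score for a VaR forecast at level alpha (lower is better).

    Losses are taken as positive numbers, so VaR is the alpha-quantile
    of the loss distribution (an assumption of this sketch).
    """
    indicator = 1.0 if loss <= forecast else 0.0
    return (indicator - alpha) * (forecast - loss)


def comparative_statistic(forecasts_a, forecasts_b, losses, alpha):
    """t-type statistic on score differences (method A minus method B).

    Negative values favour method A. The sample standard deviation is used
    as is, which assumes serially uncorrelated score differences.
    """
    diffs = [
        quantile_score(a, x, alpha) - quantile_score(b, x, alpha)
        for a, b, x in zip(forecasts_a, forecasts_b, losses)
    ]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical data: i.i.d. standard normal losses.
rng = random.Random(0)
alpha = 0.99
losses = [rng.gauss(0.0, 1.0) for _ in range(5000)]
n = len(losses)

# Method A reports the correct 99% quantile; method B understates the risk
# by reporting the 95% quantile instead.
var_a = [NormalDist().inv_cdf(alpha)] * n
var_b = [NormalDist().inv_cdf(0.95)] * n

stat = comparative_statistic(var_a, var_b, losses, alpha)
# One-sided p-value under a normal approximation: small values are evidence
# that method A attains a lower expected score than method B.
p_value = NormalDist().cdf(stat)
print(f"statistic = {stat:.2f}, one-sided p-value = {p_value:.4f}")
```

With real loss series the score differences are typically serially dependent, so the denominator would need a heteroskedasticity- and autocorrelation-consistent variance estimate rather than the plain sample standard deviation used here. The same pattern carries over to any elicitable risk measure by swapping in a scoring function that is consistent for it.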

Article information

Ann. Appl. Stat. Volume 11, Number 4 (2017), 1833-1874.

Received: May 2016
Revised: March 2017
First available in Project Euclid: 28 December 2017

Digital Object Identifier: doi:10.1214/17-AOAS1041

Keywords: Forecasting; backtesting; elicitability; risk measurement procedure; Value at Risk; expected shortfall; expectiles


Nolde, Natalia; Ziegel, Johanna F. Elicitability and backtesting: Perspectives for banking regulation. Ann. Appl. Stat. 11 (2017), no. 4, 1833–1874. doi:10.1214/17-AOAS1041.


See also

  • Discussion of "Elicitability and backtesting: Perspectives for banking regulation".
  • Discussion of "Elicitability and backtesting: Perspectives for banking regulation".
  • Discussion of "Elicitability and backtesting: Perspectives for banking regulation".
  • Discussion on "Elicitability and backtesting: Perspectives for banking regulation".
  • Discussion of "Elicitability and backtesting: Perspectives for banking regulation".
  • Rejoinder: "Elicitability and backtesting: Perspectives for banking regulation".

Supplemental materials

  • Supplementary material for article “Elicitability and backtesting: Perspectives for banking regulation”. We elaborate on some of the points made in the main article as well as provide technical details and proofs of several results.