The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 12, Number 1 (2018), 310-329.
Automated threshold selection for extreme value analysis via ordered goodness-of-fit tests with adjustment for false discovery rate
Threshold selection is a critical issue for extreme value analysis with threshold-based approaches. Under suitable conditions, exceedances over a high threshold have been shown to follow the generalized Pareto distribution (GPD) asymptotically. In practice, however, the threshold must be chosen. If the chosen threshold is too low, the GPD approximation may not hold and bias can occur. If the chosen threshold is too high, the reduced sample size increases the variance of parameter estimates. Commonly used selection methods such as graphical diagnostics are subjective and cannot be automated, which makes them unsuitable for batch analyses. We develop an efficient technique to evaluate and apply the Anderson–Darling test to the sample of exceedances above a fixed threshold. In order to automate threshold selection, this test is used in conjunction with a recently developed stopping rule that controls the false discovery rate in ordered hypothesis testing. Previous attempts in this setting do not account for the issue of ordered multiple testing. The performance of the method is assessed in a large-scale simulation study that mimics practical return level estimation. This procedure was repeated at hundreds of sites in the western US to generate return level maps of extreme precipitation.
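To make the idea of a stopping rule for ordered hypotheses concrete, the following is a minimal sketch in Python. It assumes the ForwardStop rule of G'Sell et al., which is one published rule of this kind; the abstract itself does not name the rule used, so treat that choice as an assumption. The function names `forward_stop` and `select_threshold`, the candidate thresholds, and the p-values are all hypothetical and purely illustrative. In a real analysis each p-value would come from a goodness-of-fit test (such as the Anderson–Darling test) applied to the exceedances above the corresponding threshold.

```python
import math


def forward_stop(pvalues, alpha=0.05):
    """Number of leading hypotheses rejected by the ForwardStop rule.

    Assumed rule: k_hat is the largest k such that
    -(1/k) * sum_{i<=k} log(1 - p_i) <= alpha, or 0 if no k qualifies.
    """
    running_sum = 0.0
    k_hat = 0
    for k, p in enumerate(pvalues, start=1):
        # Clamp 1 - p away from zero so that p == 1 does not raise an error.
        running_sum += -math.log(max(1.0 - p, 1e-300))
        if running_sum / k <= alpha:
            k_hat = k
    return k_hat


def select_threshold(thresholds, pvalues, alpha=0.05):
    """Pick the first threshold (in increasing order) that is not rejected.

    Returns None if every candidate threshold is rejected.
    """
    k_hat = forward_stop(pvalues, alpha)
    if k_hat >= len(thresholds):
        return None
    return thresholds[k_hat]


# Hypothetical candidate thresholds (ascending) and illustrative p-values.
thresholds = [10.0, 20.0, 30.0, 40.0]
pvalues = [0.001, 0.004, 0.30, 0.60]
chosen = select_threshold(thresholds, pvalues, alpha=0.05)
```

With these illustrative inputs the GPD hypothesis is rejected at the two lowest thresholds, so the sketch selects the third candidate. Working on the running average of transformed p-values, rather than testing each threshold at level alpha separately, is what accounts for the ordered multiple testing the abstract mentions.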
Received: April 2016
Revised: August 2017
First available in Project Euclid: 9 March 2018
Bader, Brian; Yan, Jun; Zhang, Xuebin. Automated threshold selection for extreme value analysis via ordered goodness-of-fit tests with adjustment for false discovery rate. Ann. Appl. Stat. 12 (2018), no. 1, 310--329. doi:10.1214/17-AOAS1092. https://projecteuclid.org/euclid.aoas/1520564474
- Additional simulation results and data analysis. Material consisting of R code for the power study, additional simulation results, and analysis related to the application.