The problem of regions

Bradley Efron; Robert Tibshirani

doi:10.1214/aos/1024691353

Ann. Statist. 26(5): 1687-1718 (October 1998). DOI: 10.1214/aos/1024691353

Abstract

In the problem of regions, we wish to know which one of a discrete set of possibilities applies to a continuous parameter vector. This problem arises in the following way: we compute a descriptive statistic from a set of data, notice an interesting feature and wish to assign a confidence level to that feature. For example, we compute a density estimate and notice that the estimate is bimodal. What confidence can we assign to bimodality? A natural way to measure confidence is via the bootstrap: we compute our descriptive statistic on a large number of bootstrap data sets and record the proportion of times that the feature appears. This seems like a plausible measure of confidence for the feature. The paper studies the construction of such confidence values and examines to what extent they approximate frequentist $p$-values and Bayesian a posteriori probabilities. We derive more accurate confidence levels using both frequentist and objective Bayesian approaches. The methods are illustrated with a number of examples, including polynomial model selection and estimating the number of modes of a density.
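The bootstrap proportion described in the abstract can be sketched as follows. This is only an illustration of the naive first-order measure (the fraction of bootstrap data sets in which the feature appears), not the more accurate confidence levels the paper derives. The function names `count_modes` and `bootstrap_confidence`, the Gaussian kernel, the fixed `bandwidth`, and the simulated data are all assumptions made for this example and do not come from the paper.

```python
import math
import random


def count_modes(sample, bandwidth, grid_size=100):
    """Count local maxima of a Gaussian kernel density estimate on a grid.

    Hypothetical helper: kernel, grid, and bandwidth are illustrative choices.
    """
    lo = min(sample) - 3 * bandwidth
    hi = max(sample) + 3 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    # Unnormalised estimate: the normalising constant does not move the modes.
    dens = [
        sum(math.exp(-0.5 * ((g - x) / bandwidth) ** 2) for x in sample)
        for g in grid
    ]
    # A grid point is a mode if it rises from the left and does not rise to the right.
    return sum(
        1
        for i in range(1, grid_size - 1)
        if dens[i - 1] < dens[i] >= dens[i + 1]
    )


def bootstrap_confidence(sample, feature, n_boot=100, seed=0):
    """Proportion of bootstrap resamples for which feature(resample) is true."""
    rng = random.Random(seed)
    n = len(sample)
    hits = 0
    for _ in range(n_boot):
        # Resample n observations with replacement from the original data.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        if feature(resample):
            hits += 1
    return hits / n_boot


if __name__ == "__main__":
    # Simulated two-component data, used only to exercise the functions.
    rng = random.Random(1)
    data = [rng.gauss(-2, 1) for _ in range(40)] + [rng.gauss(2, 1) for _ in range(40)]

    def is_bimodal(s):
        return count_modes(s, bandwidth=0.8) == 2

    print("bootstrap proportion of bimodal estimates:",
          bootstrap_confidence(data, is_bimodal))
```

The feature is passed in as a predicate so that the same resampling loop could be reused for other discrete outcomes, such as the selected degree in polynomial model selection. The resulting proportion depends on the bandwidth, which is held fixed across resamples in this sketch.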