Bayesian Anal. 17 (3), 737-764, (September 2022) DOI: 10.1214/21-BA1271
David E. Jones, Robert N. Trangucci, Yang Chen
KEYWORDS: effective prior sample size, statistical information, Wasserstein distance, Bayes estimate, sensitivity analysis
When summarizing a Bayesian analysis, it is important to quantify the contribution of the prior distribution to the final posterior inference, because this informs other researchers whether the prior information needs to be carefully scrutinized, and whether alternative priors are likely to substantially alter the conclusions drawn. One appealing and interpretable way to do this is to report an effective prior sample size (EPSS), which captures how many observations the information in the prior distribution corresponds to. However, typically the most important aspect of the prior distribution is its location relative to the data, and therefore traditional information measures are somewhat deficient for the purpose of quantifying EPSS, because they concentrate on the variance or spread of the prior distribution (in isolation from the data). To partially address this difficulty, Reimherr et al. (2014) introduced a class of EPSS measures based on prior-likelihood discordance. In this paper, we take this idea further by proposing a new measure of EPSS that incorporates not only the general mathematical form of the likelihood (as proposed by Reimherr et al., 2014) but also the specific data at hand. Thus, our measure considers the location of the prior relative to the current observed data, rather than relative to the average of multiple datasets from the working model, the latter being the approach taken by Reimherr et al. (2014). Consequently, our measure can be highly variable, but we demonstrate that this is because the impact of a prior on a Bayesian analysis can intrinsically be highly variable. Our measure is called the (posterior) mean Observed Prior Effective Sample Size (mOPESS), and is a Bayes estimate of a meaningful quantity. The mOPESS clearly communicates the extent to which inference is determined by the prior or, framed differently, the amount of sampling effort saved due to having relevant prior information.
We illustrate our ideas through a number of examples including Gaussian conjugate and non-conjugate models (continuous observations), a Beta-Binomial model (discrete observations), and a linear regression model (two unknown parameters).
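As a point of reference for the classical notion of EPSS (a standard textbook result, not the mOPESS itself), consider the Gaussian conjugate model with known variance, where the prior variance is parameterized as \(\sigma^2/n_0\):

```latex
% Classical EPSS in the Gaussian conjugate model (known variance):
\theta \sim N\!\left(\mu_0, \tfrac{\sigma^2}{n_0}\right), \qquad
x_1,\dots,x_n \mid \theta \overset{\text{iid}}{\sim} N(\theta, \sigma^2)
\;\Longrightarrow\;
\theta \mid x_{1:n} \sim
N\!\left(\frac{n_0 \mu_0 + n \bar{x}}{n_0 + n},\;
\frac{\sigma^2}{n_0 + n}\right).
```

The prior here acts exactly like \(n_0\) pseudo-observations located at \(\mu_0\), so the traditional EPSS is \(n_0\) regardless of how far \(\mu_0\) lies from \(\bar{x}\). This insensitivity to prior-data discordance is precisely the deficiency of variance-based measures discussed above.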