Abstract
We propose the calibrated posterior predictive $p$-value ($cppp$) as an interpretable goodness-of-fit (GOF) measure for regression models in sequential regression multiple imputation (SRMI). The $cppp$ is uniformly distributed under the assumed model, whereas the posterior predictive $p$-value ($ppp$) in general is not, particularly as the percentage of missing data, $pm$, increases. Uniformity of the $cppp$ allows the analyst to properly evaluate the evidence against the assumed model. We show the advantages of the $cppp$ over the $ppp$ in terms of power in detecting common departures from the assumed model and, more importantly, in terms of robustness with respect to $pm$. In the imputation phase, which provides a complete database for general statistical analyses, default and improper priors are usually needed, whereas the $cppp$ requires a proper prior on the regression parameters. We avoid this problem by using a minimum training sample that turns the improper prior into a proper distribution. The dependency on the training sample is naturally accounted for by changing the training sample at each step of the SRMI. Our results come from theoretical considerations together with simulation studies and an application to a real data set of anthropometric measures.
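For orientation, the two quantities compared above can be written as follows. This is a sketch based on the usual definitions of these $p$-values in the literature, not a formula quoted from the paper, and it ignores the missing-data structure for simplicity. The symbols $T$ (a discrepancy statistic), $y_{obs}$ (observed data), $y^{rep}$ (replicated data), $\theta$ (model parameters), $f(y \mid \theta)$ (sampling model), $\pi(\theta)$ (prior) and $m(y)$ (prior predictive density) are notation assumed here.

$$ppp(y_{obs}) = \Pr\{T(y^{rep}) \ge T(y_{obs}) \mid y_{obs}\} = \int \Pr\{T(y^{rep}) \ge T(y_{obs}) \mid \theta\}\, \pi(\theta \mid y_{obs})\, d\theta$$

$$cppp(y_{obs}) = \Pr\{ppp(Y) \le ppp(y_{obs})\}, \qquad Y \sim m(y) = \int f(y \mid \theta)\, \pi(\theta)\, d\theta$$

Under these definitions, the $cppp$ is the prior predictive distribution function of the $ppp$ evaluated at its observed value, so it is uniform under the assumed model whenever $ppp(Y)$ has a continuous distribution. The prior predictive $m(y)$ exists only when $\pi(\theta)$ is proper, which is consistent with the need for a proper prior on the regression parameters.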
Citation
Stefano Cabras, María Eugenia Castellanos, Alicia Quirós. "Goodness-of-fit of conditional regression models for multiple imputation." Bayesian Anal. 6 (3) 429-455, September 2011. https://doi.org/10.1214/11-BA617