Goodness-of-fit of conditional regression models for multiple imputation

Stefano Cabras; María Eugenia Castellanos; Alicia Quirós

doi:10.1214/11-BA617

Bayesian Anal. 6(3): 429-455 (September 2011). DOI: 10.1214/11-BA617

Abstract

We propose the calibrated posterior predictive $p$-value ($cppp$) as an interpretable goodness-of-fit (GOF) measure for regression models in sequential regression multiple imputation (SRMI). The $cppp$ is uniformly distributed under the assumed model, whereas the posterior predictive $p$-value ($ppp$) generally is not, particularly as the percentage of missing data, $pm$, increases. Uniformity of the $cppp$ allows the analyst to evaluate properly the evidence against the assumed model. We show the advantages of the $cppp$ over the $ppp$ in terms of power to detect common departures from the assumed model and, more importantly, in terms of robustness with respect to $pm$. In the imputation phase, which provides a complete database for general statistical analyses, default and improper priors are usually needed, whereas the $cppp$ requires a proper prior on the regression parameters. We avoid this problem by introducing a minimum training sample that turns the improper prior into a proper distribution. The dependency on the training sample is naturally accounted for by changing the training sample at each step of the SRMI. Our results come from theoretical considerations together with simulation studies and an application to a real data set of anthropometric measures.
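
To make the distinction between the $ppp$ and the $cppp$ concrete, the following is a minimal Monte Carlo sketch in Python. It is not the authors' SRMI procedure and involves no missing data. It assumes a toy model $y_i \sim N(\mu, 1)$ with a proper prior $\mu \sim N(0, \texttt{prior\_sd}^2)$, uses the sample variance as the discrepancy measure, and assumes that calibration means locating the observed $ppp$ within the distribution of $ppp$ values computed on data sets drawn from the prior predictive distribution. The function names `ppp` and `cppp`, the parameters `prior_sd`, `n_rep` and `n_cal`, and the choice of discrepancy are all illustrative assumptions.

```python
import numpy as np


def ppp(y, rng, n_rep=200, prior_sd=2.0):
    """Posterior predictive p-value for the toy model y_i ~ N(mu, 1),
    mu ~ N(0, prior_sd^2), with the sample variance as discrepancy.
    (Model, prior and discrepancy are assumptions made for illustration.)"""
    n = y.size
    # Conjugate normal posterior for mu with known unit variance.
    post_var = 1.0 / (n + 1.0 / prior_sd**2)
    post_mean = post_var * y.sum()
    # Draw mu from the posterior, then one replicated data set per draw.
    mu = rng.normal(post_mean, np.sqrt(post_var), size=n_rep)
    y_rep = rng.normal(mu[:, None], 1.0, size=(n_rep, n))
    # Share of replicates at least as extreme as the observed data.
    return float(np.mean(y_rep.var(axis=1, ddof=1) >= y.var(ddof=1)))


def cppp(y_obs, rng, n_cal=300, prior_sd=2.0):
    """Calibrate the observed ppp against ppp values of data sets drawn
    from the prior predictive distribution (needs a proper prior)."""
    p_obs = ppp(y_obs, rng, prior_sd=prior_sd)
    n = y_obs.size
    # Prior predictive data sets: mu from the prior, then y given mu.
    mu = rng.normal(0.0, prior_sd, size=n_cal)
    p_cal = np.array(
        [ppp(rng.normal(m, 1.0, size=n), rng, prior_sd=prior_sd) for m in mu]
    )
    # Calibrated value: share of null ppp values no larger than p_obs.
    return p_obs, float(np.mean(p_cal <= p_obs))


rng = np.random.default_rng(0)
y_ok = rng.normal(0.5, 1.0, size=20)   # data generated under the toy model
y_bad = rng.normal(0.5, 5.0, size=20)  # variance far larger than assumed
p_ok, c_ok = cppp(y_ok, rng)
p_bad, c_bad = cppp(y_bad, rng)
print(f"model holds:  ppp={p_ok:.3f}  cppp={c_ok:.3f}")
print(f"model misfit: ppp={p_bad:.3f}  cppp={c_bad:.3f}")
```

The calibration step draws parameters from the prior, which is why this sketch, like the $cppp$ described in the abstract, needs a proper prior. The toy example only shows the mechanics of computing and calibrating a $ppp$; it says nothing about behavior as $pm$ grows.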

Citation

Stefano Cabras, María Eugenia Castellanos, Alicia Quirós. "Goodness-of-fit of conditional regression models for multiple imputation." Bayesian Anal. 6(3): 429-455, September 2011. https://doi.org/10.1214/11-BA617

Information

Published: September 2011

First available in Project Euclid: 13 June 2012

zbMATH: 1330.62111

MathSciNet: MR2843539

Digital Object Identifier: 10.1214/11-BA617

Subjects:

Primary: 62F15

Secondary: 62D05, 62G10, 62L10

Keywords: Calibrated posterior predictive $p$-value, Discrepancy measure, Minimum training sample, Missing at random, Predictive distribution, Sequential regression multiple imputation

JOURNAL ARTICLE
27 PAGES