Statistical Science

Multiple-Imputation Inferences with Uncongenial Sources of Input

Xiao-Li Meng

Abstract

Conducting sample surveys, imputing incomplete observations, and analyzing the resulting data are three indispensable phases of modern practice with public-use data files and with many other statistical applications. Each phase inherits different input, including the information preceding it and the intellectual assessments available, and aims to provide output that is one step closer to arriving at statistical inferences with scientific relevance. However, the role of the imputation phase has often been viewed as merely providing computational convenience for users of data. Although facilitating computation is very important, such a viewpoint ignores the imputer's assessments and information inaccessible to the users. This view underlies the recent controversy over the validity of multiple-imputation inference when a procedure for analyzing multiply imputed data sets cannot be derived from (is "uncongenial" to) the model adopted for multiple imputation. Given sensible imputations and complete-data analysis procedures, inferences from standard multiple-imputation combining rules are typically superior to, and thus different from, users' incomplete-data analyses. The latter may suffer from serious nonresponse biases because such analyses often must rely on convenient but unrealistic assumptions about the nonresponse mechanism. When it is desirable to conduct inferences under models for nonresponse other than the original imputation model, a possible alternative to recreating imputations is to incorporate appropriate importance weights into the standard combining rules. These points are reviewed and explored by simple examples and general theory, from both Bayesian and frequentist perspectives, particularly from the randomization perspective. Some convenient terms are suggested for facilitating communication among researchers from different perspectives when evaluating multiple-imputation inferences with uncongenial sources of input.
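The abstract refers to the "standard multiple-imputation combining rules" without stating them. As a minimal sketch, assuming these are Rubin's rules for a scalar estimand, the pooled estimate is the average of the m complete-data estimates, and the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name `combine_rubin` and the input numbers are illustrative assumptions, not taken from the paper, and the importance-weighted variant mentioned in the abstract is not shown.

```python
from statistics import mean, variance


def combine_rubin(estimates, variances):
    """Pool m complete-data estimates and their variance estimates.

    Hypothetical helper illustrating Rubin's combining rules for a
    scalar estimand; it is not code from the paper.
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need m >= 2 estimates with matching variances")
    q_bar = mean(estimates)       # pooled point estimate
    u_bar = mean(variances)       # average within-imputation variance
    b = variance(estimates)       # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b   # total variance of the pooled estimate
    return q_bar, t


# Made-up inputs: three imputed data sets, each analysed separately.
pooled_estimate, total_variance = combine_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The factor (1 + 1/m) inflates the between-imputation variance to account for using a finite number of imputations.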

Article information

Source
Statist. Sci. Volume 9, Number 4 (1994), 538-558.

Dates
First available in Project Euclid: 19 April 2007

Permanent link to this document
https://projecteuclid.org/euclid.ss/1177010269

Digital Object Identifier
doi:10.1214/ss/1177010269

Keywords
Congeniality; self-efficiency; importance sampling; incomplete data; missing data; nonresponse; normalizing constants; public-use data file; randomization

Citation

Meng, Xiao-Li. Multiple-Imputation Inferences with Uncongenial Sources of Input. Statist. Sci. 9 (1994), no. 4, 538--558. doi:10.1214/ss/1177010269. https://projecteuclid.org/euclid.ss/1177010269


See also

  • See Comment: Robert E. Fay. [Multiple-Imputation Inferences with Uncongenial Sources of Input]: Comment. Statist. Sci., Volume 9, Number 4 (1994), 558--560.
  • See Comment: Joseph L. Schafer. [Multiple-Imputation Inferences with Uncongenial Sources of Input]: Comment. Statist. Sci., Volume 9, Number 4 (1994), 560--561.
  • See Comment: Chris Skinner. [Multiple-Imputation Inferences with Uncongenial Sources of Input]: Comment. Statist. Sci., Volume 9, Number 4 (1994), 561--563.
  • See Comment: Alan M. Zaslavsky. [Multiple-Imputation Inferences with Uncongenial Sources of Input]: Comment: Using the Full Toolkit. Statist. Sci., Volume 9, Number 4 (1994), 563--565.
  • See Rejoinder: Xiao-Li Meng. [Multiple-Imputation Inferences with Uncongenial Sources of Input]: Rejoinder. Statist. Sci., Volume 9, Number 4 (1994), 566--573.