Open Access
2022 General-purpose imputation of planned missing data in social surveys: Different strategies and their effect on correlations
Julian B. Axenfeld, Christian Bruch, Christof Wolf
Statist. Surv. 16: 182-209 (2022). DOI: 10.1214/22-SS137

Abstract

Planned missing survey data, for example stemming from split questionnaire designs, are becoming increasingly common in survey research, making imputation indispensable to obtain reasonably analyzable data. However, these data can be difficult to impute due to low correlations, many predictors, and limited sample sizes to support imputation models. This paper presents findings from a Monte Carlo simulation in which we investigate the accuracy of correlations after multiple imputation using different imputation methods and predictor set specifications, based on data from the German Internet Panel (GIP). The results show that strategies that simplify the imputation exercise (such as predictive mean matching with dimensionality reduction or restricted predictor sets, linear regression models, or the multivariate normal model without transformation) perform well, whereas generalized linear models for categorical data, classification trees, and imputation models with many predictor variables in particular lead to strong biases.

Funding Statement

This work was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) [project numbers: BL 1148/1-1, BR 5869/1-1, WO 739/20-1]. The Monte Carlo simulations were run on the High Performance Computing facilities of the state of Baden-Württemberg (bwHPC). This paper uses data from the German Internet Panel (GIP) funded by the DFG through the Collaborative Research Center (SFB) 884 “Political Economy of Reforms” (SFB 884) [Project-ID: 139943784].

Acknowledgments

The authors wish to thank Azim Selvi for his assistance and Hannah Laumann for language editing. The authors gratefully acknowledge support by the state of Baden-Württemberg through bwHPC for providing high-performance computing facilities for the Monte Carlo simulation.

Citation

Julian B. Axenfeld. Christian Bruch. Christof Wolf. "General-purpose imputation of planned missing data in social surveys: Different strategies and their effect on correlations." Statist. Surv. 16 182 - 209, 2022. https://doi.org/10.1214/22-SS137

Information

Received: 1 February 2022; Published: 2022
First available in Project Euclid: 9 August 2022

MathSciNet: MR4464527
zbMATH: 07577515
Digital Object Identifier: 10.1214/22-SS137

Subjects:
Primary: 62D10
Secondary: 62P25, 65C05

Keywords: bias, imputation methods, Monte Carlo simulation, multiple imputation, split questionnaire design

Vol.16 • 2022