Open Access
July 2011
Estimating the population mean when some responses are missing
Jingsong Lu, Edward J. Stanek III, Elaine Puleo
Braz. J. Probab. Stat. 25(2): 171-182 (July 2011). DOI: 10.1214/10-BJPS115

Abstract

We develop a design-based prediction approach for estimating the finite population mean in a simple setting where some responses are missing. The approach is based on indicator sampling random variables that operate on labeled units (subjects). We define missing data mechanisms that may depend on a subject or on a selection (for example, when the study design assigns groups of selected subjects to different interviewers). Using an approach usually reserved for model-based inference, we develop a predictor that equals the sample total divided by the expected sample size. The methods are based on best linear unbiased prediction in finite population mixed models. When the probability of missing is estimated from the sample, the empirical estimator simplifies to the mean of the realized nonmissing responses. The different missing data mechanisms are revealed by notation that accounts for the labels and sample selections. Counterintuitively, the mean squared error (MSE) of the empirical estimator is smaller than the MSE of the predictor that uses the known probability of missing.
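
The closing claim of the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's derivation: it assumes simple random sampling without replacement and a subject-independent Bernoulli missingness mechanism (MCAR), and every name in it (`simulate_mse`, `p_missing`, the synthetic population) is hypothetical. It compares the sample total of nonmissing responses divided by the expected number of respondents against the plain mean of the realized nonmissing responses.

```python
import random
import statistics


def simulate_mse(population, n, p_missing, reps=20000, seed=0):
    """Monte Carlo MSE of two estimators of the finite population mean.

    Assumed setup (not taken from the paper): simple random sampling
    without replacement, each selected response missing independently
    with probability p_missing.
    """
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    expected_size = n * (1.0 - p_missing)  # expected number of respondents
    sq_err_known = []
    sq_err_empirical = []
    for _ in range(reps):
        sample = rng.sample(population, n)
        # Keep a response only if it is not missing.
        observed = [y for y in sample if rng.random() >= p_missing]
        if not observed:
            # The respondent mean is undefined here; skip the replicate.
            continue
        # Known missing probability: total over expected respondent count.
        known = sum(observed) / expected_size
        # Estimated missing probability: mean of the realized respondents.
        empirical = sum(observed) / len(observed)
        sq_err_known.append((known - true_mean) ** 2)
        sq_err_empirical.append((empirical - true_mean) ** 2)
    return statistics.fmean(sq_err_known), statistics.fmean(sq_err_empirical)


# Synthetic finite population, generated once and then treated as fixed.
pop_rng = random.Random(1)
population = [pop_rng.gauss(50.0, 10.0) for _ in range(200)]

mse_known, mse_empirical = simulate_mse(population, n=20, p_missing=0.3)
print(f"MSE with known missing probability:     {mse_known:.2f}")
print(f"MSE with estimated missing probability: {mse_empirical:.2f}")
```

In this assumed setup, dividing by the realized number of respondents absorbs the random variation in how many responses are observed, whereas dividing by the expected number leaves that variation in the estimator. The size of the gap depends on how far the population mean is from zero relative to the spread of the responses.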

Citation

Jingsong Lu, Edward J. Stanek III, Elaine Puleo. "Estimating the population mean when some responses are missing." Braz. J. Probab. Stat. 25(2): 171-182, July 2011. https://doi.org/10.1214/10-BJPS115

Information

Published: July 2011
First available in Project Euclid: 31 March 2011

zbMATH: 1298.62020
MathSciNet: MR2793924
Digital Object Identifier: 10.1214/10-BJPS115

Keywords: best linear unbiased estimator (BLUE), finite population, MCAR, missing data, prediction, simple random sampling

Rights: Copyright © 2011 Brazilian Statistical Association

Vol.25 • No. 2 • July 2011