Brazilian Journal of Probability and Statistics
- Braz. J. Probab. Stat.
- Volume 25, Number 2 (2011), 171-182.
Estimating the population mean when some responses are missing
We develop a design-based prediction approach to estimate the finite population mean in a simple setting where some responses are missing. The approach is based on indicator sampling random variables that operate on labeled units (subjects). We define missing data mechanisms that may depend on a subject, or on a selection (such as when the study design assigns groups of selected subjects to different interviewers). Using an approach usually reserved for model-based inference, we develop a predictor that equals the sample total divided by the expected sample size. The methods are based on best linear unbiased prediction in finite population mixed models. When the probability of a missing response is estimated from the sample, the empirical estimator simplifies to the mean of the realized nonmissing responses. The different missing data mechanisms are revealed by the notation that accounts for the labels and sample selections. Counterintuitively, the mean squared error (MSE) of the empirical estimator is smaller than the MSE of the estimator that uses the known probability of a missing response.
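As an illustration only, the following Python sketch compares the two estimators described in the abstract. It assumes a much simpler setting than the paper: simple random sampling without replacement, and responses that are missing independently with one common probability. The function names, the synthetic population, and all parameter values are hypothetical and are not taken from the paper.

```python
import random


def known_p_estimator(observed, n, p_missing):
    """Sample total of the observed responses divided by the expected
    number of responses, n * (1 - p_missing), with p_missing known."""
    return sum(observed) / (n * (1.0 - p_missing))


def empirical_estimator(observed, n):
    """Same form, with p_missing replaced by the sample missing fraction.
    Algebraically this is the mean of the nonmissing responses.
    Returns None when every response in the sample is missing."""
    if not observed:
        return None
    p_hat = (n - len(observed)) / n
    return sum(observed) / (n * (1.0 - p_hat))


def simulate(population, n, p_missing, reps, seed=0):
    """Monte Carlo MSE of both estimators around the population mean.
    Samples with no observed response are skipped for both estimators."""
    rng = random.Random(seed)
    mu = sum(population) / len(population)
    sq_err_known = 0.0
    sq_err_emp = 0.0
    used = 0
    for _ in range(reps):
        sample = rng.sample(population, n)  # SRS without replacement
        # Assumed mechanism: each response is missing independently.
        observed = [y for y in sample if rng.random() >= p_missing]
        emp = empirical_estimator(observed, n)
        if emp is None:
            continue
        used += 1
        sq_err_known += (known_p_estimator(observed, n, p_missing) - mu) ** 2
        sq_err_emp += (emp - mu) ** 2
    return sq_err_known / used, sq_err_emp / used


# Hypothetical population: 200 values centered well away from zero.
pop_rng = random.Random(1)
population = [pop_rng.gauss(50.0, 10.0) for _ in range(200)]

mse_known, mse_emp = simulate(population, n=40, p_missing=0.3, reps=5000)
print("MSE with known p:    ", mse_known)
print("MSE with estimated p:", mse_emp)
```

In this simplified setting the known-probability estimator divides by a fixed expected count, so random variation in the number of respondents adds to its error, while the empirical estimator divides by the realized count. This sketch does not reproduce the paper's mixed-model derivation or its subject- and selection-dependent missing data mechanisms.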
First available in Project Euclid: 31 March 2011
Lu, Jingsong; Stanek III, Edward J.; Puleo, Elaine. Estimating the population mean when some responses are missing. Braz. J. Probab. Stat. 25 (2011), no. 2, 171--182. doi:10.1214/10-BJPS115. https://projecteuclid.org/euclid.bjps/1301577152