## Brazilian Journal of Probability and Statistics

- Braz. J. Probab. Stat.
- Volume 25, Number 2 (2011), 171-182.

### Estimating the population mean when some responses are missing

Jingsong Lu, Edward J. Stanek III, and Elaine Puleo

#### Abstract

We develop a design-based prediction approach to estimate the finite population mean in a simple setting where some responses are missing. The approach is based on indicator sampling random variables that operate on labeled units (subjects). We define missing data mechanisms that may depend on a subject or on a selection (such as when the study design assigns groups of selected subjects to different interviewers). Using an approach usually reserved for model-based inference, we develop a predictor that equals the sample total divided by the expected sample size. The methods are based on best linear unbiased prediction in finite population mixed models. When the probability of missing is estimated from the sample, the empirical estimator simplifies to the mean of the realized nonmissing responses. The different missing data mechanisms are revealed by the notation that accounts for the labels and sample selections. The mean squared error (MSE) of the empirical estimator is, counterintuitively, smaller than the MSE obtained when the probability of missing is known.
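The contrast between the two estimators described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' derivation: it assumes simple random sampling without replacement, a subject-level MCAR mechanism in which each selected subject is missing independently with probability `p_missing`, and reads "expected sample size" as the expected number of respondents, `n * (1 - p_missing)`. The population, parameter values, and function name `simulate_mse` are hypothetical choices made for illustration only.

```python
import random


def simulate_mse(pop, n, p_missing, reps, seed=0):
    """Monte Carlo MSE of two estimators of the finite population mean.

    Assumed setup (illustrative, not taken from the paper): simple random
    sampling without replacement, and each sampled response missing
    independently with probability p_missing (MCAR).
    """
    rng = random.Random(seed)
    true_mean = sum(pop) / len(pop)
    sq_err_known = 0.0
    sq_err_empirical = 0.0
    used = 0
    for _ in range(reps):
        sample = rng.sample(pop, n)
        observed = [y for y in sample if rng.random() >= p_missing]
        if not observed:
            # The respondent mean is undefined with no respondents;
            # such replicates are skipped for both estimators.
            continue
        total = sum(observed)
        # Known p: observed total divided by the expected respondent count.
        known = total / (n * (1 - p_missing))
        # Estimated p: reduces to the mean of the realized respondents.
        empirical = total / len(observed)
        sq_err_known += (known - true_mean) ** 2
        sq_err_empirical += (empirical - true_mean) ** 2
        used += 1
    return sq_err_known / used, sq_err_empirical / used


# Hypothetical population of 200 subjects with responses centered near 50.
pop_rng = random.Random(1)
population = [pop_rng.gauss(50, 10) for _ in range(200)]

mse_known, mse_empirical = simulate_mse(
    population, n=40, p_missing=0.3, reps=5000
)
print(f"MSE with known p:     {mse_known:.2f}")
print(f"MSE with estimated p: {mse_empirical:.2f}")
```

In this configuration the known-probability estimator inherits the variability of the realized respondent count, scaled by the size of the responses, while the respondent mean divides by that realized count and so avoids it. The gap shown here is specific to the assumed population and missingness rate.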

#### Article information

**Dates**

First available in Project Euclid: 31 March 2011

**Keywords**

Simple random sampling; missing data; MCAR; finite population; best linear unbiased estimator (BLUE); prediction

