Statistical Science

What Is Meant by “Missing at Random”?

Shaun Seaman, John Galati, Dan Jackson, and John Carlin

The concept of missing at random is central in the literature on statistical analysis with missing data. In general, inference using incomplete data should not only be based on observed data values but should also take account of the pattern of missing values. However, it is often said that if data are missing at random, valid inference using likelihood approaches (including Bayesian) can be obtained ignoring the missingness mechanism. Unfortunately, the term “missing at random” has been used inconsistently and not always clearly; there has also been a lack of clarity around the meaning of “valid inference using likelihood”. These issues have created potential for confusion about the exact conditions under which the missingness mechanism can be ignored, and perhaps fed confusion around the meaning of “analysis ignoring the missingness mechanism”. Here we provide standardised, precise definitions of “missing at random” and “missing completely at random”, in order to promote unification of the theory. Using these definitions, we clarify the conditions that suffice for “valid inference” to be obtained under a variety of inferential paradigms.

Article information

Statist. Sci., Volume 28, Number 2 (2013), 257-268.

First available in Project Euclid: 21 May 2013

Keywords: Ignorability; direct-likelihood inference; frequentist inference; repeated sampling; missing completely at random


Seaman, Shaun; Galati, John; Jackson, Dan; Carlin, John. What Is Meant by “Missing at Random”? Statist. Sci. 28 (2013), no. 2, 257-268. doi:10.1214/13-STS415.

