## The Annals of Statistics

- Ann. Statist.
- Volume 13, Number 1 (1985), 178-203.

### Empirical Distributions in Selection Bias Models

#### Abstract

The following problem is treated: Given $s$ not-necessarily-random samples from an unknown distribution $F$, and assuming that we know the sampling rule of each sample, is it possible to combine the samples in order to estimate $F$, and if so what is the natural way of doing it? More formally, this translates to the problem of determining whether there exists a nonparametric maximum likelihood estimate (NPMLE) of $F$ on the basis of $s$ samples from weighted versions of $F$, with known weight functions, and if it exists, how to construct it? We give a simple necessary and sufficient condition, which can be checked graphically, for the existence and uniqueness of the NPMLE and, under this condition, we describe a simple method for constructing it. The method is numerically efficient and mathematically interesting because it reduces the problem to one of solving $s - 1$ nonlinear equations with $s - 1$ unknowns, the unique solution of which is easily obtained by the iterative, Gauss-Seidel type, scheme described in the paper. Extensions for the case where the weight functions are not completely specified and for censored samples, applications, numerical examples, and statistical properties of the NPMLE, are discussed. In particular, we prove under this condition that the NPMLE is a sufficient statistic for $F$. The technique has many potential applications, because it is not limited to the case where the sampled items are univariate. A FORTRAN program for the described algorithm is available from the author.
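As a rough illustration of the estimation problem in the abstract, the sketch below computes point masses on the pooled observations by a plain fixed-point iteration on the likelihood equations one obtains for this model: $p_j \propto r_j / \sum_i n_i w_i(t_j)/W_i$ with $W_i = \sum_j w_i(t_j) p_j$, where $t_j$ are the distinct pooled points, $r_j$ their multiplicities, and $n_i$ the sample sizes. This is an assumption-laden sketch, not the Gauss-Seidel type scheme of the paper and not the author's FORTRAN program. The function name `biased_sampling_npmle` and its parameters are hypothetical, and the code does not check the existence and uniqueness condition, so it should only be trusted when that condition holds.

```python
from collections import Counter


def biased_sampling_npmle(samples, weights, tol=1e-12, max_iter=10_000):
    """Hypothetical helper: fixed-point solver for the selection-bias NPMLE.

    samples : list of s non-empty lists of observations
    weights : list of s known weight functions w_i(x) >= 0
    Returns (support points, point masses p_j, normalizing constants W_i).
    """
    sizes = [len(sample) for sample in samples]
    # Pool all samples; r_j is the multiplicity of the j-th distinct point.
    counts = Counter(x for sample in samples for x in sample)
    points = sorted(counts)
    # w[i][j] = w_i(t_j): every weight function evaluated at every pooled point.
    w = [[w_i(t) for t in points] for w_i in weights]

    W = [1.0] * len(samples)  # initial guess for W_i = integral of w_i dF
    for _ in range(max_iter):
        # p_j proportional to r_j / sum_i n_i * w_i(t_j) / W_i
        raw = [
            counts[t] / sum(n * w[i][j] / W[i] for i, n in enumerate(sizes))
            for j, t in enumerate(points)
        ]
        total = sum(raw)
        p = [value / total for value in raw]  # fix the scale: masses sum to one
        # Recompute W_i = sum_j w_i(t_j) * p_j from the current masses.
        W_new = [sum(w_ij * p_j for w_ij, p_j in zip(row, p)) for row in w]
        converged = max(abs(a - b) for a, b in zip(W_new, W)) < tol
        W = W_new
        if converged:
            break
    return points, p, W


# One length-biased sample, w(x) = x: the masses come out proportional to 1/x.
points, p, W = biased_sampling_npmle([[1.0, 2.0, 4.0]], [lambda x: x])
```

Only the $s$ constants $W_i$ are iterated on, and the overall scale is fixed by normalizing the masses, which is in the spirit of the reduction to $s - 1$ unknowns mentioned in the abstract.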

#### Article information

**Source**

Ann. Statist., Volume 13, Number 1 (1985), 178-203.

**Dates**

First available in Project Euclid: 12 April 2007

**Permanent link to this document**

https://projecteuclid.org/euclid.aos/1176346585

**Digital Object Identifier**

doi:10.1214/aos/1176346585

**Mathematical Reviews number (MathSciNet)**

MR773161

**Zentralblatt MATH identifier**

0578.62047

**Subjects**

Primary: 62G05: Estimation

Secondary: 62E99: None of the above, but in this section; 62M99: None of the above, but in this section; 62P10: Applications to biology and medical sciences

**Keywords**

Nonparametric maximum likelihood; sample selection bias; weighted distributions

#### Citation

Vardi, Y. Empirical Distributions in Selection Bias Models. Ann. Statist. 13 (1985), no. 1, 178-203. doi:10.1214/aos/1176346585. https://projecteuclid.org/euclid.aos/1176346585