Open Access
Empirical Distributions in Selection Bias Models
Y. Vardi
Ann. Statist. 13(1): 178-203 (March, 1985). DOI: 10.1214/aos/1176346585

Abstract

The following problem is treated: Given $s$ not-necessarily-random samples from an unknown distribution $F$, and assuming that we know the sampling rule of each sample, is it possible to combine the samples in order to estimate $F$, and if so what is the natural way of doing it? More formally, this translates to the problem of determining whether there exists a nonparametric maximum likelihood estimate (NPMLE) of $F$ on the basis of $s$ samples from weighted versions of $F$, with known weight functions, and if it exists, how to construct it? We give a simple necessary and sufficient condition, which can be checked graphically, for the existence and uniqueness of the NPMLE and, under this condition, we describe a simple method for constructing it. The method is numerically efficient and mathematically interesting because it reduces the problem to one of solving $s - 1$ nonlinear equations with $s - 1$ unknowns, the unique solution of which is easily obtained by the iterative, Gauss-Seidel type, scheme described in the paper. Extensions for the case where the weight functions are not completely specified and for censored samples, applications, numerical examples, and statistical properties of the NPMLE, are discussed. In particular, we prove under this condition that the NPMLE is a sufficient statistic for $F$. The technique has many potential applications, because it is not limited to the case where the sampled items are univariate. A FORTRAN program for the described algorithm is available from the author.
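To make the estimation problem concrete, here is a minimal sketch of the estimating equations the abstract alludes to. Writing $t_1, \dots, t_h$ for the distinct pooled observations, $r_j$ for the number of observations equal to $t_j$, $n_i$ for the size of sample $i$, and $W_i = \sum_j w_i(t_j) p_j$, maximizing the likelihood over probability vectors $p$ gives $p_j = r_j / \sum_i n_i w_i(t_j) / W_i$. The code solves these equations by a plain fixed-point iteration with renormalization. This is a substitute chosen for brevity, not the Gauss-Seidel type scheme of the paper, and it assumes the paper's existence and uniqueness condition holds for the data. The function name, argument layout, and tolerances are illustrative assumptions.

```python
from collections import Counter


def biased_sampling_npmle(samples, weights, tol=1e-12, max_iter=10_000):
    """Sketch of an NPMLE of F from s weighted (biased) samples.

    samples : list of s lists of observations (hashable, sortable values)
    weights : list of s known weight functions, w_i(x) >= 0
    Returns (points, p, W): the distinct pooled points, the estimated
    probability mass at each point, and the normalizing constants W_i.

    Assumption: the paper's existence/uniqueness condition holds.
    This is a simple fixed-point iteration, not the paper's scheme.
    """
    counts = Counter(x for sample in samples for x in sample)
    points = sorted(counts)                    # distinct pooled observations t_j
    r = [counts[t] for t in points]            # multiplicity r_j of each t_j
    n = [len(sample) for sample in samples]    # sample sizes n_i
    w = [[w_i(t) for t in points] for w_i in weights]  # w[i][j] = w_i(t_j)
    s, h = len(samples), len(points)

    W = [1.0] * s                              # starting guess for the W_i
    p = [1.0 / h] * h
    for _ in range(max_iter):
        # Likelihood equations: p_j proportional to r_j / sum_i n_i w_i(t_j) / W_i
        raw = [r[j] / sum(n[i] * w[i][j] / W[i] for i in range(s))
               for j in range(h)]
        total = sum(raw)
        p = [v / total for v in raw]           # renormalize so the masses sum to 1
        W_new = [sum(w[i][j] * p[j] for j in range(h)) for i in range(s)]
        converged = max(abs(a - b) for a, b in zip(W_new, W)) < tol
        W = W_new
        if converged:
            break
    return points, p, W


# Single length-biased sample: w(x) = x, so the masses are proportional to r_j / t_j.
points, p, W = biased_sampling_npmle([[1.0, 2.0, 4.0]], [lambda x: x])
```

For the single length-biased sample above, the estimated masses at 1, 2 and 4 are proportional to 1, 1/2 and 1/4, that is 4/7, 2/7 and 1/7. With constant weight functions the estimate reduces to the ordinary empirical distribution.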

Citation

Y. Vardi. "Empirical Distributions in Selection Bias Models." Ann. Statist. 13 (1) 178-203, March, 1985. https://doi.org/10.1214/aos/1176346585

Information

Published: March, 1985
First available in Project Euclid: 12 April 2007

zbMATH: 0578.62047
MathSciNet: MR773161
Digital Object Identifier: 10.1214/aos/1176346585

Subjects:
Primary: 62G05
Secondary: 62E99, 62M99, 62P10

Keywords: Nonparametric maximum likelihood, sample selection bias, weighted distributions

Rights: Copyright © 1985 Institute of Mathematical Statistics
