Abstract
Vardi (1985a) introduced an $s$-sample model for biased sampling, gave conditions which guarantee the existence and uniqueness of the nonparametric maximum likelihood estimator $\mathbb{G}_n$ of the common underlying distribution $G$ and discussed numerical methods for calculating the estimator. Here we examine the large sample behavior of the NPMLE $\mathbb{G}_n$, including results on uniform consistency of $\mathbb{G}_n$, convergence of $\sqrt n (\mathbb{G}_n - G)$ to a Gaussian process and asymptotic efficiency of $\mathbb{G}_n$ as an estimator of $G$. The proofs are based upon recent results for empirical processes indexed by sets and functions and convexity arguments. We also give a careful proof of identifiability of the underlying distribution $G$ under connectedness of a certain graph $\mathbf{G}$. Examples and applications include length-biased sampling, stratified sampling, "enriched" stratified sampling, "choice-based" sampling in econometrics and "case-control" studies in biostatistics. A final section discusses design issues and further problems.
Citation
Richard D. Gill. Yehuda Vardi. Jon A. Wellner. "Large Sample Theory of Empirical Distributions in Biased Sampling Models." Ann. Statist. 16 (3) 1069 - 1112, September, 1988. https://doi.org/10.1214/aos/1176350948
Information