August 2003
Large sample theory for semiparametric regression models with two-phase, outcome dependent sampling
Norman Breslow, Brad McNeney, Jon A. Wellner
Ann. Statist. 31(4): 1110-1139 (August 2003). DOI: 10.1214/aos/1059655907


Outcome-dependent, two-phase sampling designs can dramatically reduce the costs of observational studies by judicious selection of the most informative subjects for purposes of detailed covariate measurement. Here we derive asymptotic information bounds and the form of the efficient score and influence functions for the semiparametric regression models studied by Lawless, Kalbfleisch and Wild (1999) under two-phase sampling designs. We show that the maximum likelihood estimators for both the parametric and nonparametric parts of the model are asymptotically normal and efficient. The efficient influence function for the parametric part agrees with the more general information bound calculations of Robins, Hsieh and Newey (1995). By verifying the conditions of Murphy and van der Vaart (2000) for a least favorable parametric submodel, we provide asymptotic justification for statistical inference based on profile likelihood.


Norman Breslow, Brad McNeney, Jon A. Wellner. "Large sample theory for semiparametric regression models with two-phase, outcome dependent sampling." Ann. Statist. 31 (4) 1110-1139, August 2003.


Published: August 2003
First available in Project Euclid: 31 July 2003

zbMATH: 1105.62335
MathSciNet: MR2001644
Digital Object Identifier: 10.1214/aos/1059655907

Primary: 60F05, 60F17
Secondary: 60J65, 60J70

Keywords: $Z$-theorem, Asymptotic distributions, Asymptotic efficiency, consistency, covariates, Empirical processes, information bounds, least favorable, maximum likelihood, missing data, outcome dependent, profile likelihood, stratified sampling, two-phase

Rights: Copyright © 2003 Institute of Mathematical Statistics

