The Annals of Statistics

Asymptotic theory for the semiparametric accelerated failure time model with missing data

Bin Nan, John D. Kalbfleisch, and Menggang Yu
Source: Ann. Statist. Volume 37, Number 5A (2009), 2351-2376.

Abstract

We consider a class of doubly weighted rank-based estimating methods for the transformation (or accelerated failure time) model with missing data as arise, for example, in case-cohort studies. The weights considered may not be predictable as required in a martingale stochastic process formulation. We treat the general problem as a semiparametric estimating equation problem and provide proofs of asymptotic properties for the weighted estimators, with either true weights or estimated weights, by using empirical process theory where martingale theory may fail. Simulations show that the outcome-dependent weighted method works well for finite samples in case-cohort studies and improves efficiency compared to methods based on predictable weights. Further, it is seen that the method is even more efficient when estimated weights are used, as is commonly the case in the missing data literature. The Gehan censored data Wilcoxon weights are found to be surprisingly efficient in a wide class of problems.
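
As a rough illustration of the kind of rank-based estimating function the abstract refers to, the sketch below writes out a Gehan-type estimating function with a weight attached to each subject in both sums. It is a sketch under stated assumptions, not necessarily the doubly weighted estimator analysed in the paper. The function names, the array layout, and the weights `w` (thought of here as inverse selection probabilities, with 0 for subjects whose covariates are missing) are all hypothetical.

```python
import numpy as np


def weighted_gehan_loss(beta, log_time, delta, x, w):
    """Convex loss: sum_i sum_j w_i w_j delta_i * max(e_j - e_i, 0).

    e_i = log_time_i - x_i' beta are the residuals of the accelerated
    failure time model; delta_i is 1 for an observed failure, 0 if censored.
    The weights w are an assumption of this sketch (e.g. inverse selection
    probabilities), not something specified in the abstract.
    """
    e = log_time - x @ beta
    diff = e[None, :] - e[:, None]              # diff[i, j] = e_j - e_i
    pair_w = (w * delta)[:, None] * w[None, :]  # w_i * delta_i * w_j
    return float(np.sum(pair_w * np.maximum(diff, 0.0)))


def weighted_gehan_score(beta, log_time, delta, x, w):
    """U(beta) = sum_i sum_j w_i w_j delta_i (x_i - x_j) 1{e_j >= e_i}.

    Away from ties in the residuals this is the gradient of
    weighted_gehan_loss, so a root of U corresponds to a minimiser of the loss.
    """
    e = log_time - x @ beta
    at_risk = (e[None, :] >= e[:, None]).astype(float)  # [i, j] = 1{e_j >= e_i}
    pair_w = (w * delta)[:, None] * w[None, :] * at_risk
    # sum_i sum_j pair_w[i, j] * (x_i - x_j), split into the two sums
    return pair_w.sum(axis=1) @ x - pair_w.sum(axis=0) @ x
```

With all weights equal to 1 this reduces to the usual unweighted Gehan estimating function. Because the loss is a weighted sum of maxima of affine functions of `beta`, it is convex, so in practice one would minimise it with a general-purpose optimiser or a linear programming formulation rather than solve the discontinuous score equation directly.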

Primary Subjects: 62E20, 62N01
Secondary Subjects: 62D05
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1247663758
Digital Object Identifier: doi:10.1214/08-AOS657
Zentralblatt MATH identifier: 05596904
Mathematical Reviews number (MathSciNet): MR2543695

2013 © Institute of Mathematical Statistics
