The Annals of Statistics

Penalized log-likelihood estimation for partly linear transformation models with current status data

Shuangge Ma and Michael R. Kosorok


Abstract

We consider partly linear transformation models applied to current status data. The unknown quantities are the transformation function, a linear regression parameter and a nonparametric regression effect. It is shown that the penalized MLE for the regression parameter is asymptotically normal and efficient and converges at the parametric rate, although the penalized MLE for the transformation function and nonparametric regression effect are only $n^{1/3}$-consistent. Inference for the regression parameter based on a block jackknife is investigated. We also study computational issues and demonstrate the proposed methodology with a simulation study. The transformation models and partly linear regression terms, coupled with new estimation and inference techniques, provide flexible alternatives to the Cox model for current status data analysis.
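
The abstract mentions block jackknife inference for the regression parameter. As a rough illustration only, the sketch below implements the textbook delete-a-block jackknife variance estimate for a generic scalar estimator. It is not taken from the paper and may differ from the authors' procedure. The names `block_jackknife_variance`, `estimator`, `n_blocks` and `seed` are hypothetical. In the setting of the abstract, `estimator` would be a routine that refits the penalized MLE and returns one regression coefficient; that routine is not implemented here.

```python
import random
import statistics
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def block_jackknife_variance(
    data: Sequence[T],
    estimator: Callable[[List[T]], float],
    n_blocks: int,
    seed: int = 0,
) -> float:
    """Delete-a-block jackknife variance estimate for a scalar estimator.

    Assumption: this is the standard textbook formula, used here only as
    an illustration; it is not claimed to match the paper's variant.
    """
    if not 2 <= n_blocks <= len(data):
        raise ValueError("need 2 <= n_blocks <= len(data)")

    # Shuffle a copy so block membership is random, then drop the
    # remainder so that all blocks have the same size.
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    block_size = len(shuffled) // n_blocks
    kept = shuffled[: block_size * n_blocks]

    # Re-estimate with each block left out in turn.
    leave_one_out = []
    for j in range(n_blocks):
        start, stop = j * block_size, (j + 1) * block_size
        leave_one_out.append(estimator(kept[:start] + kept[stop:]))

    # Scaled sum of squared deviations of the leave-one-block-out estimates.
    centre = statistics.fmean(leave_one_out)
    scale = (n_blocks - 1) / n_blocks
    return scale * sum((est - centre) ** 2 for est in leave_one_out)
```

As a sanity check on the formula, when the estimator is the sample mean and every block holds a single observation, the result reduces to the usual sample variance divided by the sample size.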

Article information

Source
Ann. Statist., Volume 33, Number 5 (2005), 2256-2290.

Dates
First available in Project Euclid: 25 November 2005

Permanent link to this document
https://projecteuclid.org/euclid.aos/1132936563

Digital Object Identifier
doi:10.1214/009053605000000444

Mathematical Reviews number (MathSciNet)
MR2211086

Zentralblatt MATH identifier
1086.62056

Subjects
Primary: 62G08: Nonparametric regression; 60F05: Central limit and other weak theorems
Secondary: 62G20: Asymptotic properties; 62B10: Information-theoretic topics [See also 94A17]

Keywords
Current status data; empirical processes; nonparametric regression; semiparametric efficiency; splines; transformation models

Citation

Ma, Shuangge; Kosorok, Michael R. Penalized log-likelihood estimation for partly linear transformation models with current status data. Ann. Statist. 33 (2005), no. 5, 2256--2290. doi:10.1214/009053605000000444. https://projecteuclid.org/euclid.aos/1132936563


