Annales de l'Institut Henri Poincaré, Probabilités et Statistiques

Oracle inequalities for the Lasso in the high-dimensional Aalen multiplicative intensity model

Sarah Lemler

Full-text: Open access

Abstract

In a general counting process setting, we consider the problem of obtaining a prognosis for the survival time adjusted for high-dimensional covariates. To this end, we construct an estimator of the whole conditional intensity. We estimate it by the best Cox proportional hazards model given two dictionaries of functions. The first dictionary is used to construct an approximation of the logarithm of the baseline hazard function and the second to approximate the relative risk. We introduce a new data-driven weighted Lasso procedure to estimate the unknown parameters of the best Cox model approximating the intensity. We provide non-asymptotic oracle inequalities for our procedure in terms of an appropriate empirical Kullback divergence. Our results rely on an empirical Bernstein inequality for martingales with jumps and on properties of modified self-concordant functions.
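
As a reading aid, the two-dictionary construction and the weighted Lasso criterion described in the abstract can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the abstract: $\theta_1,\dots,\theta_N$ denotes the dictionary of functions of time, $f_1,\dots,f_M$ the dictionary of functions of the covariates, $(N_i, Y_i, Z_i)$ the counting process, at-risk process and covariates of individual $i$ observed on $[0,\tau]$, and $\omega_j$, $\delta_k$ the data-driven weights, whose exact form is given in the paper and is not reproduced here.

$$
\lambda_{\beta,\gamma}(t,z) = \exp\Big(\sum_{k=1}^{N}\gamma_k\,\theta_k(t)\Big)\,\exp\Big(\sum_{j=1}^{M}\beta_j\,f_j(z)\Big),
$$

$$
C_n(\lambda) = -\frac{1}{n}\sum_{i=1}^{n}\Big\{\int_0^{\tau}\log\lambda(t,Z_i)\,dN_i(t) - \int_0^{\tau}\lambda(t,Z_i)\,Y_i(t)\,dt\Big\},
$$

$$
(\hat\beta,\hat\gamma) \in \arg\min_{\beta,\gamma}\Big\{C_n(\lambda_{\beta,\gamma}) + \sum_{j=1}^{M}\omega_j\,|\beta_j| + \sum_{k=1}^{N}\delta_k\,|\gamma_k|\Big\}.
$$

Under these assumptions, the first factor of $\lambda_{\beta,\gamma}$ approximates the baseline hazard through its logarithm, the second approximates the relative risk, and $C_n$ is the empirical negative log-likelihood of the counting process model, penalized by a weighted $\ell_1$ norm on each coefficient vector.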

Résumé (translated from the French)

In the general setting of a counting process, we are interested in how to obtain a prognosis for the survival time as a function of high-dimensional covariates. To this end, we construct an estimator of the conditional intensity. We estimate it by the best Cox model given two dictionaries of functions. The first dictionary is used to construct the logarithm of the baseline hazard and the second to approximate the relative risk. We introduce a new weighted Lasso procedure, with data-driven weights, to estimate the unknown parameters of the best Cox model approximating the intensity. We establish a non-asymptotic oracle inequality in empirical Kullback divergence, which is the loss function best suited to our procedure. Our results rely on a Bernstein inequality for martingales with jumps and on properties of self-concordant functions.

Article information

Source
Ann. Inst. H. Poincaré Probab. Statist., Volume 52, Number 2 (2016), 981-1008.

Dates
Received: 12 October 2013
Revised: 11 June 2014
Accepted: 30 October 2014
First available in Project Euclid: 4 May 2016

Permanent link to this document
https://projecteuclid.org/euclid.aihp/1462367902

Digital Object Identifier
doi:10.1214/14-AIHP662

Mathematical Reviews number (MathSciNet)
MR3498018

Zentralblatt MATH identifier
1342.62158

Subjects
Primary:
62N02: Estimation
62G05: Estimation
62G08: Nonparametric regression
60E15: Inequalities; stochastic orderings

Keywords
Survival analysis; Right-censored data; Intensity; Cox proportional hazards model; Semiparametric model; Non-parametric model; High-dimensional covariates; Lasso; Non-asymptotic oracle inequalities; Empirical Bernstein’s inequality

Citation

Lemler, Sarah. Oracle inequalities for the Lasso in the high-dimensional Aalen multiplicative intensity model. Ann. Inst. H. Poincaré Probab. Statist. 52 (2016), no. 2, 981--1008. doi:10.1214/14-AIHP662. https://projecteuclid.org/euclid.aihp/1462367902


