Electronic Journal of Statistics

High-dimensional additive hazards models and the Lasso

Stéphane Gaïffas and Agathe Guilloux

Full-text: Open access

Abstract

We consider a general high-dimensional additive hazards model in a non-asymptotic setting, including regression for censored data. In this context, we consider a Lasso estimator with a fully data-driven ℓ1 penalization, which is tuned for the estimation problem at hand. We prove sharp oracle inequalities for this estimator. Our analysis involves a new “data-driven” Bernstein’s inequality, which is of independent interest, where the predictable variation is replaced by the optional variation.
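
For orientation, the following is a minimal sketch of what a weighted ℓ1-penalized estimator can look like in the simplest additive hazards setting (the linear, Lin–Ying-type model). It is an illustration under stated assumptions, not the authors' exact formulation: the symbols N_i, Y_i, Z_i, \bar Z, V, d, τ and the weights w_j are notation introduced here, and the paper's model is more general. The abstract indicates only that the penalty weights are chosen from the data; their precise form is given in the paper and is not reproduced here.

```latex
% Assumed notation (not taken from the abstract):
%   N_i(t): counting process of subject i,  Y_i(t): at-risk indicator,
%   Z_i \in \mathbb{R}^p: covariates,  \tau: end of the observation window.
% Linear additive hazards model:
\lambda(t \mid Z_i) = \lambda_0(t) + \beta_0^\top Z_i .

% Risk-set average of the covariates:
\bar Z(t) = \frac{\sum_{i=1}^n Y_i(t)\, Z_i}{\sum_{i=1}^n Y_i(t)} .

% Quadratic (least-squares type) empirical loss built from
V = \frac{1}{n} \sum_{i=1}^n \int_0^\tau Y_i(t)\,
      \bigl(Z_i - \bar Z(t)\bigr)\bigl(Z_i - \bar Z(t)\bigr)^\top \, dt ,
\qquad
d = \frac{1}{n} \sum_{i=1}^n \int_0^\tau \bigl(Z_i - \bar Z(t)\bigr)\, dN_i(t) .

% Weighted Lasso: w_j \ge 0 are placeholder data-driven weights.
\hat\beta \in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p}
  \Bigl\{ \tfrac12\, \beta^\top V \beta - d^\top \beta
          + \sum_{j=1}^p w_j \, |\beta_j| \Bigr\} .
```

With all weights set to zero the minimizer solves the linear system V β = d, which is the usual unpenalized estimating equation for this model; the ℓ1 term shrinks coefficients toward zero and sets some of them exactly to zero when p is large relative to n.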

Article information

Source
Electron. J. Statist., Volume 6 (2012), 522-546.

Dates
First available in Project Euclid: 30 March 2012

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1333113101

Digital Object Identifier
doi:10.1214/12-EJS681

Mathematical Reviews number (MathSciNet)
MR2988418

Zentralblatt MATH identifier
1274.62655

Subjects
Primary: 62N02: Estimation
Secondary: 62H12: Estimation

Keywords
Survival analysis; counting processes; censored data; Aalen additive model; Lasso; high-dimensional covariates; data-driven Bernstein’s inequality

Citation

Gaïffas, Stéphane; Guilloux, Agathe. High-dimensional additive hazards models and the Lasso. Electron. J. Statist. 6 (2012), 522--546. doi:10.1214/12-EJS681. https://projecteuclid.org/euclid.ejs/1333113101


