Electronic Journal of Statistics
- Electron. J. Statist.
- Volume 6 (2012), 522-546.
High-dimensional additive hazards models and the Lasso
Stéphane Gaïffas and Agathe Guilloux
Full-text: Open access
Abstract
We consider a general high-dimensional additive hazards model in a non-asymptotic setting, including regression for censored data. In this context, we study a Lasso estimator with a fully data-driven ℓ1 penalization, which is tuned for the estimation problem at hand. We prove sharp oracle inequalities for this estimator. Our analysis involves a new “data-driven” Bernstein’s inequality, which is of independent interest, where the predictable variation is replaced by the optional variation.
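As a rough illustration of what a coordinate-wise weighted ℓ1 penalty does computationally, the sketch below minimizes a generic quadratic loss plus a weighted ℓ1 penalty by coordinate descent. It is not the authors' estimator: the matrix `G`, the vector `b`, and the `weights` are hypothetical placeholders, whereas in the paper the loss and the data-driven weights are built from the observed counting-process data and are not specified in this abstract.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrinks z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso_quadratic(G, b, weights, n_sweeps=500):
    """Coordinate descent for the generic problem

        minimize  0.5 * beta' G beta - b' beta + sum_j weights[j] * |beta_j|

    Assumes G is symmetric positive semidefinite with a positive diagonal.
    G, b and weights are illustrative inputs, not quantities from the paper.
    """
    p = len(b)
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Linear term seen by coordinate j once the others are held fixed.
            r = b[j] - sum(G[j][k] * beta[k] for k in range(p) if k != j)
            # Exact one-dimensional minimizer: threshold, then rescale.
            beta[j] = soft_threshold(r, weights[j]) / G[j][j]
    return beta


# Hypothetical toy inputs: three covariates, equal penalty weights.
G = [[2.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
b = [2.0, 0.1, 0.3]
beta_hat = weighted_lasso_quadratic(G, b, weights=[0.5, 0.5, 0.5])
```

With these toy inputs only the first coefficient stays nonzero, which shows how a larger weight on a coordinate makes it more likely to be set exactly to zero. Setting all weights to zero reduces the routine to solving the unpenalized linear system.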
Article information
Source
Electron. J. Statist. Volume 6 (2012), 522-546.
Dates
First available in Project Euclid: 30 March 2012
Permanent link to this document
http://projecteuclid.org/euclid.ejs/1333113101
Digital Object Identifier
doi:10.1214/12-EJS681
Mathematical Reviews number (MathSciNet)
MR2988418
Zentralblatt MATH identifier
1274.62655
Subjects
Primary: 62N02: Estimation
Secondary: 62H12: Estimation
Keywords
Survival analysis; counting processes; censored data; Aalen additive model; Lasso; high-dimensional covariates; data-driven Bernstein’s inequality
Citation
Gaïffas, Stéphane; Guilloux, Agathe. High-dimensional additive hazards models and the Lasso. Electron. J. Statist. 6 (2012), 522--546. doi:10.1214/12-EJS681. http://projecteuclid.org/euclid.ejs/1333113101.
The Institute of Mathematical Statistics and the Bernoulli Society

