The Annals of Applied Statistics

Network exploration via the adaptive LASSO and SCAD penalties

Jianqing Fan, Yang Feng, and Yichao Wu

Source: Ann. Appl. Stat. Volume 3, Number 2 (2009), 521-541.

Abstract

Graphical models are frequently used to explore networks, such as genetic networks, among a set of variables. This is usually carried out via exploring the sparsity of the precision matrix of the variables under consideration. Penalized likelihood methods are often used in such explorations. Yet, positive-definiteness constraints of precision matrices make the optimization problem challenging. We introduce nonconcave penalties and the adaptive LASSO penalty to attenuate the bias problem in the network estimation. Through the local linear approximation to the nonconcave penalty functions, the problem of precision matrix estimation is recast as a sequence of penalized likelihood problems with a weighted L1 penalty and solved using the efficient algorithm of Friedman et al. [Biostatistics 9 (2008) 432–441]. Our estimation schemes are applied to two real datasets. Simulation experiments and asymptotic theory are used to justify our proposed methods.
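
The abstract describes replacing the nonconcave penalty with a local linear approximation, so that each step is a weighted L1 problem. The sketch below shows only how such weights could be computed; it is an illustration under stated assumptions, not the authors' implementation. It assumes the standard SCAD derivative of Fan and Li (2001) with the conventional a = 3.7, and adaptive LASSO weights of the form 1/|ω|^γ. The function names, the default γ = 0.5, and the small eps guard are hypothetical choices made for this example.

```python
import numpy as np

def scad_derivative(theta, lam, a=3.7):
    """SCAD penalty derivative evaluated at |theta| (Fan and Li, 2001).

    Equals lam for |theta| <= lam, decays linearly to 0 on (lam, a*lam],
    and is 0 beyond a*lam.
    """
    t = np.abs(np.asarray(theta, dtype=float))
    flat_part = (t <= lam).astype(float)
    tapered_part = np.maximum(a * lam - t, 0.0) / ((a - 1.0) * lam) * (t > lam)
    return lam * (flat_part + tapered_part)

def lla_weights(omega_current, lam, a=3.7):
    """Entrywise weights for the next weighted-L1 problem in one LLA step.

    omega_current is the current precision-matrix estimate (assumed input).
    """
    return scad_derivative(omega_current, lam, a)

def adaptive_lasso_weights(omega_init, gamma=0.5, eps=1e-10):
    """Adaptive LASSO weights 1/|omega|^gamma from an initial estimate.

    gamma and eps are illustrative assumptions; eps avoids division by zero.
    """
    return 1.0 / (np.abs(np.asarray(omega_init, dtype=float)) + eps) ** gamma

# Example: large entries get little or no penalty, small entries get lam.
omega = np.array([[2.0, 0.1, 5.0],
                  [0.1, 1.5, 0.0],
                  [5.0, 0.0, 3.0]])
weights = lla_weights(omega, lam=1.0)
```

The resulting weight matrix would then be handed to a graphical lasso solver that accepts an entrywise penalty, and the step repeated with the updated estimate. Large entries receive a small or zero weight, which is how the bias of a uniform L1 penalty is reduced.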

Keywords: Adaptive LASSO; covariance selection; Gaussian concentration graphical model; genetic network; LASSO; precision matrix; SCAD

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aoas/1245676184
Digital Object Identifier: doi:10.1214/08-AOAS215
Zentralblatt MATH identifier: 1166.62040

References

Baldi, P., Brunak, S., Chauvin, Y., Andersen, C. A. F. and Nielsen, H. (2000). Assessing the accuracy of prediction algorithms for classification: An overview. Bioinformatics 16 412–424.
Mathematical Reviews (MathSciNet): MR1849633
Zentralblatt MATH: 0992.92024
Breiman, L. (1996). Heuristics of instability and stabilization in model selection. Ann. Statist. 24 2350–2383.
Mathematical Reviews (MathSciNet): MR1425957
Zentralblatt MATH: 0867.62055
Digital Object Identifier: doi:10.1214/aos/1032181158
Project Euclid: euclid.aos/1032181158
d’Aspremont, A., Banerjee, O. and Ghaoui, L. E. (2008). First-order methods for sparse covariance selection. SIAM J. Matrix Anal. Appl. 30 56–66.
Mathematical Reviews (MathSciNet): MR2399568
Zentralblatt MATH: 1156.90423
Digital Object Identifier: doi:10.1137/060670985
Dempster, A. P. (1972). Covariance selection. Biometrics 28 157–175.
Dobra, A., Hans, C., Jones, B., Nevins, J. R., Yao, G. and West, M. (2004). Sparse graphical models for exploring gene expression data. J. Multivariate Anal. 90 196–212.
Mathematical Reviews (MathSciNet): MR2064941
Zentralblatt MATH: 1047.62104
Digital Object Identifier: doi:10.1016/j.jmva.2004.02.009
Drton, M. and Perlman, M. (2004). Model selection for Gaussian concentration graphs. Biometrika 91 591–602.
Mathematical Reviews (MathSciNet): MR2090624
Zentralblatt MATH: 1108.62098
Digital Object Identifier: doi:10.1093/biomet/91.3.591
Edwards, D. M. (2000). Introduction to Graphical Modelling. Springer, New York.
Mathematical Reviews (MathSciNet): MR1880319
Efron, B., Hastie, T., Johnstone, I. and Tibshirani, R. (2004). Least angle regression (with discussions). Ann. Statist. 32 409–499.
Mathematical Reviews (MathSciNet): MR2060166
Zentralblatt MATH: 1091.62054
Digital Object Identifier: doi:10.1214/009053604000000067
Project Euclid: euclid.aos/1083178935
Fan, J. (1997). Comment on “Wavelets in statistics: A review,” by A. Antoniadis. J. Italian Statist. Soc. 6 131–138.
Fan, J. and Fan, Y. (2008). High-dimensional classification using features annealed independence rules. Ann. Statist. 36 2605–2637.
Mathematical Reviews (MathSciNet): MR2485009
Zentralblatt MATH: 05503372
Digital Object Identifier: doi:10.1214/07-AOS504
Project Euclid: euclid.aos/1231165181
Fan, J., Feng, Y. and Wu, Y. (2008). Supplement to “Network exploration via the adaptive LASSO and SCAD penalties.” DOI: 10.1214/08-AOAS215SUPP.
Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348–1360.
Mathematical Reviews (MathSciNet): MR1946581
Zentralblatt MATH: 1073.62547
Digital Object Identifier: doi:10.1198/016214501753382273
Fan, J. and Peng, H. (2004). Nonconcave penalized likelihood with a diverging number of parameters. Ann. Statist. 32 928–961.
Mathematical Reviews (MathSciNet): MR2065194
Zentralblatt MATH: 1092.62031
Digital Object Identifier: doi:10.1214/009053604000000256
Project Euclid: euclid.aos/1085408491
Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9 432–441.
Hess, R. K., Anderson, K., Symmans, W. F., Valero, V., Ibrahim, N., Mejia, J. A., Booser, D., Theriault, R. L., Buzdar, A. U., Dempsey, P. J., Rouzier, R., Sneige, N., Ross, J. S., Vidaurre, T., Gómez, H. L., Hortobagyi, G. N. and Pusztai, L. (2006). Pharmacogenomic predictor of sensitivity to preoperative chemotherapy with paclitaxel and fluorouracil, doxorubicin, and cyclophosphamide in breast cancer. Journal of Clinical Oncology 24 4236–4244.
Huang, J., Liu, N., Pourahmadi, M. and Liu, L. (2006). Covariance matrix selection and estimation via penalised normal likelihood. Biometrika 93 85–98.
Mathematical Reviews (MathSciNet): MR2277742
Zentralblatt MATH: 1152.62346
Digital Object Identifier: doi:10.1093/biomet/93.1.85
Hunter, D. R. and Li, R. (2005). Variable selection using MM algorithms. Ann. Statist. 33 1617–1642.
Mathematical Reviews (MathSciNet): MR2166557
Zentralblatt MATH: 1078.62028
Digital Object Identifier: doi:10.1214/009053605000000200
Project Euclid: euclid.aos/1123250224
Kuerer, H. M., Newman, L. A., Smith, T. L. et al. (1999). Clinical course of breast cancer patients with complete pathologic primary tumor and axillary lymph node response to doxorubicin-based neoadjuvant chemotherapy. J. Clin. Oncol. 17 460–469.
Lam, C. and Fan, J. (2008). Sparsistency and rates of convergence in large covariance matrices estimation. Manuscript.
Levina, E., Zhu, J. and Rothman, A. J. (2008). Sparse estimation of large covariance matrices via a nested LASSO penalty. Ann. Appl. Statist. 2 245–263.
Li, H. and Gui, J. (2006). Gradient directed regularization for sparse Gaussian concentration graphs, with applications to inference of genetic networks. Biostatistics 7 302–317.
Lin, S. P. and Perlman, M. D. (1985). A Monte Carlo comparison of four estimators of a covariance matrix. Multivariate Anal. 6 411–429.
Mathematical Reviews (MathSciNet): MR822310
Zentralblatt MATH: 0593.62051
Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979). Multivariate Analysis. Academic Press, New York.
Mathematical Reviews (MathSciNet): MR560319
Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs with the lasso. Ann. Statist. 34 1436–1462.
Rothman, A. J., Bickel, P. J., Levina, E. and Zhu, J. (2008). Sparse permutation invariant covariance estimation. Electron. J. Statist. 2 494–515.
Mathematical Reviews (MathSciNet): MR2417391
Digital Object Identifier: doi:10.1214/08-EJS176
Project Euclid: euclid.ejs/1214491853
Schäfer, J. and Strimmer, K. (2005). An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics 21 754–764.
Shen, H. and Huang, J. (2005). Analysis of call centre arrival data using singular value decomposition. Appl. Stoch. Models Bus. Ind. 21 251–263.
Mathematical Reviews (MathSciNet): MR2159632
Zentralblatt MATH: 1089.62155
Digital Object Identifier: doi:10.1002/asmb.598
Tibshirani, R. J. (1996). Regression shrinkage and selection via the lasso. J. Roy. Statist. Soc. Ser. B 58 267–288.
Mathematical Reviews (MathSciNet): MR1379242
Vandenberghe, L., Boyd, S. and Wu, S.-P. (1998). Determinant maximization with linear matrix inequality constraints. SIAM J. Matrix Anal. Appl. 19 499–533.
Mathematical Reviews (MathSciNet): MR1614078
Zentralblatt MATH: 0959.90039
Digital Object Identifier: doi:10.1137/S0895479896303430
Wong, F., Carter, C. K. and Kohn, R. (2003). Efficient estimation of covariance selection models. Biometrika 90 809–830.
Mathematical Reviews (MathSciNet): MR2024759
Digital Object Identifier: doi:10.1093/biomet/90.4.809
Yuan, M. and Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model. Biometrika 94 19–35.
Mathematical Reviews (MathSciNet): MR2367824
Zentralblatt MATH: 1142.62408
Digital Object Identifier: doi:10.1093/biomet/asm018
Zou, H. (2006). The adaptive lasso and its oracle properties. J. Amer. Statist. Assoc. 101 1418–1429.
Mathematical Reviews (MathSciNet): MR2279469
Zentralblatt MATH: 05145624
Digital Object Identifier: doi:10.1198/016214506000000735
Zou, H. and Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models (with discussion). Ann. Statist. 36 1509–1566.
Mathematical Reviews (MathSciNet): MR2435443
Digital Object Identifier: doi:10.1214/009053607000000802
Project Euclid: euclid.aos/1216237287

2010 © Institute of Mathematical Statistics