High-dimensional graphs and variable selection with the Lasso



The Annals of Statistics

Nicolai Meinshausen and Peter Bühlmann

Source: Ann. Statist. Volume 34, Number 3 (2006), 1436-1462.

Abstract

The pattern of zero entries in the inverse covariance matrix of a multivariate normal distribution corresponds to conditional independence restrictions between variables. Covariance selection aims at estimating those structural zeros from data. We show that neighborhood selection with the Lasso is a computationally attractive alternative to standard covariance selection for sparse high-dimensional graphs. Neighborhood selection estimates the conditional independence restrictions separately for each node in the graph and is hence equivalent to variable selection for Gaussian linear models. We show that the proposed neighborhood selection scheme is consistent for sparse high-dimensional graphs. Consistency hinges on the choice of the penalty parameter. The oracle value for optimal prediction does not lead to a consistent neighborhood estimate. If the penalty is instead chosen to control the probability of falsely joining some distinct connectivity components of the graph, consistent estimation for sparse graphs is achieved (with exponential rates), even when the number of variables grows as the number of observations raised to an arbitrary power.
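
To make the idea of node-wise estimation concrete, the following is a minimal sketch of neighborhood selection, not the authors' implementation. Each variable is regressed on all the others with an L1 penalty, and the variables with nonzero coefficients are taken as its estimated neighbors. Everything named here is an assumption made for illustration: the helper names, the coordinate-descent solver, the penalty scaling (1/2n)||x_a - Xb||^2 + lam*||b||_1, the hand-picked value of `lam`, and the simulated chain-graph data. The `rule` argument is also an assumption of this sketch, since the abstract does not say how the per-node estimates are combined into one graph; "and" keeps an edge only when both endpoints select each other, "or" when at least one does.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding: the closed-form solution of a one-dimensional Lasso."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def empirical_covariance(rows):
    """Empirical covariance matrix of the columns of `rows` (list of lists)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[j] - means[j]) * (r[k] - means[k]) for r in rows) / n
             for k in range(p)] for j in range(p)]


def lasso_neighborhood(cov, a, lam, sweeps=200):
    """Lasso regression of variable `a` on all other variables.

    Runs coordinate descent on the covariance form of the objective
    (1/2n)||x_a - X b||^2 + lam * ||b||_1 and returns the set of
    indices whose coefficient is nonzero.
    """
    others = [j for j in range(len(cov)) if j != a]
    beta = {j: 0.0 for j in others}
    for _ in range(sweeps):
        for j in others:
            # Covariance of x_j with the partial residual that excludes x_j.
            partial = cov[a][j] - sum(cov[j][k] * beta[k]
                                      for k in others if k != j)
            beta[j] = soft_threshold(partial, lam) / cov[j][j]
    return {j for j in others if beta[j] != 0.0}


def neighborhood_selection(rows, lam, rule="and"):
    """Estimate an undirected edge set from node-wise Lasso neighborhoods."""
    cov = empirical_covariance(rows)
    p = len(cov)
    nbrs = [lasso_neighborhood(cov, a, lam) for a in range(p)]
    edges = set()
    for a in range(p):
        for b in range(a + 1, p):
            a_picks_b, b_picks_a = b in nbrs[a], a in nbrs[b]
            if rule == "and":
                keep = a_picks_b and b_picks_a
            else:
                keep = a_picks_b or b_picks_a
            if keep:
                edges.add((a, b))
    return edges


def simulate_chain(n, p, rho, seed=0):
    """Draw n samples from a Gaussian chain X_0 - X_1 - ... - X_{p-1}.

    Hypothetical test data: each variable depends only on its predecessor,
    so the true conditional independence graph is a path.
    """
    rng = random.Random(seed)
    scale = (1 - rho ** 2) ** 0.5
    rows = []
    for _ in range(n):
        x = [rng.gauss(0, 1)]
        for _ in range(p - 1):
            x.append(rho * x[-1] + scale * rng.gauss(0, 1))
        rows.append(x)
    return rows


rows = simulate_chain(n=3000, p=5, rho=0.4)
edges = neighborhood_selection(rows, lam=0.15)
print(sorted(edges))
```

In this sketch the penalty `lam` is fixed by hand for a small, well-separated example. The abstract's point is that the choice matters: a value tuned for prediction accuracy tends to be too small for graph recovery, so the penalty should not be picked by prediction error alone.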

Primary Subjects: 62J07
Secondary Subjects: 62H20, 62F12
Keywords: Linear regression; covariance selection; Gaussian graphical models; penalized regression

Full-text: Open access

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1152540754
Digital Object Identifier: doi:10.1214/009053606000000281
Zentralblatt MATH identifier: 1113.62082


2009 © Institute of Mathematical Statistics