Open Access
Confidence intervals for high-dimensional inverse covariance estimation
Jana Janková, Sara van de Geer
Electron. J. Statist. 9(1): 1205-1229 (2015). DOI: 10.1214/15-EJS1031
Abstract

We propose methodology for statistical inference for low-dimensional parameters of sparse precision matrices in a high-dimensional setting. Our method leads to a non-sparse estimator of the precision matrix whose entries have a Gaussian limiting distribution. Asymptotic properties of the novel estimator are analyzed for the case of sub-Gaussian observations under a sparsity assumption on the entries of the true precision matrix and regularity conditions. Thresholding the de-sparsified estimator gives guarantees for edge selection in the associated graphical model. Performance of the proposed method is illustrated in a simulation study.
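The construction described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes the de-sparsified estimator takes the form T = 2*Theta0 - Theta0 @ Sigma_hat @ Theta0 for an initial sparse precision estimate Theta0 (e.g. the graphical lasso) and sample covariance Sigma_hat, and that the asymptotic variance of sqrt(n)*(T_ij - Theta_ij) can be approximated by Theta_ii*Theta_jj + Theta_ij**2. Function names and the toy setup are illustrative.

```python
import numpy as np

def desparsify(Theta0, Sigma_hat):
    """De-sparsified precision estimate T = 2*Theta0 - Theta0 @ Sigma_hat @ Theta0.

    Theta0 is an initial (typically sparse, penalized) precision estimate and
    Sigma_hat the sample covariance; T is non-sparse but its entries are
    asymptotically Gaussian, which is what enables confidence intervals.
    """
    return 2.0 * Theta0 - Theta0 @ Sigma_hat @ Theta0

def confidence_interval(T, Theta0, n, i, j, z=1.96):
    """Approximate 95% confidence interval for the (i, j) precision entry.

    Plugs Theta0 into the assumed asymptotic variance
    Theta_ii * Theta_jj + Theta_ij**2 of sqrt(n) * (T_ij - Theta_ij).
    """
    se = np.sqrt((Theta0[i, i] * Theta0[j, j] + Theta0[i, j] ** 2) / n)
    return T[i, j] - z * se, T[i, j] + z * se

# Toy check in low dimension: with n >> p the plain inverse of the sample
# covariance stands in for the sparse initial estimator needed when p > n.
rng = np.random.default_rng(0)
n, p = 2000, 3
X = rng.standard_normal((n, p))     # true precision matrix is the identity
Sigma_hat = X.T @ X / n
Theta0 = np.linalg.inv(Sigma_hat)
T = desparsify(Theta0, Sigma_hat)
lo, hi = confidence_interval(T, Theta0, n, 0, 1)
```

Thresholding T entrywise at roughly z * se then gives the edge-selection rule for the associated graphical model mentioned in the abstract: keep edge (i, j) only when the interval excludes zero.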

References

[1] Belloni, A., Chernozhukov, V. and Wang, L. (2011). Square-root Lasso: Pivotal recovery of sparse signals via conic programming. Biometrika 98 791–806. MR2860324. DOI: 10.1093/biomet/asr043.

[2] Belloni, A., Chernozhukov, V. and Hansen, C. (2014). Inference on treatment effects after selection amongst high-dimensional controls. Review of Economic Studies 81 608–650. MR3207983. DOI: 10.1093/restud/rdt044.

[3] Berk, R., Brown, L., Buja, A., Zhang, K. and Zhao, L. (2013). Valid post-selection inference. Annals of Statistics 41 802–837. MR3099122. DOI: 10.1214/12-AOS1077.

[4] Bickel, P. J. and Levina, E. (2008). Covariance regularization by thresholding. Annals of Statistics 36 2577–2604. MR2485008. DOI: 10.1214/08-AOS600.

[5] Bühlmann, P. and van de Geer, S. (2011). Statistics for High-Dimensional Data. Springer. MR2807761.

[6] Cai, T., Liu, W. and Luo, X. (2011). A constrained ℓ1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association 106 594–607. MR2847973. DOI: 10.1198/jasa.2011.tm10155.

[7] Candes, E. and Tao, T. (2007). The Dantzig selector: Statistical estimation when p is much larger than n. Annals of Statistics 35 2313–2351. MR2382644. DOI: 10.1214/009053606000001523.

[8] Chatterjee, A. and Lahiri, S. N. (2011). Bootstrapping Lasso estimators. Journal of the American Statistical Association 106 608–625. MR2847974. DOI: 10.1198/jasa.2011.tm10159.

[9] Chatterjee, A. and Lahiri, S. N. (2013). Rates of convergence of the adaptive Lasso estimators to the oracle distribution and higher order refinements by the bootstrap. Annals of Statistics 41. MR3113809. DOI: 10.1214/13-AOS1106.

[10] d’Aspremont, A., Banerjee, O. and El Ghaoui, L. (2008). First-order methods for sparse covariance selection. SIAM J. Matrix Anal. Appl. 30 56–66. MR2399568. DOI: 10.1137/060670985.

[11] Friedman, J., Hastie, T. and Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9 432–441.

[12] Greene, W. H. (2011). Econometric Analysis. Prentice Hall.

[13] Javanmard, A. and Montanari, A. (2013a). Confidence intervals and hypothesis testing for high-dimensional regression. ArXiv:1306.3171.

[14] Javanmard, A. and Montanari, A. (2013b). Model selection for high-dimensional regression under the generalized irrepresentability condition. In Advances in Neural Information Processing Systems 26 (C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K. Q. Weinberger, eds.) 3012–3020.

[15] Johnstone, I. M. (2001). On the distribution of the largest eigenvalue in principal components analysis. Annals of Statistics 29 295–327. MR1863961. DOI: 10.1214/aos/1009210544.

[16] Knight, K. and Fu, W. (2000). Asymptotics for lasso-type estimators. Annals of Statistics 28 1356–1378. MR1805787. DOI: 10.1214/aos/1015957397.

[17] Lauritzen, S. L. (1996). Graphical Models. Clarendon Press, Oxford. MR1419991.

[18] Leeb, H. and Pötscher, B. M. (2005). Model selection and inference: Facts and fiction. Econometric Theory 21 21–59. MR2153856. DOI: 10.1017/S0266466605050036.

[19] Leeb, H. and Pötscher, B. M. (2006). Can one estimate the conditional distribution of post-model-selection estimators? Annals of Statistics 34 2554–2591. MR2291510. DOI: 10.1214/009053606000000821.

[20] Meinshausen, N. (2013). Assumption-free confidence intervals for groups of variables in sparse high-dimensional regression. ArXiv:1309.3489.

[21] Meinshausen, N. and Bühlmann, P. (2006). High-dimensional graphs and variable selection with the Lasso. Annals of Statistics 34 1436–1462. MR2278363. DOI: 10.1214/009053606000000281.

[22] Meinshausen, N., Meier, L. and Bühlmann, P. (2009). P-values for high-dimensional regression. Journal of the American Statistical Association 104 1671–1681. MR2750584. DOI: 10.1198/jasa.2009.tm08647.

[23] Negahban, S. N., Ravikumar, P., Wainwright, M. J. and Yu, B. (2010). A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. Statistical Science 27 538–557. MR3025133. DOI: 10.1214/12-STS400.

[24] Ng, B., Varoquaux, G., Poline, J.-B. and Thirion, B. (2013). A novel sparse group Gaussian graphical model for functional connectivity estimation. Information Processing in Medical Imaging.

[25] Nickl, R. and van de Geer, S. (2012). Confidence sets in sparse regression. Annals of Statistics 41 2852–2876. MR3161450. DOI: 10.1214/13-AOS1170.

[26] Ravikumar, P., Raskutti, G., Wainwright, M. J. and Yu, B. (2008). High-dimensional covariance estimation by minimizing ℓ1-penalized log-determinant divergence. Electronic Journal of Statistics 5 935–980. MR2836766. DOI: 10.1214/11-EJS631.

[27] Ren, Z., Sun, T., Zhang, C. H. and Zhou, H. H. (2013). Asymptotic normality and optimalities in estimation of large Gaussian graphical model. ArXiv:1309.6024.

[28] Rothman, A. J., Bickel, P. J., Levina, E. and Zhu, J. (2008). Sparse permutation invariant covariance estimation. Electronic Journal of Statistics 2 494–515. MR2417391. DOI: 10.1214/08-EJS176.

[29] Schäfer, J. and Strimmer, K. (2005). A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical Applications in Genetics and Molecular Biology 4. MR2183942.

[30] Städler, N. and Mukherjee, S. (2013). Two-sample testing in high-dimensional models. Annals of Applied Statistics 7 1837–2457.

[31] Sun, T. and Zhang, C. H. (2012). Sparse matrix inversion with scaled Lasso. The Journal of Machine Learning Research 14 3385–3418. MR3144466.

[32] van de Geer, S. (2014a). Statistical theory for high-dimensional models. ArXiv:1309.3489.

[33] van de Geer, S. (2014b). Worst possible sub-directions in high-dimensional models. ArXiv:1403.7023.

[34] van de Geer, S. A. and Bühlmann, P. (2009). On the conditions used to prove oracle results for the Lasso. Electronic Journal of Statistics 3 1360–1392. MR2576316. DOI: 10.1214/09-EJS506.

[35] van de Geer, S., Bühlmann, P., Ritov, Y. and Dezeure, R. (2013). On asymptotically optimal confidence regions and tests for high-dimensional models. Annals of Statistics 42 1166–1202. MR3224285. DOI: 10.1214/14-AOS1221.

[36] Wasserman, L. and Roeder, K. (2009). High dimensional variable selection. Annals of Statistics 37 2178. MR2543689. DOI: 10.1214/08-AOS646.

[37] Yuan, M. (2010). High dimensional inverse covariance matrix estimation via linear programming. The Journal of Machine Learning Research 11 2261–2286. MR2719856.

[38] Yuan, M. and Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model. Biometrika 1–17. MR2367824. DOI: 10.1093/biomet/asm018.

[39] Zhang, C. H. and Zhang, S. S. (2014). Confidence intervals for low-dimensional parameters in high-dimensional linear models. Journal of the Royal Statistical Society: Series B 76 217–242. MR3153940. DOI: 10.1111/rssb.12026.

[40] Zhao, P. and Yu, B. (2006). On model selection consistency of Lasso. Journal of Machine Learning Research 7 2541–2563. MR2274449.
Copyright © 2015 The Institute of Mathematical Statistics and the Bernoulli Society
Received: 1 March 2014; Published: 2015