Electronic Journal of Statistics

Confidence intervals for high-dimensional inverse covariance estimation

Jana Janková and Sara van de Geer


Abstract

We propose methodology for statistical inference for low-dimensional parameters of sparse precision matrices in a high-dimensional setting. Our method leads to a non-sparse estimator of the precision matrix whose entries have a Gaussian limiting distribution. Asymptotic properties of the novel estimator are analyzed for the case of sub-Gaussian observations under a sparsity assumption on the entries of the true precision matrix and regularity conditions. Thresholding the de-sparsified estimator gives guarantees for edge selection in the associated graphical model. Performance of the proposed method is illustrated in a simulation study.
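As a purely illustrative sketch (not the paper's exact procedure), the de-sparsifying idea described above can be mimicked as follows: fit a graphical Lasso to obtain a sparse precision estimate Theta, then apply a one-step bias correction T = 2*Theta - Theta S Theta, where S is the sample covariance, and treat the entries of T as approximately Gaussian. In the Python sketch below, the tuning parameter alpha, the threshold level, and the plug-in variance formula Theta_ii*Theta_jj + Theta_ij^2 are assumptions chosen for illustration, not values taken from the paper.

import numpy as np
from scipy.stats import norm
from sklearn.covariance import GraphicalLasso

# Illustrative sketch only: a de-sparsified precision-matrix estimate with a
# pointwise confidence interval. All tuning choices below are assumptions.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))          # placeholder data (n samples, p variables)

S = np.cov(X, rowvar=False, bias=True)   # sample covariance matrix
theta = GraphicalLasso(alpha=0.1).fit(X).precision_   # sparse initial estimate

# One-step bias correction: the de-sparsified (non-sparse) estimator
T = 2 * theta - theta @ S @ theta

# Approximate 95% CI for entry (i, j), using the plug-in variance
# Theta_ii * Theta_jj + Theta_ij^2 (an assumption made for this sketch)
i, j = 0, 1
sigma2 = T[i, i] * T[j, j] + T[i, j] ** 2
hw = norm.ppf(0.975) * np.sqrt(sigma2 / n)
print(f"95% CI for Theta[{i},{j}]: ({T[i, j] - hw:.3f}, {T[i, j] + hw:.3f})")

# Thresholding the de-sparsified estimate suggests an edge set for the
# associated graphical model; the threshold level here is illustrative.
tau = 2 * np.sqrt(np.log(p) / n)
edges = np.abs(T) > tau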

Article information

Source
Electron. J. Statist., Volume 9, Number 1 (2015), 1205-1229.

Dates
Received: March 2014
First available in Project Euclid: 1 June 2015

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1433195859

Digital Object Identifier
doi:10.1214/15-EJS1031

Mathematical Reviews number (MathSciNet)
MR3354336

Zentralblatt MATH identifier
1307.62015

Subjects
Primary: 62J07: Ridge regression; shrinkage estimators
Secondary: 62F12: Asymptotic properties of estimators

Keywords
Confidence intervals; graphical Lasso; high-dimensional; precision matrix; sparsity

Citation

Janková, Jana; van de Geer, Sara. Confidence intervals for high-dimensional inverse covariance estimation. Electron. J. Statist. 9 (2015), no. 1, 1205–1229. doi:10.1214/15-EJS1031. https://projecteuclid.org/euclid.ejs/1433195859


