The Annals of Statistics

Gaussian graphical model estimation with false discovery rate control

Weidong Liu

Full-text: Open access

Abstract

This paper studies the estimation of a high-dimensional Gaussian graphical model (GGM). Existing methods typically rely on regularization and therefore require choosing a regularization parameter. However, the precise relationship between the regularization parameter and the number of false edges in the estimated graph is unclear. We propose an alternative based on multiple testing: using new test statistics for conditional dependence, we develop a simultaneous testing procedure for conditional dependence in the GGM. The method controls the false discovery rate (FDR) asymptotically. Numerical results show that the proposed method performs well.
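
As a rough illustration of what selecting edges by multiple testing can look like, the sketch below applies the standard Benjamini-Hochberg step-up rule to a set of per-edge test statistics. It is not the procedure proposed in the paper: the paper's test statistics and its threshold rule are not reproduced here. The function names, the example statistics, and the assumption that each statistic is approximately N(0, 1) when the edge is absent are all hypothetical choices made for this example.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a statistic assumed N(0, 1) under the null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def bh_select_edges(stats, alpha=0.1):
    """Benjamini-Hochberg step-up rule over per-edge statistics.

    stats: dict mapping an edge (i, j) to its test statistic (hypothetical input).
    Returns the set of edges declared present at nominal FDR level alpha.
    """
    # Order the candidate edges from smallest to largest p-value.
    ranked = sorted(stats, key=lambda e: two_sided_p(stats[e]))
    m = len(ranked)
    cutoff = 0
    # Keep the largest rank k whose p-value is at most alpha * k / m.
    for k, edge in enumerate(ranked, start=1):
        if two_sided_p(stats[edge]) <= alpha * k / m:
            cutoff = k
    return set(ranked[:cutoff])


# Hypothetical statistics for the six candidate edges of a 4-node graph.
example_stats = {
    (0, 1): 4.2, (0, 2): 0.3, (0, 3): -0.8,
    (1, 2): -3.9, (1, 3): 1.1, (2, 3): 0.5,
}
selected = bh_select_edges(example_stats, alpha=0.1)
print(sorted(selected))
```

Here the nominal level `alpha` plays the role that the regularization parameter plays in penalized methods, but with a direct interpretation as a target proportion of false edges. The Benjamini-Hochberg guarantee holds under independence or certain forms of positive dependence among the tests, which is an additional assumption of this sketch.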

Article information

Source
Ann. Statist., Volume 41, Number 6 (2013), 2948-2978.

Dates
First available in Project Euclid: 1 January 2014

Permanent link to this document
https://projecteuclid.org/euclid.aos/1388545674

Digital Object Identifier
doi:10.1214/13-AOS1169

Mathematical Reviews number (MathSciNet)
MR3161453

Zentralblatt MATH identifier
1288.62094

Subjects
Primary: 62H12: Estimation; 62H15: Hypothesis testing

Keywords
False discovery rate; Gaussian graphical model; multiple tests

Citation

Liu, Weidong. Gaussian graphical model estimation with false discovery rate control. Ann. Statist. 41 (2013), no. 6, 2948–2978. doi:10.1214/13-AOS1169. https://projecteuclid.org/euclid.aos/1388545674

Supplemental materials

  • Supplementary material: Supplement to “Gaussian graphical model estimation with false discovery rate control”. This supplemental material includes additional numerical results for GFC-Dantzig and GFC-Lasso.