Electronic Journal of Statistics

Joint estimation of precision matrices in heterogeneous populations

Takumi Saegusa and Ali Shojaie

Full-text: Open access

Abstract

We introduce a general framework for estimation of inverse covariance, or precision, matrices from heterogeneous populations. The proposed framework uses a Laplacian shrinkage penalty to encourage similarity among estimates from disparate, but related, subpopulations, while allowing for differences among matrices. We propose an efficient alternating direction method of multipliers (ADMM) algorithm for parameter estimation, as well as an extension for faster computation in high dimensions that thresholds the empirical covariance matrix to identify the joint block-diagonal structure of the estimated precision matrices. We establish both variable selection and norm consistency of the proposed estimator for distributions with exponential or polynomial tails. Further, to extend the applicability of the method to settings with unknown population structure, we propose a Laplacian penalty based on hierarchical clustering, and discuss conditions under which this data-driven choice results in consistent estimation of precision matrices in heterogeneous populations. Extensive numerical studies and applications to gene expression data from subtypes of cancer with distinct clinical outcomes indicate the potential advantages of the proposed method over existing approaches.
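
To make the penalized criterion concrete, the display below gives a schematic objective of the kind the abstract describes. It is a sketch, not the paper's exact formulation: the notation (sample covariances S^{(k)}, precision matrices \Theta^{(k)}, tuning parameters \lambda_1 and \lambda_2, and subpopulation weights w_{kk'}) is chosen here for illustration.

    \min_{\Theta^{(1)},\dots,\Theta^{(K)} \succ 0}
      \sum_{k=1}^{K} \Big[ \operatorname{tr}\big(S^{(k)}\Theta^{(k)}\big)
        - \log\det\Theta^{(k)} \Big]
      + \lambda_1 \sum_{k=1}^{K} \big\|\Theta^{(k)}\big\|_{1,\mathrm{off}}
      + \lambda_2 \sum_{i \neq j} \theta_{ij}^{\top} L\, \theta_{ij}

Here \theta_{ij} = (\Theta^{(1)}_{ij}, \dots, \Theta^{(K)}_{ij})^{\top} collects the (i,j) entries across the K subpopulations and L is the Laplacian of a weighted graph on the subpopulations, so that \theta_{ij}^{\top} L \theta_{ij} = \sum_{k<k'} w_{kk'} (\Theta^{(k)}_{ij} - \Theta^{(k')}_{ij})^2: the \ell_1 term enforces sparsity within each matrix, while the Laplacian term shrinks estimates from related subpopulations toward one another.

The abstract also mentions faster computation obtained by thresholding the empirical covariance matrices to identify a joint block-diagonal structure. The short Python sketch below illustrates one natural version of such a screening step, under the assumption that variables i and j can be separated whenever |S^{(k)}_{ij}| is at most a threshold lam in every subpopulation; joint_block_screen and its arguments are illustrative names, and the paper's exact screening criterion may differ.

    import numpy as np
    from scipy.sparse.csgraph import connected_components

    def joint_block_screen(S_list, lam):
        """Partition the p variables into blocks shared by all subpopulations.

        Variables i and j end up in different blocks only if |S^(k)_ij| <= lam
        for every subpopulation k; each block can then be estimated separately,
        which is much cheaper than one p-dimensional problem."""
        p = S_list[0].shape[0]
        adj = np.zeros((p, p), dtype=bool)
        for S in S_list:
            adj |= np.abs(S) > lam  # keep an edge if any subpopulation exceeds lam
        np.fill_diagonal(adj, False)
        n_blocks, labels = connected_components(adj, directed=False)
        return [np.flatnonzero(labels == b) for b in range(n_blocks)]

    # Toy usage: two subpopulations, five variables
    rng = np.random.default_rng(0)
    S_list = [np.cov(rng.normal(size=(100, 5)).T) for _ in range(2)]
    print(joint_block_screen(S_list, lam=0.3))

When the penalized objective separates across such blocks, solving each block and assembling the results recovers the full estimate at a fraction of the cost.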

Article information

Source
Electron. J. Statist., Volume 10, Number 1 (2016), 1341-1392.

Dates
Received: January 2015
First available in Project Euclid: 31 May 2016

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1464710236

Digital Object Identifier
doi:10.1214/16-EJS1137

Mathematical Reviews number (MathSciNet)
MR3507368

Zentralblatt MATH identifier
1341.62130

Keywords
Graph Laplacian; graphical modeling; heterogeneous populations; hierarchical clustering; high-dimensional estimation; precision matrix; sparsity

Citation

Saegusa, Takumi; Shojaie, Ali. Joint estimation of precision matrices in heterogeneous populations. Electron. J. Statist. 10 (2016), no. 1, 1341-1392. doi:10.1214/16-EJS1137. https://projecteuclid.org/euclid.ejs/1464710236


