The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 11, Number 1 (2017), 41-68.
Gene network reconstruction using global-local shrinkage priors
Gwenaël G. R. Leday, Mathisca C. M. de Gunst, Gino B. Kpogbezan, Aad W. van der Vaart, Wessel N. van Wieringen, and Mark A. van de Wiel
Abstract
Reconstructing a gene network from high-throughput molecular data is an important but challenging task, as the number of parameters to estimate is easily much larger than the sample size. A conventional remedy is to regularize or penalize the model likelihood. In network models, this is often done locally in the neighborhood of each node or gene. However, estimation of the many regularization parameters is often difficult and can result in large statistical uncertainties. In this paper we propose to combine local regularization with global shrinkage of the regularization parameters to borrow strength between genes and improve inference. We employ a simple Bayesian model with nonsparse, conjugate priors to facilitate the use of fast variational approximations to posteriors. We discuss empirical Bayes estimation of hyperparameters of the priors, and propose a novel approach to rank-based posterior thresholding. Using extensive model- and data-based simulations, we demonstrate that the proposed inference strategy outperforms popular (sparse) methods, yields more stable edges, and is more reproducible. The proposed method, termed ShrinkNet, is then applied to Glioblastoma to investigate the interactions between genes associated with patient survival.
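The idea of pairing a local penalty per gene with a global pull on those penalties can be illustrated with a small sketch. This is not the ShrinkNet model or its variational algorithm: it assumes unit noise variance, replaces the hierarchical prior and empirical Bayes step with a simple geometric-mean pooling of the penalties, and ranks edges by absolute posterior-mean coefficients. All function names (`nodewise_ridge_scores`, `shrink_local_toward_global`, `top_edges`) and the `weight` parameter are hypothetical and chosen for illustration only.

```python
import numpy as np


def shrink_local_toward_global(tau_local, weight):
    """Pull each node's penalty toward the geometric mean of all penalties.

    weight=0 keeps the local values; weight=1 replaces them by the global mean.
    This pooling is an illustrative stand-in for a shared hyperprior.
    """
    log_tau = np.log(tau_local)
    return np.exp((1.0 - weight) * log_tau + weight * log_tau.mean())


def nodewise_ridge_scores(X, tau):
    """Regress every node on all other nodes with a conjugate Gaussian prior.

    Assumes unit noise variance and N(0, 1/tau[j]) priors on the coefficients
    of node j, so the posterior mean is the ridge solution (Z'Z + tau_j I)^{-1} Z'y.
    """
    n, p = X.shape
    B = np.zeros((p, p))
    for j in range(p):
        others = [k for k in range(p) if k != j]
        Z, y = X[:, others], X[:, j]
        A = Z.T @ Z + tau[j] * np.eye(p - 1)
        B[j, others] = np.linalg.solve(A, Z.T @ y)
    return B


def top_edges(B, k):
    """Rank undirected edges by the symmetrized absolute coefficient; keep the top k."""
    S = (np.abs(B) + np.abs(B.T)) / 2.0
    rows, cols = np.triu_indices_from(S, k=1)
    order = np.argsort(S[rows, cols])[::-1][:k]
    return [(int(rows[i]), int(cols[i])) for i in order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 5))
    X[:, 1] = X[:, 0] + 0.1 * rng.standard_normal(50)  # genes 0 and 1 are linked
    tau = shrink_local_toward_global(np.array([0.5, 2.0, 1.0, 4.0, 1.0]), weight=0.5)
    print(top_edges(nodewise_ridge_scores(X, tau), k=1))
```

In this toy setting the pooling step only stabilizes the per-node penalties; in the paper the corresponding role is played by priors whose hyperparameters are estimated by empirical Bayes.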
Article information
Source
Ann. Appl. Stat. Volume 11, Number 1 (2017), 41-68.
Dates
Received: February 2016
Revised: June 2016
First available in Project Euclid: 8 April 2017
Permanent link to this document
http://projecteuclid.org/euclid.aoas/1491616871
Digital Object Identifier
doi:10.1214/16-AOAS990
Keywords
Undirected gene network; Bayesian inference; shrinkage; variational approximation; empirical Bayes
Citation
Leday, Gwenaël G. R.; de Gunst, Mathisca C. M.; Kpogbezan, Gino B.; van der Vaart, Aad W.; van Wieringen, Wessel N.; van de Wiel, Mark A. Gene network reconstruction using global-local shrinkage priors. Ann. Appl. Stat. 11 (2017), no. 1, 41--68. doi:10.1214/16-AOAS990. http://projecteuclid.org/euclid.aoas/1491616871.
Supplemental materials
- Technical details and complementary results. We present technical and methodological details regarding the variational approximation and the different methods under comparison in Sections 3 and 4. Furthermore, complementary simulation results are provided.
Digital Object Identifier: doi:10.1214/16-AOAS990SUPP
Supplemental files available for subscribers.

