Abstract
Network models are useful tools for modelling complex associations. In statistical omics such models are increasingly popular for identifying and assessing functional relationships and pathways. If a Gaussian graphical model is assumed, conditional independence is determined by the nonzero entries of the inverse covariance (precision) matrix of the data. The Bayesian graphical horseshoe estimator provides a robust and flexible framework for precision matrix inference, as it introduces local, edge-specific parameters which prevent over-shrinkage of nonzero off-diagonal elements. However, its applicability is currently limited in statistical omics settings, which often involve high-dimensional data from multiple conditions that might share common structures. We propose: (i) a scalable expectation conditional maximisation (ECM) algorithm for the original graphical horseshoe and (ii) a novel joint graphical horseshoe estimator, which borrows information across multiple related networks to improve estimation. We show numerically that our single-network ECM approach is more scalable than the existing graphical horseshoe Gibbs implementation, while achieving the same level of accuracy. We also show that our joint-network proposal successfully leverages shared edge-specific information between networks while still retaining differences, outperforming state-of-the-art methods at any level of network similarity. Finally, we leverage our approach to clarify gene regulation activity within and across immune stimulation conditions in monocytes, and formulate hypotheses on the pathogenesis of immune-mediated diseases.
Funding Statement
This research is funded by the UK Medical Research Council programme MRC MC_UU_00002/10 (C.L., H.R. and S.R.), an Aker Scholarship (C.L.), the Lopez–Loreta Foundation (H.R.) and Wellcome Intermediate Clinical Fellowship 01488/Z/16/Z (B.P.F.).
Acknowledgments
C. Lingjærde, H. Ruffieux and S. Richardson developed the methodology. C. Lingjærde developed the software, performed the data analysis and drafted the manuscript. B. P. Fairfax contributed to the data analysis. All authors have contributed to the interpretation of results. All authors have contributed to the manuscript, and read and approved the final version.
Software
The ECM graphical horseshoe approach for single or multiple network inference has been implemented in the R packages fastGHS and jointGHS, with all subroutines implemented in C for computational efficiency.
Data Availability
The CD14 monocyte gene expression data used in this study were generated with HumanHT-12 v4 arrays and are freely available for download from ArrayExpress (accession E-MTAB-2232; Fairfax et al., 2012, 2014).
Citation
Camilla Lingjærde, Benjamin P. Fairfax, Sylvia Richardson, Hélène Ruffieux. "Scalable multiple network inference with the joint graphical horseshoe." Ann. Appl. Stat. 18 (3) 1899–1923, September 2024. https://doi.org/10.1214/23-AOAS1863
Information