Open Access
Bayesian variable selection and data integration for biological regulatory networks
Shane T. Jensen, Guang Chen, Christian J. Stoeckert, Jr.
Ann. Appl. Stat. 1(2): 612-633 (December 2007). DOI: 10.1214/07-AOAS130

Abstract

A substantial focus of research in molecular biology is gene regulatory networks: the sets of transcription factors and target genes that control the involvement of different biological processes in living cells. Previous statistical approaches for identifying gene regulatory networks have used gene expression data, ChIP binding data or promoter sequence data, but each of these resources provides only partial information. We present a Bayesian hierarchical model that integrates all three data types in a principled variable selection framework. The gene expression data are modeled as a function of the unknown gene regulatory network, which has an informed prior distribution based upon both ChIP binding and promoter sequence data. We also present a variable weighting methodology for the principled balancing of multiple sources of prior information. We apply our procedure to the discovery of gene regulatory relationships in Saccharomyces cerevisiae (yeast), for which we can use several external sources of information to validate our results. Our inferred relationships show greater biological relevance on the external validation measures than those from previous data integration methods. Our model also estimates synergistic and antagonistic interactions between transcription factors, many of which are validated by previous studies. We also evaluate the results of our procedure for weighting multiple sources of prior information. Finally, we discuss our methodology in the context of previous approaches to data integration and Bayesian variable selection.

Citation

Shane T. Jensen, Guang Chen, Christian J. Stoeckert, Jr. "Bayesian variable selection and data integration for biological regulatory networks." Ann. Appl. Stat. 1 (2) 612-633, December 2007. https://doi.org/10.1214/07-AOAS130

Information

Published: December 2007
First available in Project Euclid: 30 November 2007

zbMATH: 1126.62104
MathSciNet: MR2415749
Digital Object Identifier: 10.1214/07-AOAS130

Keywords: Bayesian variable selection, data integration, regulatory networks, transcription factors

Rights: Copyright © 2007 Institute of Mathematical Statistics
