The Annals of Applied Statistics

Refining cellular pathway models using an ensemble of heterogeneous data sources

Alexander M. Franks, Florian Markowetz, and Edoardo M. Airoldi

Improving current models and hypotheses of cellular pathways is one of the major challenges of systems biology and functional genomics. There is a need for methods to build on established expert knowledge and reconcile it with results of new high-throughput studies. Moreover, the available sources of data are heterogeneous, and the data need to be integrated in different ways depending on which part of the pathway they are most informative for. In this paper, we introduce a compartment-specific strategy to integrate edge, node, and path data for refining a given network hypothesis. To carry out inference, we use a local-move Gibbs sampler for updating the pathway hypothesis from a compendium of heterogeneous data sources, and a new network regression idea for integrating protein attributes. We demonstrate the utility of this approach in a case study of the pheromone response MAPK pathway in the yeast S. cerevisiae.

Article information

Ann. Appl. Stat., Volume 12, Number 3 (2018), 1361-1384.

Received: March 2014
Revised: January 2016
First available in Project Euclid: 11 September 2018

Keywords: Multi-level modeling; statistical network analysis; Bayesian inference; regulation and signaling dynamics


Franks, Alexander M.; Markowetz, Florian; Airoldi, Edoardo M. Refining cellular pathway models using an ensemble of heterogeneous data sources. Ann. Appl. Stat. 12 (2018), no. 3, 1361-1384. doi:10.1214/16-AOAS915.

Supplemental materials

  • Supplementary figures. In this Appendix, we give convergence diagnostics for network statistics, we explore sensitivity of the results to variations in the compartment map, and we give more details about the simulation results.