The Annals of Applied Statistics

Bayesian hierarchical modeling for signaling pathway inference from single cell interventional data

Ruiyan Luo and Hongyu Zhao

Full-text: Open access


Recent technological advances have made it possible to simultaneously measure multiple protein activities at the single cell level. With such data collected under different stimulatory or inhibitory conditions, it is possible to infer the causal relationships among proteins from single cell interventional data. In this article we propose a Bayesian hierarchical modeling framework to infer the signaling pathway based on the posterior distributions of parameters in the model. Under this framework, we consider network sparsity and model the existence of an association between two proteins both at the overall level across all experiments and at each individual experimental level. This allows us to infer the pairs of proteins that are associated with each other and their causal relationships. We also explicitly consider both intrinsic noise and measurement error. Markov chain Monte Carlo is implemented for statistical inference. We demonstrate that this hierarchical modeling can effectively pool information from different interventional experiments through simulation studies and real data analysis.
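To make the pooling idea concrete, here is a minimal sketch of a two-level association model for a single protein pair. It is an illustration under stated assumptions, not the model of the article: the function `posterior_overall_edge`, the parameters `rho`, `q`, `noise_var`, and `signal_var`, and the per-experiment summary statistics are all hypothetical. An overall indicator marks whether the pair is associated at all (with prior probability `rho`, encoding sparsity). If it is, each experiment shows the association with probability `q`. Each experiment contributes one statistic drawn from a two-component normal mixture. Because the toy is small, the posterior is computed exactly rather than by Markov chain Monte Carlo.

```python
import math


def normal_pdf(y, var):
    """Density of a zero-mean normal with variance `var` at `y`."""
    return math.exp(-0.5 * y * y / var) / math.sqrt(2.0 * math.pi * var)


def posterior_overall_edge(stats, rho=0.1, q=0.8, noise_var=1.0, signal_var=4.0):
    """Posterior probability that a protein pair is associated overall.

    Hypothetical toy model (all names and defaults are assumptions):
      overall indicator     z   ~ Bernoulli(rho)        (sparsity prior)
      experiment indicator  z_k ~ Bernoulli(q) if z = 1, else z_k = 0
      statistic             y_k ~ N(0, noise_var)               if z_k = 0
                            y_k ~ N(0, noise_var + signal_var)  if z_k = 1
    """
    log_l0 = 0.0  # log-likelihood when there is no overall association
    log_l1 = 0.0  # log-likelihood when an overall association exists
    for y in stats:
        null = normal_pdf(y, noise_var)
        alt = normal_pdf(y, noise_var + signal_var)
        log_l0 += math.log(null)
        # Sum over the experiment-level indicator z_k (mixture of two normals).
        log_l1 += math.log(q * alt + (1.0 - q) * null)
    log_odds = math.log(rho) - math.log(1.0 - rho) + log_l1 - log_l0
    return 1.0 / (1.0 + math.exp(-log_odds))


single = posterior_overall_edge([2.5])
pooled = posterior_overall_edge([2.5, 2.5, 2.5])
print(f"one experiment: {single:.3f}, three experiments: {pooled:.3f}")
```

In this sketch, three experiments that each give moderate evidence yield a higher posterior probability for the overall association than any one of them alone, which is the sense in which a hierarchical structure pools information across experiments.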

Article information

Ann. Appl. Stat., Volume 5, Number 2A (2011), 725-745.

First available in Project Euclid: 13 July 2011


Keywords: Bayesian network; dependency network; Gaussian graphical model; hierarchical model; interventional data; Markov chain Monte Carlo; mixture distribution; single cell measurements; signaling pathway


Luo, Ruiyan; Zhao, Hongyu. Bayesian hierarchical modeling for signaling pathway inference from single cell interventional data. Ann. Appl. Stat. 5 (2011), no. 2A, 725–745. doi:10.1214/10-AOAS425.




Supplemental materials

  • Supplementary material: Additional descriptions and results of hierarchical models. Materials include a description and simulation results of the hierarchical model (mHM) with varying intrinsic-noise variances (σ_{ik}^I)^2, the MCMC algorithm for the hierarchical model (HM), direction inference for the restricted hierarchical model (RHM), and additional figures of posterior inference and networks.