In this paper we propose a Bayesian approach for inference about
dependence of high-throughput gene expression. Our goals are to
use prior knowledge about pathways to anchor inference about
dependence among genes; to account for this dependence while
making inferences about differences in mean expression across
phenotypes; and to explore differences in the dependence itself
across phenotypes. Useful features of the proposed approach are
a model-based parsimonious representation of expression as an
ordinal outcome, a novel and flexible representation of prior
information on the nature of dependencies, and the use of a
coherent probability model over both the structure and strength
of the dependencies of interest. We evaluate our approach
through simulations and in the analysis of data on expression of
genes in the Complement and Coagulation Cascade pathway in
ovarian cancer.