The Annals of Applied Statistics

A hidden spatial-temporal Markov random field model for network-based analysis of time course gene expression data

Zhi Wei and Hongzhe Li

Full-text: Open access

Abstract

Microarray time course (MTC) gene expression data are commonly collected to study the dynamic nature of biological processes. One important problem is to identify genes that show different expression profiles over time and pathways that are perturbed during a given biological process. While methods are available to identify genes with differential expression levels over time, methods that incorporate pathway information to identify the pathways modified or activated during a biological process are lacking. In this paper we develop a hidden spatial-temporal Markov random field (hstMRF)-based method for identifying genes and subnetworks that are related to biological processes, where the dependency of the differential expression patterns of genes is modeled both over time and over the network of pathways. Simulation studies indicated that the method is effective in identifying genes and modified subnetworks and, at similar false discovery rates, has higher sensitivity than commonly used procedures that do not use the pathway structure or time dependency information. Application to a microarray gene expression study of systemic inflammation in humans identified a core set of genes on the KEGG pathways that show clear differential expression patterns over time. In addition, the method confirmed that the Toll-like signaling pathway plays an important role in the immune response to endotoxins.
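
To make the idea of dependency "over time and over the network" concrete, the following is a minimal sketch of iterated conditional modes (the technique named in the keywords) applied to a generic binary hidden Markov random field. It is not the authors' hstMRF model or estimation procedure. The function name `icm_labels`, the parameters `gamma`, `beta` and `eta`, and the precomputed log-likelihood ratios `llr` are all hypothetical and chosen only for illustration.

```python
def icm_labels(llr, neighbors, gamma=0.0, beta=0.5, eta=0.5, max_sweeps=50):
    """Illustrative ICM for binary states x[g][t] (1 = differentially expressed).

    llr[g][t]    : assumed precomputed log-likelihood ratio for gene g at time t
                   (state 1 versus state 0); hypothetical input.
    neighbors[g] : indices of genes adjacent to gene g on the pathway network.
    gamma, beta, eta : hypothetical intercept, network and temporal weights.
    """
    n_genes, n_times = len(llr), len(llr[0])
    # Initialise each state from the data term alone.
    x = [[1 if llr[g][t] > 0 else 0 for t in range(n_times)]
         for g in range(n_genes)]
    for _ in range(max_sweeps):
        changed = False
        for g in range(n_genes):
            for t in range(n_times):
                # Network term: neighbours in state 1 minus neighbours in state 0.
                net = sum(2 * x[h][t] - 1 for h in neighbors[g])
                # Temporal term: the same gene at the adjacent time points.
                tmp = sum(2 * x[g][s] - 1
                          for s in (t - 1, t + 1) if 0 <= s < n_times)
                # Conditional log-odds of state 1 given all other states.
                score = llr[g][t] + gamma + beta * net + eta * tmp
                new = 1 if score > 0 else 0
                if new != x[g][t]:
                    x[g][t] = new
                    changed = True
        if not changed:  # a full sweep without changes: local mode reached
            break
    return x


# Toy example: three genes on a chain 0 - 1 - 2, three time points.
toy_llr = [[2.0, 2.0, 2.0], [2.0, -0.2, 2.0], [-3.0, -3.0, -3.0]]
toy_neighbors = [[1], [0, 2], [1]]
labels = icm_labels(toy_llr, toy_neighbors)
```

In this toy run, gene 1 has weak evidence against differential expression at the middle time point, but its own states at the adjacent time points outweigh it, so the label is switched to 1. Gene 2 stays at 0 throughout because its data term dominates.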

Article information

Source
Ann. Appl. Stat. Volume 2, Number 1 (2008), 408-429.

Dates
First available in Project Euclid: 24 March 2008

Permanent link to this document
https://projecteuclid.org/euclid.aoas/1206367827

Digital Object Identifier
doi:10.1214/07-AOAS145

Mathematical Reviews number (MathSciNet)
MR2415609

Zentralblatt MATH identifier
1137.62081

Keywords
Iterative conditional modes; pathway; undirected graph; differential expression

Citation

Wei, Zhi; Li, Hongzhe. A hidden spatial-temporal Markov random field model for network-based analysis of time course gene expression data. Ann. Appl. Stat. 2 (2008), no. 1, 408–429. doi:10.1214/07-AOAS145. https://projecteuclid.org/euclid.aoas/1206367827.



