The Annals of Applied Statistics

Bayesian meta-analysis for identifying periodically expressed genes in fission yeast cell cycle

Xiaodan Fan, Saumyadipta Pyne, and Jun S. Liu

Full-text: Open access


The effort to identify genes with periodic expression during the cell cycle from genome-wide microarray time series data has been ongoing for a decade. However, the lack of rigorous modeling of periodic expression, together with the lack of a comprehensive model for integrating information across genes and experiments, has hampered the accurate identification of periodically expressed genes. To address the problem, we introduce a Bayesian model that integrates multiple independent microarray data sets from three recent genome-wide cell cycle studies on fission yeast. A hierarchical model is used for data integration. To facilitate efficient Monte Carlo sampling from the joint posterior distribution, we develop a novel Metropolis–Hastings group move. A surprising finding from our integrated analysis is that more than 40% of the genes in fission yeast are significantly periodically expressed, far more than the 10–15% reported in the current literature. This finding calls for a reconsideration of the periodically expressed gene detection problem.
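To make the sampling idea in the abstract concrete, the following is a minimal sketch of a plain random-walk Metropolis–Hastings sampler. It is not the group move developed in the paper, and the abstract does not specify the model, so everything here is an illustrative assumption: a single gene with a sinusoidal expression profile, Gaussian noise, a flat prior on the phase, and a known period, amplitude and noise level. All function names and parameter values are hypothetical.

```python
import math
import random


def simulate_expression(true_phase, period=150.0, amplitude=1.0, sigma=0.3,
                        n_points=20, spacing=10.0, seed=1):
    """Simulate one gene's time series under an assumed sinusoidal model."""
    rng = random.Random(seed)
    omega = 2.0 * math.pi / period
    times = [spacing * i for i in range(n_points)]
    values = [amplitude * math.cos(omega * t + true_phase) + rng.gauss(0.0, sigma)
              for t in times]
    return times, values


def log_posterior(phase, times, values, period, amplitude, sigma):
    """Log posterior of the phase (up to a constant) under a flat prior."""
    omega = 2.0 * math.pi / period
    sse = sum((y - amplitude * math.cos(omega * t + phase)) ** 2
              for t, y in zip(times, values))
    return -0.5 * sse / sigma ** 2


def metropolis_hastings_phase(times, values, period, amplitude, sigma,
                              n_iter=5000, step=0.3, seed=0):
    """Random-walk Metropolis-Hastings on the phase, wrapped to [0, 2*pi)."""
    rng = random.Random(seed)
    phase = rng.uniform(0.0, 2.0 * math.pi)
    current = log_posterior(phase, times, values, period, amplitude, sigma)
    samples = []
    for _ in range(n_iter):
        # Symmetric proposal on the circle, so the acceptance ratio
        # reduces to the posterior ratio.
        proposal = (phase + rng.gauss(0.0, step)) % (2.0 * math.pi)
        candidate = log_posterior(proposal, times, values, period, amplitude, sigma)
        # 1 - random() lies in (0, 1], which keeps log() well defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            phase, current = proposal, candidate
        samples.append(phase)
    return samples


def circular_mean(angles):
    """Mean direction of a list of angles, returned in [0, 2*pi)."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c) % (2.0 * math.pi)


times, values = simulate_expression(true_phase=1.0)
samples = metropolis_hastings_phase(times, values, period=150.0,
                                    amplitude=1.0, sigma=0.3)
# Discard the first 1000 draws as burn-in before summarizing.
print("posterior mean phase:", circular_mean(samples[1000:]))
```

A hierarchical analysis of the kind the abstract describes would share parameters across genes and experiments and update several of them jointly; this sketch updates one parameter for one gene only.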

Article information

Ann. Appl. Stat., Volume 4, Number 2 (2010), 988-1013.

First available in Project Euclid: 3 August 2010


Keywords: Cell cycle; periodically expressed gene; microarray time series; meta-analysis; fission yeast; Schizosaccharomyces pombe; Markov chain Monte Carlo


Fan, Xiaodan; Pyne, Saumyadipta; Liu, Jun S. Bayesian meta-analysis for identifying periodically expressed genes in fission yeast cell cycle. Ann. Appl. Stat. 4 (2010), no. 2, 988–1013. doi:10.1214/09-AOAS300.




Supplemental materials

  • Supplementary material: Various supporting materials. In this supplement we provide model-fitting diagnostics, hierarchical clustering results, the effect of data size on statistical power, supporting evidence for newly found genes, and figures referred to in this paper.