Brazilian Journal of Probability and Statistics

A predictive Bayes factor approach to identify genes differentially expressed: An application to Escherichia coli bacterium data

Francisco Louzada, Erlandson F. Saraiva, Luis Milan, and Juliana Cobre


Identifying genes that are differentially expressed between a treatment and a control experimental condition is a common task in gene expression data analysis. Standard existing methods include the two-sample t-test, the regularized t-test (Cyber-T) and the Bayesian t-test. In this paper, we propose a Bayesian approach that identifies differentially expressed genes from the posterior probability of a difference, calculated via the Bayes factor. To calculate the Bayes factor, we use a predictive density constructed from the previously observed gene expression levels. We perform a simulation study with small sample sizes, as is usual in gene expression data analysis, to verify the performance of the proposed method and compare it with the standard ones. The results reveal a better performance of the proposed methodology in identifying differences in means and/or variances. The methodology is also illustrated on the Escherichia coli bacterium dataset.
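
As a rough illustration of the relationship the abstract relies on, the posterior probability of a difference follows from the Bayes factor and the prior odds. The sketch below is not the authors' predictive Bayes factor; it only shows the generic conversion from a Bayes factor to a posterior probability. The function name `posterior_prob_from_bayes_factor` and the default prior probability of 0.5 are assumptions made for illustration.

```python
def posterior_prob_from_bayes_factor(bf_10, prior_prob_h1=0.5):
    """Posterior probability of H1 (a difference exists) given a Bayes factor.

    bf_10: Bayes factor in favour of H1 over H0 (ratio of marginal likelihoods).
    prior_prob_h1: prior probability of H1 (hypothetical default: equal prior odds).
    """
    if bf_10 < 0:
        raise ValueError("A Bayes factor must be non-negative")
    if not 0 < prior_prob_h1 < 1:
        raise ValueError("prior_prob_h1 must lie strictly between 0 and 1")
    # Posterior odds = Bayes factor * prior odds.
    prior_odds = prior_prob_h1 / (1.0 - prior_prob_h1)
    posterior_odds = bf_10 * prior_odds
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)
```

Under equal prior odds, for instance, a Bayes factor of 3 in favour of a difference corresponds to a posterior probability of 0.75. How the Bayes factor itself is obtained from the predictive density is specific to the paper and is not reproduced here.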

Article information

Braz. J. Probab. Stat., Volume 28, Number 2 (2014), 167-189.

First available in Project Euclid: 4 April 2014

Keywords: Gene expression; t-test; modified t-test; Bayesian inference; Bayes factor; predictive density


Louzada, Francisco; Saraiva, Erlandson F.; Milan, Luis; Cobre, Juliana. A predictive Bayes factor approach to identify genes differentially expressed: An application to Escherichia coli bacterium data. Braz. J. Probab. Stat. 28 (2014), no. 2, 167–189. doi:10.1214/12-BJPS200.


  • Aitkin, M. (1991). Posterior Bayes factors. Journal of the Royal Statistical Society, Ser. B 53, 111–142.
  • Allison, D. B., Cui, X., Page, G. P. and Sabripour, M. (2006). Microarray data analysis: From disarray to consolidation and consensus. Nature Reviews Genetics 7, 55–65.
  • Arfin, S. M., Long, A. D., Ito, E. T., Tolleri, L., Riehle, M. M., Paegle, E. S. and Hatfield, G. W. (2000). Global gene expression profiling in Escherichia coli K12. The Journal of Biological Chemistry 275, 29672–29684.
  • Baldi, P. and Long, A. D. (2001). A Bayesian framework for the analysis of microarray expression data: Regularized t-test and statistical inferences of gene changes. Bioinformatics 17, 509–519.
  • Berger, J. O. and Pericchi, L. R. (1996). The intrinsic Bayes factor for model selection and prediction. Journal of the American Statistical Association 91, 109–122.
  • DeRisi, J. L., Iyer, V. R. and Brown, P. O. (1997). Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 278, 680–686.
  • Fox, R. J. and Dimmic, M. W. (2006). A two-sample Bayesian t-test for microarray data. BMC Bioinformatics 7, 126.
  • Hatfield, G. W., Hung, S. and Baldi, P. (2003). Differential analysis of DNA microarray gene expression data. Molecular Microbiology 47, 871–877.
  • Kass, R. and Raftery, A. (1995). Bayes factors. Journal of the American Statistical Association 90, 773–795.
  • Lavine, M. and Schervish, M. J. (1999). Bayes factors: What they are and what they are not. The American Statistician 53, 119–122.
  • Lönnstedt, I. and Speed, T. (2001). Replicated microarray data. Statistica Sinica 12, 31–46.
  • Medvedovic, M. and Sivaganesan, S. (2002). Bayesian infinite mixture model based clustering of gene expression profiles. Bioinformatics 18, 1194–1206.
  • Richardson, S. and Green, P. J. (1997). On Bayesian analysis of mixtures with an unknown number of components. Journal of the Royal Statistical Society, Ser. B 59, 731–792.
  • Schena, M., Shalon, D., Davis, R. W. and Brown, P. O. (1995). Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270, 467–470.
  • Sinharay, S. and Stern, H. S. (2002). On the sensitivity of Bayes factors to the prior distributions. The American Statistician 56, 196–201.
  • Stephens, M. (2000). Bayesian analysis of mixture models with an unknown number of components: An alternative to reversible jump methods. The Annals of Statistics 28, 40–74.
  • Wu, T. D. (2001). Analyzing gene expression data from DNA microarrays to identify candidate genes. Journal of Pathology 195, 53–65.