Statistical Science

Computational Discovery of Gene Regulatory Binding Motifs: A Bayesian Perspective

Shane T. Jensen, X. Shirley Liu, Qing Zhou, and Jun S. Liu

The Bayesian approach, together with Markov chain Monte Carlo techniques, has provided an attractive solution to many important bioinformatics problems such as multiple sequence alignment, microarray analysis and the discovery of gene regulatory binding motifs. The use of such methods and, more broadly, of explicit statistical modeling has revolutionized the field of computational biology. After reviewing several heuristics-based computational methods, this article presents a systematic account of Bayesian formulations and solutions to the motif discovery problem. Generalizations are made to further enhance the Bayesian approach. Motivated by the need for a fast algorithm, we also provide a perspective on the problem from the viewpoint of optimizing a scoring function. We observe that scoring functions resulting from proper posterior distributions, or approximations to such distributions, perform best and can be used to improve upon existing motif-finding programs. Simulation analyses and a real-data example are used to support our observation.

Statist. Sci. Volume 19, Number 1 (2004), 188-204.

First available in Project Euclid: 14 July 2004

Keywords: Gene regulation; motif discovery; Bayesian models; scoring functions; optimization; Markov chain Monte Carlo


Jensen, Shane T.; Liu, X. Shirley; Zhou, Qing; Liu, Jun S. Computational Discovery of Gene Regulatory Binding Motifs: A Bayesian Perspective. Statist. Sci. 19 (2004), no. 1, 188-204. doi:10.1214/088342304000000107.

