Open Access
June 2007
Coupling hidden Markov models for the discovery of Cis-regulatory modules in multiple species
Qing Zhou, Wing Hung Wong
Ann. Appl. Stat. 1(1): 36-65 (June 2007). DOI: 10.1214/07-AOAS103


Cis-regulatory modules (CRMs) composed of multiple transcription factor binding sites (TFBSs) control gene expression in eukaryotic genomes. Comparative genomic studies have shown that these regulatory elements are more conserved across species due to evolutionary constraints. We propose a statistical method that combines module structure and cross-species orthology in de novo motif discovery. We use a hidden Markov model (HMM) to capture the module structure in each species and couple these HMMs through multiple-species alignment. Evolutionary models are incorporated to account for correlated structures among aligned sequence positions across different species. Based on our model, we develop a Markov chain Monte Carlo approach, MultiModule, to discover CRMs and their component motifs simultaneously in groups of orthologous sequences from multiple species. We test our method on both simulated and biological data sets in mammals and Drosophila, where it shows significant improvement over other motif and module discovery methods.



Qing Zhou, Wing Hung Wong. "Coupling hidden Markov models for the discovery of Cis-regulatory modules in multiple species." Ann. Appl. Stat. 1(1): 36-65, June 2007.


Published: June 2007
First available in Project Euclid: 29 June 2007

zbMATH: 1129.62111
MathSciNet: MR2393840
Digital Object Identifier: 10.1214/07-AOAS103

Keywords: Cis-regulatory module, comparative genomics, coupled hidden Markov model, dynamic programming, Markov chain Monte Carlo, motif discovery

Rights: Copyright © 2007 Institute of Mathematical Statistics

Vol.1 • No. 1 • June 2007