Sinha Saurabh, van Nimwegen Erik, Siggia Eric D
Center for Studies in Physics and Biology, The Rockefeller University, New York, NY 10021, USA.
Bioinformatics. 2003;19 Suppl 1:i292-301. doi: 10.1093/bioinformatics/btg1040.
The discovery of cis-regulatory modules in metazoan genomes is crucial for understanding the connection between genes and organism diversity.
We develop a computational method that uses hidden Markov models and an expectation-maximization algorithm to detect such modules, given the weight matrices of a set of transcription factors known to work together. Our probabilistic model has two novel features: (i) it exploits correlations between binding sites, which are known to be required for module activity, and (ii) it compares sequences from multiple species phylogenetically to highlight a regulatory module. In experiments on both synthetic and biological data, these features improve module detection.
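To make the modeling idea concrete, the following is a minimal sketch of how a window of sequence might be scored with a hidden Markov model built from weight matrices. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `consensus_pwm` helper, the fixed motif probabilities, and the uniform background are all hypothetical. It also leaves out the expectation-maximization step that would fit the probabilities, as well as the binding-site correlations and phylogenetic comparisons described in the abstract.

```python
import math


def consensus_pwm(consensus, hit=0.85):
    """Hypothetical helper: build a toy weight matrix from a consensus string.

    Each column gives probability `hit` to the consensus base and splits
    the remainder evenly over the other three bases.
    """
    miss = (1.0 - hit) / 3.0
    return [{b: (hit if b == c else miss) for b in "ACGT"} for c in consensus]


def motif_prob(pwm, site):
    """Probability that the weight matrix emits `site`, one column per base."""
    p = 1.0
    for column, base in zip(pwm, site):
        p *= column[base]
    return p


def window_likelihood(seq, pwms, motif_probs, background):
    """Sum over all parses of `seq` into background bases and binding sites.

    forward[i] is the total probability of all ways to generate seq[:i].
    At each step the model emits either one background base or one full
    binding site drawn from weight matrix k with probability motif_probs[k].
    Plain probabilities are used for brevity; a long sequence would need
    log-space arithmetic or rescaling to avoid underflow.
    """
    p_background = 1.0 - sum(motif_probs)
    n = len(seq)
    forward = [0.0] * (n + 1)
    forward[0] = 1.0
    for i in range(1, n + 1):
        total = p_background * background[seq[i - 1]] * forward[i - 1]
        for pwm, p_k in zip(pwms, motif_probs):
            width = len(pwm)
            if i >= width:
                total += p_k * motif_prob(pwm, seq[i - width:i]) * forward[i - width]
        forward[i] = total
    return forward[n]


def module_score(seq, pwms, motif_probs, background):
    """Log-likelihood ratio of the motif model against background alone."""
    null = 1.0
    for base in seq:
        null *= background[base]
    return math.log(window_likelihood(seq, pwms, motif_probs, background) / null)


# Assumed toy inputs: one weight matrix, uniform background, fixed motif probability.
UNIFORM = {b: 0.25 for b in "ACGT"}
TOY_PWMS = [consensus_pwm("TATA")]
TOY_PROBS = [0.1]
```

In this sketch a window such as `GCTATAGCTATAGC`, which contains two matches to the toy matrix, scores above zero, while `GCGCGCGCGCGCGC` scores below zero. In a fuller treatment the motif probabilities would be estimated by expectation maximization for each window rather than fixed in advance.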