Liu Bingqiang, Zhou Chuan, Li Guojun, Zhang Hanyuan, Zeng Erliang, Liu Qi, Ma Qin
School of Mathematics, Shandong University, Jinan, Shandong, China.
Systems Biology and Biomedical Informatics (SBBI) Laboratory, University of Nebraska-Lincoln, Lincoln, NE 68588-0115, USA.
Sci Rep. 2016 Mar 15;6:23030. doi: 10.1038/srep23030.
Regulons are the basic units of the response system in a bacterial cell, and each consists of a set of transcriptionally co-regulated operons. Regulon elucidation is the basis for studying the global transcriptional regulation network of a bacterium. In this study, we designed a novel co-regulation score between a pair of operons, based on accurate operon identification and cis-regulatory motif analyses, which captures their co-regulation relationship more accurately than other scores. Building on this score, we developed a new computational framework and a novel graph model for regulon prediction. The model integrates motif comparison and clustering, making the regulon prediction problem more tractable and its solutions more accurate. To evaluate our predictions, we designed a regulon coverage score based on documented regulons and their overlap with our predictions, and we implemented a modified Fisher exact test to measure how well our predictions match the co-expressed modules derived from E. coli microarray gene-expression datasets collected under 466 conditions. The results indicate that our program consistently outperformed other programs in prediction accuracy. This suggests that our algorithms substantially improve on the state of the art and provide a computational capability to reliably predict regulons for any bacterium.
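As a rough illustration of the evaluation idea in the abstract, the sketch below computes a standard one-sided Fisher exact test (the upper tail of the hypergeometric distribution) for the overlap between a predicted regulon and a co-expression module. The abstract refers to a modified Fisher exact test but does not describe the modification, so this is the unmodified textbook version and not the authors' method. The function name `overlap_p_value`, its parameters, and the gene labels in the usage note are hypothetical.

```python
from math import comb


def overlap_p_value(predicted, module, universe_size):
    """One-sided Fisher exact p-value for the overlap of two gene sets.

    Assumption: both sets are drawn from a common universe of
    `universe_size` genes. This is the standard test, not the
    modified version mentioned in the abstract.
    """
    predicted, module = set(predicted), set(module)
    n, m = len(predicted), len(module)
    if max(n, m) > universe_size:
        raise ValueError("gene sets cannot be larger than the universe")

    k = len(predicted & module)  # observed overlap
    total = comb(universe_size, n)  # ways to choose n genes from the universe

    # Probability of an overlap of at least k under random sampling:
    # sum the hypergeometric probabilities for overlaps k, k+1, ..., min(n, m).
    tail = sum(
        comb(m, i) * comb(universe_size - m, n - i)
        for i in range(k, min(n, m) + 1)
    )
    return tail / total
```

For example, `overlap_p_value({"a", "b", "c"}, {"a", "b", "d"}, 10)` tests whether sharing two of three genes is surprising in a universe of ten genes; a smaller p-value indicates a closer match between the prediction and the module than chance would explain.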