Bioinformatics and Computational Biology Program, Iowa State University, Ames, USA.
BMC Bioinformatics. 2011 Jun 13;12:233. doi: 10.1186/1471-2105-12-233.
Gene regulatory networks play essential roles in living organisms: they control growth, keep internal metabolism running, and respond to external environmental changes. Understanding the connections and the activity levels of regulators is important for studying gene regulatory networks. Relevance score based algorithms can infer genome-wide gene regulatory networks from transcriptome data, but they are prone to false positive results. Transcription factor activities (TFAs) quantitatively reflect the ability of a transcription factor to regulate its target genes. However, classic relevance score based gene regulatory network reconstruction algorithms use models that do not include the TFA layer, thus missing a key regulatory element.
This work integrates TFA prediction algorithms with relevance score based network reconstruction algorithms to reconstruct gene regulatory networks with improved accuracy over classic relevance score based algorithms. This method is called Gene expression and Transcription factor activity based Relevance Network (GTRNetwork). Different combinations of TFA prediction algorithms and relevance score functions were applied to find the most efficient combination. When the integrated GTRNetwork method was applied to E. coli data, the reconstructed genome-wide gene regulatory network predicted 381 new regulatory links. The reconstructed network, including the predicted new regulatory links, shows promising biological significance. Many of the new links are verified by known TF binding site information, and many other links can be verified from the literature and databases such as EcoCyc. The reconstructed gene regulatory network was then applied to a recent transcriptome analysis of E. coli during isobutanol stress. In addition to the 16 significantly changed TFAs detected in the original paper, another 7 significantly changed TFAs were detected using our reconstructed network.
The GTRNetwork algorithm introduces the hidden TFA layer into classic relevance score based gene regulatory network reconstruction processes. Integrating the TFA biological information with regulatory network reconstruction algorithms both improves detection of new links and reduces the rate of false positives. The application of GTRNetwork to E. coli gene transcriptome data gives a set of potential regulatory links with promising biological significance for isobutanol stress and other conditions.
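The general idea of scoring genes against TFA profiles rather than TF expression profiles can be illustrated with a minimal sketch. This is not the GTRNetwork implementation: the abstract does not specify which TFA prediction algorithm or relevance score function was selected, so the least-squares TFA estimate, the absolute Pearson correlation score, the threshold, and all function names below are assumptions chosen for illustration only.

```python
import numpy as np

def estimate_tfa(expr, connectivity):
    """Assumed stand-in for a TFA prediction step.

    expr:         genes x samples expression matrix
    connectivity: genes x TFs matrix of known regulatory strengths (0 = no known link)
    Solves connectivity @ tfa ~= expr in the least-squares sense.
    """
    tfa, *_ = np.linalg.lstsq(connectivity, expr, rcond=None)
    return tfa  # TFs x samples

def relevance_scores(expr, tfa):
    """Assumed relevance score: absolute Pearson correlation between
    each gene's expression profile and each TF's activity profile.
    Profiles with zero variance would divide by zero and are not handled here.
    """
    ge = expr - expr.mean(axis=1, keepdims=True)
    ta = tfa - tfa.mean(axis=1, keepdims=True)
    ge = ge / np.linalg.norm(ge, axis=1, keepdims=True)
    ta = ta / np.linalg.norm(ta, axis=1, keepdims=True)
    return np.abs(ge @ ta.T)  # genes x TFs

def predict_new_links(expr, connectivity, threshold=0.9):
    """Return (gene, TF) index pairs that score above the (hypothetical)
    threshold and are not already present in the known connectivity."""
    tfa = estimate_tfa(expr, connectivity)
    scores = relevance_scores(expr, tfa)
    known = connectivity != 0
    genes, tfs = np.where((scores >= threshold) & ~known)
    return [(int(g), int(t)) for g, t in zip(genes, tfs)]
```

In a sketch like this, a gene whose expression follows a TF's inferred activity, but which has no known link to that TF, is reported as a candidate new regulatory link; the choice of TFA estimator and score function is exactly what the study varies to find the most efficient combination.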