Tuncay Kagan, Ensman Lisa, Sun Jingjun, Haidar Alaa Abi, Stanley Frank, Trelinski Michael, Ortoleva Peter
Center for Cell and Virus Theory, Indiana University Bloomington, IN 47405, USA.
In Silico Biol. 2007;7(1):21-34.
Discovering a transcriptional regulatory network (TRN) from a single source of information does not appear feasible: the information is insufficient, and the resulting TRNs are spurious or incomplete. A methodology, TRND, is developed that integrates a preliminary TRN, gene expression data and gene ontology to discover TRNs. The method is applied to a comprehensive set of B cell expression data and a preliminary TRN that included 1,335 genes, 443 transcription factors (TFs) and 4,032 gene/TF interactions. Predictions were obtained for 443 TFs and 9,589 genes. Of the 4,247,927 possible gene/TF interactions, 14,616 scored higher than the imposed threshold. Results for three TFs, E2F-4, p130 and c-Myc, were examined in more detail to assess the accuracy of the integrated methodology. Although the training sets for E2F-4 and p130 were rather limited, the activities of these two TFs were found to be highly correlated, and a large set of coregulated genes was predicted. These predictions were confirmed with published experimental results not used in the training set. A similar test was run for the c-Myc TF using the comprehensive resource www.myccancergene.org. In addition, correlations between the expression of genes that encode TFs and the activities of those TFs were calculated; they showed that the assumption that TF activity correlates with the expression of its encoding gene might be misleading. The constructed B cell TRN, along with scores for the individual methodologies and for the integrated approach, is available at systemsbiology.indiana.edu/trndresults.
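The counts quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of TRND: the variable names are hypothetical, and it assumes that the "possible gene/TF interactions" are every pairing of the 443 TFs with the 9,589 genes.

```python
# Figures taken from the abstract; the variable names are illustrative only.
n_tfs = 443                  # transcription factors with predictions
n_genes = 9_589              # genes with predictions
n_above_threshold = 14_616   # gene/TF pairs scoring above the imposed threshold

# Assumption: every TF can in principle be paired with every gene.
possible_pairs = n_tfs * n_genes

# Share of all candidate pairs that passed the threshold.
fraction_retained = n_above_threshold / possible_pairs

print(f"possible gene/TF pairs: {possible_pairs:,}")
print(f"fraction above threshold: {fraction_retained:.2%}")
```

Under that assumption the product reproduces the 4,247,927 pairs stated in the abstract, and roughly 0.34% of them pass the threshold.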