Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China.
Bioinformatics. 2014 Mar 15;30(6):823-30. doi: 10.1093/bioinformatics/btt596. Epub 2013 Nov 5.
A limited cohort of transcription factors is capable of structuring diverse gene-expression patterns. Transcriptional cooperativity (TC) is deemed to be the main mechanism underlying the complexity and precision of regulatory programs. Although many data types generated by numerous experimental technologies have been used in attempts to understand combinatorial transcriptional regulation, a complementary computational approach that can integrate diverse data resources and assimilate them into a biological model is still under development.
We developed a novel Bayesian approach for the integrative analysis of proteomic, transcriptomic and genomic data to identify specific TC. Model evaluation demonstrated the discriminative power of features derived from distinct data sources and showed that each is essential to model performance. Our model outperformed other classifiers and alternative methods. An application that contextualized TC within hepatocarcinogenesis revealed carcinoma-associated alterations. The derived TC networks were highly significant in capturing validated cooperative interactions and also revealed novel ones. Our methodology is the first multiple-data integration approach to predict the dynamic nature of TC. It is promising for identifying tissue- or disease-specific TC and can further facilitate the interpretation of the mechanisms underlying various physiological conditions.
Supplementary data are available at Bioinformatics online.
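The abstract does not specify the form of the Bayesian model, so the following is only a minimal sketch of how evidence from several data sources can be combined under Bayes' rule. It assumes a naive-Bayes setting in which the proteomic, transcriptomic and genomic features are conditionally independent; the function name, the prior and the likelihood ratios are all hypothetical values chosen for illustration and are not taken from the paper.

```python
import math


def posterior_probability(prior, likelihood_ratios):
    """Combine independent evidence with Bayes' rule in log-odds space.

    prior: assumed prior probability that a transcription factor pair cooperates.
    likelihood_ratios: one ratio per data source,
        P(feature | cooperative) / P(feature | not cooperative).
    """
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        # Conditional independence (an assumption) lets the log ratios add.
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical likelihood ratios for a single candidate pair.
evidence = {
    "proteomic": 4.0,
    "transcriptomic": 2.5,
    "genomic": 1.8,
}
prior = 0.01  # hypothetical prior; not a value reported in the paper

posterior = posterior_probability(prior, evidence.values())
print(f"posterior probability of cooperativity: {posterior:.3f}")
```

In this toy setting, dropping one data source simply removes its term from the sum of log ratios, which is one simple way to see how much each source contributes to the final score.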