A scalable method for discovering significant subnetworks.

Author information

Hasan Md Mahmudul, Kavurucu Yusuf, Kahveci Tamer

Publication information

BMC Syst Biol. 2013;7 Suppl 4(Suppl 4):S3. doi: 10.1186/1752-0509-7-S4-S3. Epub 2013 Oct 23.

Abstract

BACKGROUND

The study of biological networks is an essential first step toward understanding the complex functions they govern in different organisms. The topology of the interactions that define how biological networks operate is often determined through high-throughput experiments. The noisy nature of these experiments, however, can result in multiple alternative network topologies that explain the data equally well. One key step in resolving the differences is to identify the subnetworks that appear significantly more frequently in a biological network data set than expected.

METHOD

We present a method named SiS (Significant Subnetworks) that finds the subnetworks with the largest probability of appearing in a collection of biological networks. We call these the most probable subnetworks. SiS summarizes the interactions in the given collection of networks in a special template network and uses this template to guide the search for the most probable subnetworks. It computes lower and upper bounds on the score of each potential solution (i.e., the number of input networks that contain the subnetwork). As the search continues, it tightens the bounds dynamically and prunes a massive number of unpromising solutions in the process.
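
The bound-and-prune idea can be illustrated with a small sketch. This is not the SiS algorithm itself, whose details the abstract does not give. It is a toy branch-and-bound search under several assumptions: nodes carry the same labels across all input networks, edges are undirected, the target is a connected subnetwork with exactly k edges, and the "template" is simply a count of how many input networks contain each edge. The function names (build_template, support, most_frequent_subnetwork) are hypothetical.

```python
def build_template(networks):
    """Assumed template: for each edge, the number of input networks containing it."""
    template = {}
    for net in networks:
        for edge in net:
            template[edge] = template.get(edge, 0) + 1
    return template


def support(sub, networks):
    """Score of a subnetwork: how many input networks contain all of its edges."""
    return sum(1 for net in networks if sub <= net)


def most_frequent_subnetwork(networks, k):
    """Return (support, edges) of a connected k-edge subnetwork with maximal support."""
    # Normalize: each network is a set of undirected edges (frozensets of 2 nodes).
    networks = [frozenset(frozenset(e) for e in net) for net in networks]
    template = build_template(networks)
    # Try frequent edges first so a good lower bound is found early.
    edges = sorted(template, key=lambda e: (-template[e], sorted(e)))
    best = (0, frozenset())   # lower bound: best complete solution so far
    seen = set()              # edge sets already expanded

    def grow(sub, nodes, sup):
        nonlocal best
        if len(sub) == k:
            if sup > best[0]:
                best = (sup, sub)
            return
        for e in edges:
            if e in sub or not (e & nodes):
                continue      # skip edges already used or not touching the subnetwork
            # Upper bound: adding an edge can never raise the support above
            # the current support or above that edge's own frequency.
            if min(sup, template[e]) <= best[0]:
                continue      # prune: cannot beat the best known solution
            new = sub | {e}
            if new in seen:
                continue
            seen.add(new)
            new_sup = support(new, networks)
            if new_sup > best[0]:
                grow(new, nodes | e, new_sup)

    for e in edges:
        if template[e] <= best[0]:
            break             # remaining edges are no more frequent; stop
        start = frozenset([e])
        seen.add(start)
        grow(start, set(e), template[e])

    sup, sub = best
    return sup, sorted(tuple(sorted(e)) for e in sub)


# Four toy networks over shared node labels.
toy_networks = [
    [("a", "b"), ("b", "c"), ("c", "d")],
    [("a", "b"), ("b", "c"), ("d", "e")],
    [("a", "b"), ("b", "c"), ("c", "d")],
    [("b", "c"), ("a", "e")],
]
print(most_frequent_subnetwork(toy_networks, 2))  # (3, [('a', 'b'), ('b', 'c')])
```

Pruning here relies on support being anti-monotone: a larger subnetwork is never contained in more networks than any of its parts. The best complete solution found so far acts as the lower bound, and it tightens as the search proceeds.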

RESULTS AND CONCLUSIONS

Experiments on comprehensive data sets show that the most probable subnetworks found by SiS in a large collection of networks are also very frequent. In the metabolic network data set, we found that subnetworks in eukaryotes are more conserved than those in prokaryotes. SiS also scales well to large data sets and large subnetworks, and runs orders of magnitude faster than an existing method, MULE. Depending on the size of the subnetwork, the running time of SiS on a given data set ranges from a few seconds to minutes; on the same data set, MULE either runs for hours or does not finish within days. In the human transcription regulatory network data set, SiS finds a large backbone subnetwork that appears frequently regardless of cell type.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ecd7/3854656/db19a47fbed5/1752-0509-7-S4-S3-1.jpg
