Data-driven network alignment.

Author information

Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN, United States of America.

Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, United States of America.

Publication information

PLoS One. 2020 Jul 2;15(7):e0234978. doi: 10.1371/journal.pone.0234978. eCollection 2020.

Abstract

In this study, we address the problem of biological network alignment (NA), which aims to find a node mapping between species' molecular networks that uncovers similar network regions, thus allowing functional knowledge to be transferred between the aligned nodes. We provide evidence that current NA methods, which assume that topologically similar nodes (i.e., nodes whose network neighborhoods are isomorphic-like) have high functional relatedness, do not actually end up aligning functionally related nodes. That is, we show that the current topological similarity assumption does not hold well. Consequently, we argue that a paradigm shift is needed in how the NA problem is approached. We therefore redefine NA as a data-driven framework, called TARA (data-driven NA), which attempts to learn the relationship between topological relatedness and functional relatedness without assuming that topological relatedness corresponds to topological similarity. TARA makes no assumptions about which nodes should be aligned, distinguishing it from existing NA methods. Specifically, TARA trains a classifier to predict whether two nodes from different networks are functionally related based on their network topological patterns (features). We find that TARA is able to make accurate predictions. TARA then takes each pair of nodes predicted as related to be part of an alignment. Like traditional NA methods, TARA uses this alignment for the across-species transfer of functional knowledge. As currently implemented, TARA uses topological but not protein sequence information for functional knowledge transfer. In this context, we find that TARA outperforms existing state-of-the-art NA methods that also use only topological information, WAVE and SANA, and even outperforms or complements PrimAlign, a state-of-the-art NA method that uses both topological and sequence information. Hence, adding sequence information to TARA, which is our future work, is likely to further improve its performance. The software and data are available at http://www.nd.edu/~cone/TARA/.
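
The pipeline described in the abstract (compute topological features, train a classifier on node pairs, keep the pairs predicted as related, then transfer functional knowledge across them) can be illustrated with a minimal sketch. This is not the TARA implementation: the features (degree and clustering coefficient), the logistic-regression classifier, the toy networks, the training labels, and the function labels `f1` and `f2` are all assumptions chosen to keep the example small, and every function name is hypothetical. For brevity the sketch also scores the same pairs it was trained on, whereas a real evaluation would hold out data.

```python
import math
from itertools import combinations

def node_features(adj, v):
    """Toy topological features (an assumption): degree and clustering coefficient."""
    nbrs = sorted(adj[v])
    k = len(nbrs)
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    clustering = 2.0 * links / (k * (k - 1)) if k > 1 else 0.0
    return [float(k), clustering]

def pair_features(adj1, u, adj2, v):
    # Concatenate rather than subtract, so the model can learn any relationship
    # between the two topologies, not only "similar means related".
    return node_features(adj1, u) + node_features(adj2, v)

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(model, x):
    """Predicted probability that the node pair with features x is related."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train(X, y, lr=0.1, epochs=500):
    """Logistic regression fitted by stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            g = score((w, b), x) - t  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def align(model, adj1, adj2, threshold=0.5):
    """Every cross-network pair predicted as related joins the alignment."""
    return [(u, v) for u in sorted(adj1) for v in sorted(adj2)
            if score(model, pair_features(adj1, u, adj2, v)) >= threshold]

def transfer(alignment, annotations2):
    """Predict each network-1 node's functions from its aligned network-2 partners."""
    predicted = {}
    for u, v in alignment:
        predicted.setdefault(u, set()).update(annotations2.get(v, set()))
    return predicted

# Hypothetical toy data: a triangle with a tail (network 1) and a star (network 2).
adj1 = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
adj2 = {"w": {"x", "y", "z"}, "x": {"w"}, "y": {"w"}, "z": {"w"}}
related = {("a", "w"), ("b", "w"), ("c", "w")}  # made-up training labels
pairs = [(u, v) for u in sorted(adj1) for v in sorted(adj2)]
X = [pair_features(adj1, u, adj2, v) for u, v in pairs]
y = [1.0 if p in related else 0.0 for p in pairs]

model = train(X, y)
alignment = align(model, adj1, adj2)
predicted = transfer(alignment, {"w": {"f1"}, "x": {"f2"}})
print(alignment)
print(predicted)
```

Because the alignment is simply the set of pairs the classifier accepts, it can be many-to-many: one node may be aligned to several nodes in the other network, and `transfer` pools the labels of all its partners.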
