Anti-symmetric framework for balanced learning of protein-protein interactions.

Author information

School of Modern Posts, Nanjing University of Posts and Telecommunications, Nanjing 210023, China.

School of Artificial Intelligence, Jilin University, Changchun 130012, China.

Publication information

Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae603.

Abstract

MOTIVATION

Protein-protein interactions (PPIs) are essential for the regulation and facilitation of virtually all biological processes. Computational tools, particularly those based on deep learning, are preferred for the efficient prediction of PPIs. Despite recent progress, two challenges remain unresolved: (i) the imbalanced nature of PPI characteristics is often ignored and (ii) there exists a high computational cost associated with capturing long-range dependencies within protein data, typically exhibiting quadratic complexity relative to the length of the protein sequence.

RESULTS

Here, we propose an anti-symmetric graph learning model, BaPPI, for the balanced prediction of PPIs and extrapolation of the involved patterns in PPI networks. In BaPPI, the contextualized information of protein data is handled efficiently by an attention-free mechanism built on a recurrent convolution operator. An anti-symmetric graph convolutional network models the uneven distribution within PPI networks, aiming to learn a more robust and balanced representation of the relationships between proteins. Finally, the model is trained with an asymmetric loss. Experimental results on classical benchmark datasets show that BaPPI outperforms four state-of-the-art PPI prediction methods. In terms of Micro-F1, BaPPI exceeds the second-best method by 6.5% on SHS27K and 5.3% on SHS148K. Further analysis of generalization ability and of the patterns of predicted PPIs also demonstrates the model's generalizability and robustness to the imbalanced nature of PPI datasets.
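The abstract names an asymmetric loss and the Micro-F1 metric without giving their exact form. The sketch below illustrates both under stated assumptions: PPI prediction is treated as multi-label classification over interaction types, and the loss follows the commonly used focal-style asymmetric formulation, in which negative labels receive a larger focusing exponent and a probability margin. The function names and the parameters `gamma_pos`, `gamma_neg` and `margin` are hypothetical and are not taken from the BaPPI source code.

```python
import math


def asymmetric_loss(probs, labels, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Mean asymmetric loss over one sample's per-label probabilities.

    Assumed focal-style form: negatives get a larger focusing exponent
    and a probability margin, so the many easy negatives in an
    imbalanced label set contribute little to the total.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        if y == 1:
            # Positive label: (optionally focused) cross-entropy term.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative label: shift the probability down by the margin,
            # which zeroes the term for very confident negatives.
            p_shift = max(p - margin, 0.0)
            total += -(p_shift ** gamma_neg) * math.log(max(1.0 - p_shift, eps))
    return total / len(probs)


def micro_f1(y_true, y_pred):
    """Micro-F1: pool TP/FP/FN over all samples and labels, then take F1."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy data: two protein pairs, three hypothetical interaction types.
    y_true = [[1, 0, 1], [0, 1, 0]]
    y_pred = [[1, 0, 0], [0, 1, 1]]
    print("micro-F1:", micro_f1(y_true, y_pred))
    print("loss:", asymmetric_loss([0.9, 0.04], [1, 0]))
```

Because Micro-F1 pools counts across all labels before computing precision and recall, frequent interaction types dominate the score; a loss that suppresses easy negatives is one way to keep rare positive labels from being drowned out during training.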

AVAILABILITY AND IMPLEMENTATION

The source code of this work is publicly available at https://github.com/ttan6729/BaPPI.


