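An interpretable deep geometric learning model to predict the effects of mutations on protein-protein interactions using large-scale protein language model.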

Zhang Caiya, Sun Yan, Hu Pingzhao

Department of Computer Science, Western University, London, ON, Canada.

Department of Computer Science, University of Manitoba, Winnipeg, MB, Canada.

J Cheminform. 2025 Mar 21;17(1):35. doi: 10.1186/s13321-025-00979-5.

Protein-protein interactions (PPIs) are central to signaling pathways and immune responses, and studying them can help us understand disease etiology. There is therefore a significant need for efficient, rapid, automated approaches to predict changes in PPIs. In recent years, deep learning techniques have increasingly been applied to predict changes in binding affinity between an original protein complex and its mutant variants. In particular, graph neural networks (GNNs) have gained prominence for their ability to learn representations of protein-protein complexes. However, conventional GNNs have mainly concentrated on capturing local features, often disregarding interactions among distant elements that may hold important information. In this study, we developed a transformer-based graph neural network to extract features of the mutant segment from the three-dimensional structure of protein-protein complexes. By capturing both local and global features, the approach gives a more comprehensive view of these intricate relationships and thus promises more accurate predictions of binding affinity changes. To enhance the representation capability of protein features, we incorporate a large-scale pre-trained protein language model into our approach and employ the global protein feature it provides. On test and validation cases, the proposed model predicts mutation-induced changes in binding affinity with a root mean square error of 1.10 and a Pearson correlation coefficient of nearly 0.71. Our experiments on all five datasets, including both single-mutation and multiple-mutation cases, show that our model outperforms four state-of-the-art baseline methods, and its efficacy was evaluated through comprehensive experiments.
Our study introduces a transformer-based graph neural network approach to accurately predict changes in protein-protein interactions (PPIs). By integrating local and global features and leveraging pre-trained protein language models, our model outperforms state-of-the-art methods across diverse datasets. The results of this study can provide new perspectives for studying immune responses and disease etiology related to protein mutations. Furthermore, this approach may contribute to other biological or biochemical studies related to PPIs.

Scientific contribution: Our scientific contribution lies in the development of a novel transformer-based graph neural network tailored to predict changes in protein-protein interactions (PPIs) with excellent accuracy. By integrating both local and global features extracted from the three-dimensional structure of protein-protein complexes, and by leveraging the rich representations provided by pre-trained protein language models, our approach surpasses existing methods across diverse datasets. Our findings may offer novel insights into complex disease etiology associated with protein mutations. The tool may be applicable to various biological and biochemical investigations involving protein mutations.
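The two figures quoted in the abstract, root mean square error and Pearson correlation coefficient, are standard regression metrics comparing predicted and measured binding affinity changes. The sketch below shows how they are usually computed. It is only an illustration, not the authors' evaluation code: the function names `rmse` and `pearson` and the sample values are hypothetical, and the abstract does not state the unit of the reported error.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_t * var_p)


if __name__ == "__main__":
    # Made-up numbers for illustration only; these are not data from the paper.
    measured = [0.5, -1.2, 2.3, 0.0, 1.1]
    predicted = [0.7, -0.9, 1.8, 0.4, 1.0]
    print("RMSE:", round(rmse(measured, predicted), 3))
    print("Pearson r:", round(pearson(measured, predicted), 3))
```

A lower RMSE means predictions are closer to the measured values in absolute terms, while a Pearson coefficient closer to 1 means predictions rank and scale with the measurements more consistently. The two can disagree, which is why both are usually reported together.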
