A multimodal model for protein function prediction.

Authors

Mao Yu, Xu WenHui, Shun Yue, Chai LongXin, Xue Lei, Yang Yong, Li Mei

Affiliations

State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, 430062, Hubei, China.

Publication details

Sci Rep. 2025 Mar 26;15(1):10465. doi: 10.1038/s41598-025-94612-y.

DOI:10.1038/s41598-025-94612-y
PMID:40140535
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11947276/
Abstract

Protein function, which is determined by sequence, structure, and other characteristics, plays a crucial role in an organism's performance. Existing protein function prediction methods mainly rely on sequence data and often ignore structural properties that are crucial for accurate prediction. Protein structure provides richer spatial and functional insights, which can significantly improve prediction accuracy. In this work, we propose a multi-modal protein function prediction model (MMPFP) that integrates protein sequence and structure information through the use of GCN, CNN, and Transformer models. We validate the model using the PDBest dataset, demonstrating that MMPFP outperforms traditional single-modal models in the molecular function (MF), biological process (BP), and cellular component (CC) prediction tasks. Specifically, MMPFP achieved AUPR scores of 0.693, 0.355, and 0.478; [Formula: see text] scores of 0.752, 0.629, and 0.691; and [Formula: see text] scores of 0.336, 0.488, and 0.459, showing a 3-5% improvement over single-modal models. Additionally, ablation studies confirm the effectiveness of the Transformer module within the GCN branch, further validating MMPFP's superior performance over existing methods. This multi-modal approach offers a more accurate and comprehensive framework for protein function prediction, addressing key limitations of current models.
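
The abstract describes fusing a structure branch (GCN) with a sequence branch (CNN) before predicting function labels. The sketch below is a minimal NumPy illustration of that general idea, not the authors' MMPFP implementation: the layer sizes, the random weights, the pooling choices, the toy contact map, and every function and parameter name are assumptions made for illustration, and the Transformer module mentioned in the abstract is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])                # add self-loops
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))   # degrees >= 1 thanks to self-loops
    a_norm = a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
    return np.maximum(a_norm @ feats @ weight, 0.0)

def conv1d_layer(feats, kernel):
    """Valid 1D convolution along the residue axis, followed by ReLU.

    feats: (L, d_in); kernel: (k, d_in, d_out).
    """
    k = kernel.shape[0]
    windows = np.stack([feats[i:i + k] for i in range(feats.shape[0] - k + 1)])
    return np.maximum(np.einsum("lkd,kdo->lo", windows, kernel), 0.0)

def predict_go_terms(one_hot_seq, contact_map, params):
    """Fuse a structure embedding and a sequence embedding, then score each label."""
    # Structure branch: residues are graph nodes, contacts are edges.
    struct_emb = gcn_layer(contact_map, one_hot_seq, params["w_gcn"]).mean(axis=0)
    # Sequence branch: local motifs via convolution, max-pooled over positions.
    seq_emb = conv1d_layer(one_hot_seq, params["w_cnn"]).max(axis=0)
    fused = np.concatenate([struct_emb, seq_emb])
    logits = fused @ params["w_out"] + params["b_out"]
    return 1.0 / (1.0 + np.exp(-logits))              # independent sigmoid per label

# Toy inputs (hypothetical sizes): 30 residues, 20 amino-acid types, 5 labels.
n_res, n_aa, hidden, n_terms, k = 30, 20, 16, 5, 3
one_hot_seq = np.eye(n_aa)[rng.integers(0, n_aa, size=n_res)]
upper = np.triu(rng.random((n_res, n_res)) < 0.1, 1)
contact_map = (upper + upper.T).astype(float)         # symmetric, zero diagonal

params = {
    "w_gcn": 0.1 * rng.standard_normal((n_aa, hidden)),
    "w_cnn": 0.1 * rng.standard_normal((k, n_aa, hidden)),
    "w_out": 0.1 * rng.standard_normal((2 * hidden, n_terms)),
    "b_out": np.zeros(n_terms),
}

probs = predict_go_terms(one_hot_seq, contact_map, params)
print(probs.shape)
```

Function prediction is a multi-label problem, since one protein can carry many annotations at once, which is why the sketch ends in a per-label sigmoid rather than a softmax. Concatenation is only one simple way to fuse the two embeddings.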

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/21a9/11947276/835c27adb409/41598_2025_94612_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/21a9/11947276/85256985775d/41598_2025_94612_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/21a9/11947276/0cb573b51015/41598_2025_94612_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/21a9/11947276/7d59cfea64c5/41598_2025_94612_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/21a9/11947276/348371e6d7dc/41598_2025_94612_Fig5_HTML.jpg

Similar articles

1
A multimodal model for protein function prediction.
Sci Rep. 2025 Mar 26;15(1):10465. doi: 10.1038/s41598-025-94612-y.
2
TAWFN: a deep learning framework for protein function prediction.
Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae571.
3
MVGNN-PPIS: A novel multi-view graph neural network for protein-protein interaction sites prediction based on Alphafold3-predicted structures and transfer learning.
Int J Biol Macromol. 2025 Apr;300:140096. doi: 10.1016/j.ijbiomac.2025.140096. Epub 2025 Jan 21.
4
GTPLM-GO: Enhancing Protein Function Prediction Through Dual-Branch Graph Transformer and Protein Language Model Fusing Sequence and Local-Global PPI Information.
Int J Mol Sci. 2025 Apr 25;26(9):4088. doi: 10.3390/ijms26094088.
5
RGCNPPIS: A Residual Graph Convolutional Network for Protein-Protein Interaction Site Prediction.
IEEE/ACM Trans Comput Biol Bioinform. 2024 Nov-Dec;21(6):1676-1684. doi: 10.1109/TCBB.2024.3410350. Epub 2024 Dec 10.
6
A CNN-CBAM-BIGRU model for protein function prediction.
Stat Appl Genet Mol Biol. 2024 Jul 1;23(1). doi: 10.1515/sagmb-2024-0004. eCollection 2024 Jan 1.
7
MMAgentRec, a personalized multi-modal recommendation agent with large language model.
Sci Rep. 2025 Apr 8;15(1):12062. doi: 10.1038/s41598-025-96458-w.
8
MMGCN: Multi-modal multi-view graph convolutional networks for cancer prognosis prediction.
Comput Methods Programs Biomed. 2024 Dec;257:108400. doi: 10.1016/j.cmpb.2024.108400. Epub 2024 Sep 6.
9
Deep learning model for protein multi-label subcellular localization and function prediction based on multi-task collaborative training.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae568.
10
SST-ResNet: A Sequence and Structure Information Integration Model for Protein Property Prediction.
Int J Mol Sci. 2025 Mar 19;26(6):2783. doi: 10.3390/ijms26062783.