
Integrating ESM-2 and Graph Neural Networks with AlphaFold-2 Structures for Enhanced Protein Function Prediction.

Authors

Nguyen Thi-Tuyen, Jiang Zhuocheng, Nguyen Van-Nui, Le Nguyen Quoc Khanh, Chua Matthew Chin Heng

Affiliations

University of Information and Communication Technology, Thai Nguyen University, Thai Nguyen 25000, Viet Nam.

Department of Biomedical Informatics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore.

Publication information

ACS Omega. 2025 Aug 16;10(33):38103-38111. doi: 10.1021/acsomega.5c05484. eCollection 2025 Aug 26.

DOI: 10.1021/acsomega.5c05484
PMID: 40893250
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12391975/
Abstract

Protein function prediction is essential for elucidating biological processes and accelerating drug discovery. However, the vast number of unannotated protein sequences and the limited availability of experimentally validated functional data remain major challenges. Although deep learning models based on protein sequences or protein-protein interaction networks have shown promise, their performance is still restricted, particularly for proteins without interaction data. Furthermore, many existing approaches treat sequence and structural information separately, potentially resulting in suboptimal feature representations. To address these limitations, we propose an improved graph-based framework that integrates two key innovations: (i) ESM-2, a state-of-the-art protein language model, to generate semantically rich sequence embeddings; and (ii) a hybrid pooling mechanism within graph convolutional blocks to better capture both global and local structural features from AlphaFold2-predicted structures. Experiments on the human proteome demonstrate that our model consistently outperforms existing methods in predicting molecular function, cellular component, and biological process annotations. These findings highlight the advantages of combining advanced sequence representations with enhanced structural learning for accurate and generalizable protein function prediction.
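The abstract does not spell out the architecture, so the following is only a minimal sketch of the general idea: per-residue features are propagated over a residue contact graph derived from a predicted structure, then summarized with a readout that mixes a global and a local view. Everything concrete here is an assumption for illustration: the 10 Å C-alpha cutoff, the tiny 2-dimensional vectors standing in for ESM-2 embeddings, the identity weight matrix, and the choice of mean-plus-max concatenation as the "hybrid" pooling. It is not the authors' implementation.

```python
import math


def contact_graph(coords, cutoff=10.0):
    """Adjacency matrix linking residues whose C-alpha atoms are within
    `cutoff` angstroms. The diagonal is 1 (self-loops), since distance 0
    always passes the cutoff. The cutoff value is an assumption."""
    n = len(coords)
    return [
        [1.0 if math.dist(coords[i], coords[j]) <= cutoff else 0.0 for j in range(n)]
        for i in range(n)
    ]


def gcn_layer(adj, feats, weight):
    """One graph convolution, ReLU(D^-1 A X W): average each residue's
    neighbourhood features, apply a linear map, then a ReLU."""
    dim_in = len(feats[0])
    dim_out = len(weight[0])
    out = []
    for row in adj:
        deg = sum(row)  # >= 1 because of the self-loop
        agg = [
            sum(row[j] * feats[j][k] for j in range(len(feats))) / deg
            for k in range(dim_in)
        ]
        out.append(
            [max(0.0, sum(agg[k] * weight[k][m] for k in range(dim_in)))
             for m in range(dim_out)]
        )
    return out


def hybrid_readout(node_feats):
    """Concatenate mean pooling (global summary) with max pooling
    (strongest per-feature local signal). An illustrative stand-in for
    the paper's hybrid pooling, not a reproduction of it."""
    n = len(node_feats)
    dim = len(node_feats[0])
    mean_pool = [sum(r[k] for r in node_feats) / n for k in range(dim)]
    max_pool = [max(r[k] for r in node_feats) for k in range(dim)]
    return mean_pool + max_pool


# Hypothetical toy protein: four residues, the last one far from the rest.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
# Placeholder per-residue vectors; real ESM-2 embeddings are much wider.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [4.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

adj = contact_graph(coords)
hidden = gcn_layer(adj, embeddings, identity)
graph_vec = hybrid_readout(hidden)
print(graph_vec)
```

In this toy case the isolated fourth residue keeps its own features after the convolution, and the max-pooled half of `graph_vec` preserves that strong local signal while the mean-pooled half dilutes it. That contrast is the motivation usually given for combining the two readouts. A classifier over Gene Ontology terms would then take `graph_vec` as input.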


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/346c/12391975/ae1f28e3d098/ao5c05484_0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/346c/12391975/1783c7415cae/ao5c05484_0001.jpg

Similar articles

1
Integrating ESM-2 and Graph Neural Networks with AlphaFold-2 Structures for Enhanced Protein Function Prediction.
ACS Omega. 2025 Aug 16;10(33):38103-38111. doi: 10.1021/acsomega.5c05484. eCollection 2025 Aug 26.
2
Prescription of Controlled Substances: Benefits and Risks
3
Short-Term Memory Impairment
4
Anti-Symmetric Molecular Graph Learning Approach With Residual Adaptive Network Based Fuzzy Inference System for Lethal Dose Forecasting Problem.
J Comput Chem. 2025 Jul 15;46(19):e70176. doi: 10.1002/jcc.70176.
5
iACP-DPNet: a dual-pooling causal dilated convolutional network for interpretable anticancer peptide identification.
Funct Integr Genomics. 2025 Jul 4;25(1):147. doi: 10.1007/s10142-025-01641-x.
6
Distilling knowledge from graph neural networks trained on cell graphs to non-neural student models.
Sci Rep. 2025 Aug 10;15(1):29274. doi: 10.1038/s41598-025-13697-7.
7
Combining protein sequences and structures with transformers and equivariant graph neural networks to predict protein function.
Bioinformatics. 2023 Jun 30;39(39 Suppl 1):i318-i325. doi: 10.1093/bioinformatics/btad208.
8
A Hybrid Ensemble End-to-End Neural Network for Accurate Protein-Protein Interactions Prediction.
IEEE Trans Comput Biol Bioinform. 2025 Jul 29;PP. doi: 10.1109/TCBBIO.2025.3593469.
9
A unified graph-based approach for protein function prediction using AlphaFold structures and sequence features.
Comput Biol Chem. 2025 Aug 14;120(Pt 1):108609. doi: 10.1016/j.compbiolchem.2025.108609.
10
Soft graph clustering for single-cell RNA sequencing data.
BMC Bioinformatics. 2025 Jul 25;26(1):195. doi: 10.1186/s12859-025-06231-z.
