Wang Han, Ren Zilin, Sun Jinghong, Chen Yongbing, Bo Xiaochen, Xue JiGuo, Gao Jingyang, Ni Ming
College of Information Science and Technology, Beijing University of Chemical Technology, No. 15 North Third Ring East Road, Chaoyang District, Beijing 100029, China.
Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, State Key Laboratory of Pathogen and Biosecurity, Key Laboratory of Jilin Province for Zoonosis Prevention and Control, Changchun 130122, China.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae579.
Deriving protein function from protein sequences poses a significant challenge due to the intricate relationship between sequence and function. Deep learning has made remarkable strides in predicting sequence-function relationships. However, models tailored for specific tasks or protein types encounter difficulties when using transfer learning across domains, because protein function depends heavily on structural characteristics rather than sequence information alone. Consequently, there is a pressing need for a model capable of capturing shared features among diverse sequence-function mapping tasks to address this generalization issue. In this study, we explore the potential of Model-Agnostic Meta-Learning (MAML) combined with a protein language model, Evolutionary Scale Modeling (ESM), to tackle this challenge. Our approach trains the architecture on five out-of-domain deep mutational scanning (DMS) datasets and evaluates its performance across four key dimensions. Our findings demonstrate that the proposed architecture generalizes well and supports an effective few-shot learning strategy: compared to the best results, the Pearson correlation coefficient (PCC) in the final stage increased by ~0.31%. Furthermore, we leverage the trained architecture to predict binding affinity scores on the SARS-CoV-2 DMS dataset using transfer learning. Notably, training on a 500-sample subset of the Ube4b dataset improved the PCC by 0.11. These results underscore the potential of our conceptual architecture as a promising methodology for multi-task protein function prediction.
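The abstract describes pairing a protein language model (ESM) as a sequence encoder with MAML trained across multiple DMS tasks. The sketch below is an illustrative, hedged outline of such a setup, assuming precomputed (frozen) ESM embeddings as inputs and a first-order MAML approximation; the names RegressionHead, inner_adapt, meta_train, and tasks are hypothetical placeholders and do not reflect the authors' implementation.

```python
# Illustrative sketch only: first-order MAML over a small regression head on top
# of frozen protein embeddings (e.g. ESM representations). Not the authors' code.
import copy
import torch
import torch.nn as nn

class RegressionHead(nn.Module):
    """Maps a fixed-size sequence embedding to a scalar fitness/affinity score."""
    def __init__(self, embed_dim=1280, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )
    def forward(self, x):
        return self.net(x).squeeze(-1)

def inner_adapt(model, x_support, y_support, lr=1e-2, steps=5):
    """Clone the meta-model and take a few gradient steps on one task's support set."""
    task_model = copy.deepcopy(model)
    opt = torch.optim.SGD(task_model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(task_model(x_support), y_support).backward()
        opt.step()
    return task_model

def meta_train(model, tasks, meta_lr=1e-3, epochs=100):
    """First-order MAML: query-set gradients of each adapted task model
    are accumulated and applied to the shared initialization."""
    meta_opt = torch.optim.Adam(model.parameters(), lr=meta_lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        meta_opt.zero_grad()
        for x_s, y_s, x_q, y_q in tasks:          # one DMS dataset per task
            adapted = inner_adapt(model, x_s, y_s)
            q_loss = loss_fn(adapted(x_q), y_q)
            grads = torch.autograd.grad(q_loss, list(adapted.parameters()))
            # First-order approximation: apply query gradients to meta-parameters.
            for p, g in zip(model.parameters(), grads):
                p.grad = g.clone() if p.grad is None else p.grad + g
        meta_opt.step()
    return model

# Toy usage with random tensors standing in for ESM sequence embeddings.
torch.manual_seed(0)
tasks = [(torch.randn(16, 1280), torch.randn(16),
          torch.randn(16, 1280), torch.randn(16)) for _ in range(5)]
meta_model = meta_train(RegressionHead(), tasks, epochs=10)
```

After meta-training, the shared initialization could be fine-tuned on a small support set from a new assay (e.g., a SARS-CoV-2 binding-affinity DMS) and evaluated by PCC on held-out variants, mirroring the few-shot transfer setting described above.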