Improving protein succinylation sites prediction using embeddings from protein language model.

Affiliations

Department of Computer Science, Michigan Technological University, Houghton, MI, USA.

Department of Informatics, Bioinformatics and Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany.

Publication information

Sci Rep. 2022 Oct 8;12(1):16933. doi: 10.1038/s41598-022-21366-2.

Abstract

Protein succinylation is an important post-translational modification (PTM) responsible for many vital metabolic activities in cells, including cellular respiration, regulation, and repair. Here, we present a novel approach that combines features from supervised word embedding with embeddings from a protein language model called ProtT5-XL-UniRef50 (hereafter termed ProtT5) in a deep learning framework to predict protein succinylation sites. To our knowledge, this is one of the first attempts to employ embeddings from a pre-trained protein language model to predict protein succinylation sites. The proposed model, dubbed LMSuccSite, achieves state-of-the-art results compared to existing methods, with performance scores of 0.36, 0.79, and 0.79 for MCC, sensitivity, and specificity, respectively. LMSuccSite is likely to serve as a valuable resource for exploration of succinylation and its role in cellular physiology and disease.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4251/9547916/eb0850ed5045/41598_2022_21366_Fig1_HTML.jpg
