Improving protein succinylation sites prediction using embeddings from protein language model.

Author information

Department of Computer Science, Michigan Technological University, Houghton, MI, USA.

Department of Informatics, Bioinformatics and Computational Biology - i12, TUM (Technical University of Munich), Boltzmannstr. 3, 85748, Garching/Munich, Germany.

Publication information

Sci Rep. 2022 Oct 8;12(1):16933. doi: 10.1038/s41598-022-21366-2.

Abstract

Protein succinylation is an important post-translational modification (PTM) responsible for many vital metabolic activities in cells, including cellular respiration, regulation, and repair. Here, we present a novel approach that combines features from supervised word embedding with embeddings from a protein language model called ProtT5-XL-UniRef50 (hereafter, ProtT5) in a deep learning framework to predict protein succinylation sites. To our knowledge, this is one of the first attempts to employ embeddings from a pre-trained protein language model to predict protein succinylation sites. The proposed model, dubbed LMSuccSite, achieves state-of-the-art results compared to existing methods, with an MCC of 0.36, a sensitivity of 0.79, and a specificity of 0.79. LMSuccSite is likely to serve as a valuable resource for exploring succinylation and its role in cellular physiology and disease.
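For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how MCC, sensitivity, and specificity are conventionally computed from binary confusion-matrix counts. The function name `binary_site_metrics` and the counts in the usage lines are hypothetical, chosen only for illustration; they are not taken from the paper or its test set.

```python
import math


def binary_site_metrics(tp, fp, tn, fn):
    """Return (mcc, sensitivity, specificity) from confusion-matrix counts.

    tp/fn count true succinylation sites predicted positive/negative;
    tn/fp count non-sites predicted negative/positive.
    """
    # Sensitivity: fraction of true sites that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-sites correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return mcc, sensitivity, specificity


# Hypothetical counts for an imbalanced set (assumption, not the paper's data).
mcc, sn, sp = binary_site_metrics(tp=79, fp=210, tn=790, fn=21)
print(f"MCC={mcc:.2f} sensitivity={sn:.2f} specificity={sp:.2f}")
```

Because MCC uses all four cells of the confusion matrix, it can remain modest on a class-imbalanced set even when sensitivity and specificity are both fairly high, which is one reason it is commonly reported alongside them.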

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4251/9547916/eb0850ed5045/41598_2022_21366_Fig1_HTML.jpg
