ProSTAGE: Predicting Effects of Mutations on Protein Stability by Using Protein Embeddings and Graph Convolutional Networks.

Affiliation

Production and R&D Center I of LSS, GenScript (Shanghai) Biotech Co., Ltd., Shanghai 200131, China.

Publication information

J Chem Inf Model. 2024 Jan 22;64(2):340-347. doi: 10.1021/acs.jcim.3c01697. Epub 2024 Jan 2.

Abstract

Protein thermodynamic stability is central to understanding the relationships among protein structure, function, and interaction. A faster and more accurate method for predicting the impact of mutations on protein stability would therefore aid protein design and the understanding of phenotypic variation. Recent studies have shown that protein embeddings are particularly powerful at modeling context-dependent sequence information in tasks such as subcellular localization, variant effect, and secondary structure prediction. Herein, we introduce ProSTAGE, a deep learning method that fuses structure and sequence embeddings to predict protein stability changes upon single point mutations. The model combines graph-based techniques with protein language models. ProSTAGE is trained on a data set almost twice as large as S2648, the most widely used data set. It consistently outperforms all existing state-of-the-art methods on mutation-effect prediction, as benchmarked on several independent data sets. Using protein embeddings as the prediction input yields better results than previous approaches, which shows the potential of protein language models for predicting the effects of mutations on proteins. ProSTAGE is implemented as a user-friendly web server.
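
As a rough illustration of the general idea of running a graph convolution over per-residue embeddings, the following is a minimal sketch and not the ProSTAGE architecture, which the abstract does not specify. Everything here is an assumption made for illustration: the contact list, the tiny random "embeddings", the single Kipf-and-Welling-style layer, the weights, and the linear readout named `predict_ddg`. A real pipeline would take embeddings from a protein language model and contacts from a 3D structure.

```python
import math
import random


def normalized_adjacency(contacts, n):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected residue contact graph."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in contacts:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def gcn_layer(a_hat, h, w):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


def predict_ddg(contacts, wt_emb, mut_emb, site, w, readout):
    """Hypothetical readout: project the change in the mutated site's
    graph-convolved features onto a weight vector to get one score."""
    a_hat = normalized_adjacency(contacts, len(wt_emb))
    h_wt = gcn_layer(a_hat, wt_emb, w)
    h_mut = gcn_layer(a_hat, mut_emb, w)
    diff = [m - x for m, x in zip(h_mut[site], h_wt[site])]
    return sum(d * r for d, r in zip(diff, readout))


# Toy inputs (all made up): 4 residues in a chain, 3-dim embeddings.
random.seed(0)
n_res, emb_dim, hidden = 4, 3, 2
contacts = [(0, 1), (1, 2), (2, 3)]
wt_emb = [[random.uniform(-1, 1) for _ in range(emb_dim)] for _ in range(n_res)]
# Simplification: only the mutated site's embedding changes. A language model
# would generally shift the embeddings of other residues as well.
mut_emb = [row[:] for row in wt_emb]
mut_emb[2] = [random.uniform(-1, 1) for _ in range(emb_dim)]
w = [[random.uniform(-1, 1) for _ in range(hidden)] for _ in range(emb_dim)]
readout = [random.uniform(-1, 1) for _ in range(hidden)]

score = predict_ddg(contacts, wt_emb, mut_emb, 2, w, readout)
print(f"untrained toy stability-change score: {score:.4f}")
```

The weights are random and untrained, so the printed score carries no physical meaning; the sketch only shows how sequence-derived node features and a structure-derived graph can enter the same computation.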

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/de12/10806799/123a1f31d2ed/ci3c01697_0001.jpg
