EvoRator2: Predicting Site-specific Amino Acid Substitutions Based on Protein Structural Information Using Deep Learning.

Affiliations

The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel.

Blavatnik School of Computer Science, Raymond & Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel.

Publication information

J Mol Biol. 2023 Jul 15;435(14):168155. doi: 10.1016/j.jmb.2023.168155. Epub 2023 May 23.

Abstract

Multiple sequence alignments (MSAs) are the workhorse of molecular evolution and structural biology research. From MSAs, the amino acids that are tolerated at each site during protein evolution can be inferred. However, little is known about the repertoire of tolerated amino acids in proteins for which few or no sequence homologs are available, such as orphan and de novo designed proteins. Here we present EvoRator2, a deep-learning algorithm trained on over 15,000 protein structures that predicts which amino acids are tolerated at any given site, based exclusively on protein structural information mined from atomic coordinate files. We show that EvoRator2 obtained satisfactory results in predicting position-specific scoring matrices (PSSMs). We further show that, on proteins with high-quality structures, EvoRator2 achieved near state-of-the-art performance in predicting the effect of mutations in deep mutational scanning (DMS) experiments, and that for certain DMS targets it outperformed state-of-the-art methods. We also show that combining EvoRator2's predictions with those of a state-of-the-art deep-learning method that accounts for the information in the MSA improved the prediction of mutation effects in DMS experiments in terms of both accuracy and stability. EvoRator2 is designed to predict which amino-acid substitutions are tolerated in proteins without many homologous sequences, including orphan and de novo designed proteins. We implemented our approach in the EvoRator web server (https://evorator.tau.ac.il).
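As a rough illustration of the statement that tolerated amino acids can be inferred from an MSA, the sketch below builds a per-site amino-acid frequency profile and a log-odds matrix from a toy alignment. This is not EvoRator2's method, which predicts such profiles from structure alone. The function names (`column_profile`, `log_odds`), the toy sequences, the pseudocount of 1, and the uniform background frequency are all assumptions made for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_profile(msa, pseudocount=1.0):
    """Return one dict of amino-acid frequencies per alignment column.

    Gaps and non-standard symbols are ignored; a pseudocount (an
    illustrative choice) keeps unseen residues from getting zero probability.
    """
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in msa if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return profile

def log_odds(profile, background=1 / 20):
    """Convert frequencies to log2 odds against a uniform background (assumed)."""
    return [
        {aa: math.log2(p / background) for aa, p in column.items()}
        for column in profile
    ]

# Hypothetical four-sequence alignment; '-' marks a gap.
toy_msa = ["MKVL", "MRVL", "MKIL", "MK-L"]
profile = column_profile(toy_msa)
pssm = log_odds(profile)
```

In this toy alignment, the fully conserved first column gives methionine a positive log-odds score and every other residue a negative one, which is the sense in which a profile records what is tolerated at a site. With few or no homologs such column counts are unavailable, which is the setting the abstract addresses.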
