Efficient Searches in Protein Sequence Space Through AI-Driven Iterative Learning.

Author information

Suárez-Martín Ignacio, Risso Valeria A, Romero-Zaliz Rocío, Sanchez-Ruiz Jose M

Affiliations

Unidad de Excelencia de Química Aplicada a Biomedicina y Medioambiente (UEQ), Departamento de Química Física, Facultad de Ciencias, Universidad de Granada, 18071 Granada, Spain.

Centro de Investigación en Tecnologías de la Información y las Telecomunicaciones (CITIC-UGR), Universidad de Granada, 18071 Granada, Spain.

Publication information

Int J Mol Sci. 2025 May 15;26(10):4741. doi: 10.3390/ijms26104741.

Abstract

The protein sequence space is vast. This fact, together with the prevalence of epistasis, hampers the engineering of novel enzymes through library screening and is a major obstacle to any attempt to predict natural protein evolution. Recently, specialized methodologies have been used to determine fitness data on 260,000 sequences for the gene of the enzyme dihydrofolate reductase and antibody affinity data for all combinations of the mutations present in the receptor-binding domain (RBD) of the Omicron strain of SARS-CoV-2 (30,000 variants). We show that upon iterative training on a total of just a few hundred variants, various state-of-the-art AI tools (multi-layer perceptron, random forest, and XGBoost algorithms) find very high fitness variants of the enzyme and predict the antibody evasion patterns of the RBD. This work provides a basis for efficient, widely applicable, low-throughput experimental approaches to assess viral protein evolution and to engineer enzymes for biotechnological applications.
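
The iterative idea in the abstract (measure a small batch, train a model, use its predictions to choose the next batch) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' pipeline: the toy landscape (`make_landscape`, with additive effects plus pairwise epistasis), the binary encoding of variants, the batch sizes, and the greedy "pick the top predicted variants" rule are all hypothetical. To keep the example dependency-free, a simple additive per-site surrogate stands in for the multi-layer perceptron, random forest, and XGBoost models named in the abstract.

```python
import itertools
import random


def make_landscape(n_sites, rng):
    """Hypothetical toy fitness: additive site effects plus pairwise epistasis."""
    additive = [rng.gauss(0, 1) for _ in range(n_sites)]
    pairs = {(i, j): rng.gauss(0, 0.5)
             for i in range(n_sites) for j in range(i + 1, n_sites)}

    def fitness(variant):
        f = sum(a for a, bit in zip(additive, variant) if bit)
        f += sum(e for (i, j), e in pairs.items() if variant[i] and variant[j])
        return f

    return fitness


def fit_additive(measured, n_sites):
    """Stand-in surrogate (not MLP/RF/XGBoost): per-site effect is the mean
    fitness of measured variants with the mutation minus the mean without."""
    effects = []
    for s in range(n_sites):
        on = [f for v, f in measured.items() if v[s]]
        off = [f for v, f in measured.items() if not v[s]]
        if on and off:
            effects.append(sum(on) / len(on) - sum(off) / len(off))
        else:
            effects.append(0.0)  # site not yet informative

    def predict(variant):
        return sum(e for e, bit in zip(effects, variant) if bit)

    return predict


def iterative_search(fitness, n_sites, n_initial=20, batch=20, rounds=5, seed=0):
    """Measure a random start set, then repeatedly retrain and measure the
    unmeasured variants with the highest predicted fitness."""
    rng = random.Random(seed)
    space = list(itertools.product((0, 1), repeat=n_sites))
    measured = {v: fitness(v) for v in rng.sample(space, n_initial)}
    for _ in range(rounds):
        predict = fit_additive(measured, n_sites)
        candidates = [v for v in space if v not in measured]
        candidates.sort(key=predict, reverse=True)
        for v in candidates[:batch]:
            measured[v] = fitness(v)  # the simulated "experiment"
    return measured, space


fitness = make_landscape(10, random.Random(1))
measured, space = iterative_search(fitness, n_sites=10)
best_found = max(measured.values())
global_best = max(fitness(v) for v in space)
print(f"measured {len(measured)} of {len(space)} variants")
print(f"best found: {best_found:.3f}, global best: {global_best:.3f}")
```

In this sketch only 120 of 1,024 simulated variants are ever "measured". How close the best measured variant comes to the global optimum depends on the landscape and on the surrogate, so a more expressive regressor could be swapped in for `fit_additive` without changing the loop.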

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c70/12112320/14a803b0f875/ijms-26-04741-g001.jpg
