Enhanced Sequence-Activity Mapping and Evolution of Artificial Metalloenzymes by Active Learning.

Author information

Vornholt Tobias, Mutný Mojmír, Schmidt Gregor W, Schellhaas Christian, Tachibana Ryo, Panke Sven, Ward Thomas R, Krause Andreas, Jeschek Markus

Affiliations

Department of Biosystems Science and Engineering, ETH Zurich, Mattenstrasse 26, 4058 Basel, Switzerland.

National Centre of Competence in Research (NCCR) Molecular Systems Engineering, 4056 Basel, Switzerland.

Publication information

ACS Cent Sci. 2024 May 22;10(7):1357-1370. doi: 10.1021/acscentsci.4c00258. eCollection 2024 Jul 24.

Abstract

Tailored enzymes are crucial for the transition to a sustainable bioeconomy. However, enzyme engineering is laborious and failure-prone due to its reliance on serendipity. The efficiency and success rates of engineering campaigns may be improved by applying machine learning to map the sequence-activity landscape based on small experimental data sets. Yet, it often proves challenging to reliably model large sequence spaces while keeping the experimental effort tractable. To address this challenge, we present an integrated pipeline combining large-scale screening with active machine learning, which we applied to engineer an artificial metalloenzyme (ArM) catalyzing a new-to-nature hydroamination reaction. Combining lab automation and next-generation sequencing, we acquired sequence-activity data for several thousand ArM variants. We then used Gaussian process regression to model the activity landscape and guide further screening rounds. Critical characteristics of our pipeline include the cost-effective generation of information-rich data sets, the integration of an explorative round to improve the model's performance, and the inclusion of experimental noise. Our approach led to an order-of-magnitude boost in the hit rate while making efficient use of experimental resources. Search strategies like this should find broad utility in enzyme engineering and accelerate the development of novel biocatalysts.
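
The modeling loop described in the abstract (Gaussian process regression on sequence-activity data, an explicit experimental-noise term, and model-guided selection of the next screening round) can be illustrated with a small sketch. This is not the authors' implementation: the one-hot encoding, the RBF kernel, the upper-confidence-bound selection rule, the helper names (`one_hot`, `rbf_kernel`, `gp_posterior`, `select_batch`), all parameter values, and the toy sequences and activities are assumptions chosen only to make the idea concrete.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(seq):
    """Encode the residues at the mutated positions as a flat one-hot vector."""
    vec = np.zeros(len(seq) * len(AMINO_ACIDS))
    for i, aa in enumerate(seq):
        vec[i * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1.0
    return vec

def rbf_kernel(a, b, length_scale=2.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dist = (a**2).sum(1)[:, None] + (b**2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise_var=0.05):
    """Posterior mean and std of a zero-mean GP; noise_var models measurement noise."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_ts = rbf_kernel(x_train, x_test)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha
    v = np.linalg.solve(chol, k_ts)
    var = 1.0 - (v**2).sum(0)  # the RBF prior variance is 1
    return mean, np.sqrt(np.clip(var, 0.0, None))

def select_batch(mean, std, batch_size, beta=2.0):
    """Rank candidates by upper confidence bound; larger beta favors exploration."""
    ucb = mean + beta * std
    return np.argsort(-ucb)[:batch_size]

# Toy data, invented for illustration only (not from the paper).
measured = ["AAAAA", "AGAAA", "WGAAA", "WGFAA"]
activity = np.array([0.1, 0.4, 0.9, 1.3])
candidates = ["WGFLA", "AAAAG", "KKKKK"]

x_train = np.stack([one_hot(s) for s in measured])
x_test = np.stack([one_hot(s) for s in candidates])
mean, std = gp_posterior(x_train, activity - activity.mean(), x_test)
picked = select_batch(mean, std, batch_size=2)
print([candidates[i] for i in picked])
```

In this sketch, a large `beta` corresponds loosely to an explorative round (candidates are chosen mainly where the model is uncertain), while a small `beta` favors variants the model already predicts to be active.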


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2fa7/11273458/118f075b228f/oc4c00258_0001.jpg
