Bian Jiahao, Tan Pan, Nie Ting, Hong Liang, Yang Guang-Yu
State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.
Institute of Key Biological Raw Material, Shanghai Academy of Experimental Medicine, Shanghai, China.
mLife. 2024 Dec 26;3(4):492-504. doi: 10.1002/mlf2.12151. eCollection 2024 Dec.
Optimizing enzyme thermostability is essential for advances in protein science and industrial applications. Currently, (semi-)rational design and random mutagenesis methods can accurately identify single-point mutations that enhance enzyme thermostability. However, complex epistatic interactions often arise when multiple mutation sites are combined, and they can lead to the complete inactivation of combinatorial mutants. As a result, constructing an optimized enzyme often requires repeated rounds of design that incorporate single mutation sites incrementally, which is highly time-consuming. In this study, we developed an AI-aided strategy for enzyme thermostability engineering that efficiently facilitates the recombination of beneficial single-point mutations. We used thermostability data from creatinase, comprising 18 single-point mutants, 22 double-point mutants, 21 triple-point mutants, and 12 quadruple-point mutants. With these data as input, we applied a temperature-guided protein language model, Pro-PRIME, to learn epistatic features and design combinatorial mutants. After two rounds of design, we obtained 50 combinatorial mutants with superior thermostability, a success rate of 100%. The best mutant, 13M4, contained 13 mutation sites and retained nearly full catalytic activity relative to the wild type. It showed a 10.19°C increase in melting temperature and an ~655-fold increase in half-life at 58°C. Additionally, the model successfully captured epistasis in high-order combinatorial mutants, including sign epistasis (K351E) and synergistic epistasis (D17V/I149V). We elucidated the mechanism of long-range epistasis in detail using a dynamics cross-correlation matrix method. Our work provides an efficient framework for engineering enzyme thermostability and for studying high-order epistatic effects in protein-directed evolution.
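The abstract names two kinds of epistasis (sign and synergistic) without stating how they are quantified. One common convention, assumed here rather than taken from the paper, is to compare a double mutant's measured change in melting temperature against the sum of the two single-mutant changes. The sketch below illustrates that additive null model in Python; the function names, the 0.5°C tolerance, and all numeric values are hypothetical and are not creatinase measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpistasisResult:
    expected: float  # additive expectation for the double mutant (°C)
    epsilon: float   # observed minus expected (°C)
    label: str       # "additive", "synergistic", or "antagonistic"


def pairwise_epistasis(d_a: float, d_b: float, d_ab: float,
                       tol: float = 0.5) -> EpistasisResult:
    """Classify a double mutant against an additive null model.

    d_a, d_b: change in Tm of each single mutant relative to wild type.
    d_ab:     change in Tm of the double mutant relative to wild type.
    tol:      hypothetical noise threshold; deviations within it count as additive.
    """
    expected = d_a + d_b
    epsilon = d_ab - expected
    if abs(epsilon) <= tol:
        label = "additive"
    elif epsilon > 0:
        label = "synergistic"
    else:
        label = "antagonistic"
    return EpistasisResult(expected, epsilon, label)


def shows_sign_epistasis(effect_in_wt: float, effect_in_background: float,
                         tol: float = 0.5) -> bool:
    """Return True if a mutation's effect flips sign between two backgrounds.

    Both effects must exceed the tolerance in magnitude so that
    measurement noise around zero is not counted as a sign change.
    """
    return (abs(effect_in_wt) > tol
            and abs(effect_in_background) > tol
            and effect_in_wt * effect_in_background < 0)


# Hypothetical values for illustration only.
result = pairwise_epistasis(d_a=1.0, d_b=1.5, d_ab=4.0)
flipped = shows_sign_epistasis(effect_in_wt=-1.0, effect_in_background=2.0)
```

With these made-up inputs, the double mutant exceeds its additive expectation and is labeled synergistic, and the second call reports a sign flip: a mutation that is destabilizing alone but stabilizing in another background. Higher-order mutants would need a more general decomposition than this pairwise comparison.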