State Key Laboratory of Microbial Technology, Shandong University, Qingdao 266237, China.
School of Software, Shandong University, Jinan 250101, China.
Acta Biochim Biophys Sin (Shanghai). 2023 Mar 25;55(3):343-355. doi: 10.3724/abbs.2023033.
Thermal stability is one of the most important properties of enzymes: it sustains life and determines the potential of biocatalysts for industrial application. Although traditional methods such as directed evolution and classical rational design have contributed greatly to this field, the enormous sequence space of proteins makes the required experiments costly and arduous. Driven by breakthroughs in high-throughput DNA sequencing and machine learning models, enzyme engineering is now focusing on automated and efficient strategies. In this review, we propose a data-driven architecture for enzyme thermostability engineering and summarize widely adopted datasets as well as machine learning-driven approaches for designing the thermal stability of enzymes. We also present a series of challenges in applying machine learning to enzyme thermostability design, such as the data dilemma, model training, and use of the proposed models, and discuss a few promising directions for enhancing model performance. We anticipate that the efficient incorporation of machine learning will provide more insights and solutions for the design of enzyme thermostability in the coming years.