Department of Chemistry, Vanderbilt University, Nashville, TN 37235, USA.
Center for Structural Biology, Vanderbilt University, Nashville, TN 37235, USA.
Protein Eng Des Sel. 2023 Jan 21;36. doi: 10.1093/protein/gzac009.
Identifying function-enhancing enzyme variants is a 'holy grail' challenge in protein science because it would allow researchers to expand the biocatalytic toolbox for late-stage functionalization of drug-like molecules, environmental degradation of plastics and other pollutants, and medical treatment of food allergies. Data-driven strategies, including statistical modeling, machine learning, and deep learning, have substantially advanced the understanding of sequence-structure-function relationships in enzymes. They have also improved the ability to predict and design new enzymes and enzyme variants that catalyze new-to-nature reactions. Here, we review recent progress in data-driven models applied to identifying efficiency-enhancing mutants for catalytic reactions. We also discuss existing challenges and obstacles faced by the community. Although the review is by no means comprehensive, we hope that the discussion informs readers about the state of the art in data-driven enzyme engineering and inspires more joint experimental-computational efforts to develop and apply data-driven modeling to innovate biocatalysts for synthetic and pharmaceutical applications.