Kouba Petr, Kohout Pavel, Haddadi Faraneh, Bushuiev Anton, Samusevich Raman, Sedlar Jiri, Damborsky Jiri, Pluskal Tomas, Sivic Josef, Mazurenko Stanislav
Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic.
Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic.
ACS Catal. 2023 Oct 13;13(21):13863-13895. doi: 10.1021/acscatal.3c02743. eCollection 2023 Nov 3.
Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of promising enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough experimental validation of emerging models before their use for rational protein design. We present our opinions on the fundamental problems and outline the potential directions for future research.