Johnston Kadina E, Fannjiang Clara, Wittmann Bruce J, Hie Brian L, Yang Kevin K, Wu Zachary
California Institute of Technology.
University of California, Berkeley.
ArXiv. 2023 May 26:arXiv:2305.16634v1.
Directed evolution of proteins has been the most effective method for protein engineering. However, a new paradigm is emerging that fuses the library generation and screening approaches of traditional directed evolution with computation, by training machine learning models on protein sequence-fitness data. This chapter highlights successful applications of machine learning to protein engineering and directed evolution, organized by the improvements made to each step of the directed evolution cycle. We also provide an outlook for the future based on the current direction of the field, namely the development of calibrated models and the incorporation of other modalities, such as protein structure.