Li Xiaofan, Wang Qihan, Li Jianfeng, Tang Fangfang, Yang Xiaofeng, Lin Zhanglin
School of Biology and Biological Engineering, South China University of Technology, Guangzhou, Guangdong 510006, China.
iScience. 2025 Aug 7;28(9):113324. doi: 10.1016/j.isci.2025.113324. eCollection 2025 Sep 19.
Deep learning has rapidly emerged as a promising toolkit for protein optimization, yet its success remains limited, particularly in improving activity. Moreover, most algorithms lack rigorous iterative evaluation, a crucial aspect of protein engineering exemplified by classical directed evolution. This study introduces DeepDE, a robust iterative deep learning-guided algorithm that uses triple mutants as building blocks and a compact library of ∼1,000 mutants for training. In each iteration, triple mutants allow exploration of a much larger sequence space than single or double mutants. When applied to GFP, DeepDE achieved a 74.3-fold increase in activity over four rounds of evolution, far surpassing the benchmark superfolder GFP. Our study suggests that limited screening of an experimentally affordable ∼1,000 variants significantly enhances the performance of DeepDE, likely by mitigating the constraints imposed by the intractable data sparsity problem in protein engineering.
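The claim that triple mutants open up a much larger sequence space can be illustrated with a simple count. The sketch below is not part of DeepDE; it only counts how many distinct variants exist at a given mutation order, assuming a 238-residue protein (the length commonly cited for wild-type GFP, an assumption not stated in the abstract) and 19 alternative amino acids per position. The function name `count_variants` is hypothetical.

```python
from math import comb

# Assumptions for illustration only (not taken from the abstract):
SEQ_LENGTH = 238       # residues in the protein
ALTERNATIVES = 19      # non-wild-type amino acids available at each position


def count_variants(seq_length: int, order: int, alternatives: int = ALTERNATIVES) -> int:
    """Number of distinct variants carrying exactly `order` substitutions.

    Choose which positions are mutated (a binomial coefficient), then choose
    one of `alternatives` replacement residues at each chosen position.
    """
    return comb(seq_length, order) * alternatives ** order


if __name__ == "__main__":
    for order, label in [(1, "single"), (2, "double"), (3, "triple")]:
        n = count_variants(SEQ_LENGTH, order)
        print(f"{label:>6} mutants: {n:,}")
```

Under these assumptions there are about 4.5 × 10^3 single mutants, 1.0 × 10^7 double mutants, and 1.5 × 10^10 triple mutants, so a training library of ∼1,000 variants samples only a tiny fraction of the triple-mutant space. This is the data sparsity that the abstract refers to.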