Nie Zhiwei, Ma Yiming, Liu Yutian, Huang Xiansong, Liu Zhihong, Yang Peng, Xu Fan, Yin Feng, Li Zigang, Fu Jie, Ren Zhixiang, Zhang Wen-Bin, Chen Jie
School of Electronic and Computer Engineering, Peking University, Shenzhen, China.
Pengcheng Laboratory, Shenzhen, China.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf319.
Predicting protein stability changes upon mutation is an effective way to improve the efficiency of protein engineering. Here, we propose DVE-stability, a dual-view ensemble learning framework for predicting mutation-induced protein stability changes from a single sequence. DVE-stability integrates the global and local dependencies of mutations to capture intramolecular interactions from two views through ensemble learning. Within this framework, a structural microenvironment simulation module indirectly introduces structural microenvironment information at the sequence level. DVE-stability achieved state-of-the-art prediction performance on seven single-point mutation benchmark datasets and comprehensively surpassed other methods on five of them. Furthermore, DVE-stability comprehensively outperformed other methods in zero-shot inference on the multiple-point mutation prediction task, demonstrating superior generalizability in capturing the epistasis of multiple-point mutations. More importantly, DVE-stability exhibited superior generalization performance in predicting rare beneficial mutations, which are crucial for practical protein directed evolution scenarios. In addition, DVE-stability identified important intramolecular interactions via attention scores, demonstrating its interpretability. Overall, DVE-stability provides a flexible and efficient tool for predicting mutation-induced protein stability changes in an interpretable ensemble learning manner.