Liang Tianjian, Sun Ze-Yu, Ishima Rieko, Xie Xiang-Qun, Xue Ying, Li Wei, Feng Zhiwei
Department of Pharmaceutical Sciences, Computational Chemical Genomics Screening Center, and Pharmacometrics and System Pharmacology PharmacoAnalytics, School of Pharmacy, National Center of Excellence for Computational Drug Abuse Research, University of Pittsburgh, Pittsburgh, PA 15261, USA.
Department of Structural Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
Research (Wash D C). 2025 Apr 15;8:0674. doi: 10.34133/research.0674. eCollection 2025.
Proteins play a critical role in biology and biopharma owing to their specificity and minimal side effects. Predicting the effects of mutations on protein stability is vital but experimentally challenging, and deep learning offers an efficient solution to this problem. In this work, we introduce ProstaNet, a deep learning framework that predicts stability changes resulting from single- and multiple-point mutations, using a geometric vector perceptron-graph neural network to process 3-dimensional features. To train ProstaNet, we built ProstaDB, a comprehensive and clean thermodynamics database containing 3,784 single-point mutations and 1,642 multiple-point mutations. We also developed thermodynamic looping to enlarge the limited multiple-point mutation dataset and applied a new clustering method to generate a standard test set for multiple-point mutations. In addition, we identified residue scoring as the most important encoding method for protein property prediction. With these innovations, ProstaNet accurately predicts thermostability changes for both single-point and multiple-point mutations without showing any bias. ProstaNet achieves an accuracy of 0.75 for single-point mutation prediction, outperforming existing methods, including ThermoMPNN (0.63), PoPMuSiC (0.66), MUPRO (0.52), and FoldX (0.71). ProstaNet also achieves a 1.3-fold increase in accuracy compared with FoldX for multiple-point mutation prediction. In experimental validation on HuJ3 mutants, 4 of 5 single-point mutation predictions (80%) and all multiple-point mutation predictions (100%) were accurate, demonstrating the potential benefits of ProstaNet for protein engineering and drug development.
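The abstract does not spell out how thermodynamic looping works, so the following is only a minimal sketch of one plausible reading: because free energy is a state function, the stability change between any two variants of the same protein can be derived from their changes relative to wild type, and reversing a mutation flips the sign. The function name `augment_with_cycles`, the mutation labels, the ddG values, and the sign convention are all hypothetical and are not taken from ProstaNet or ProstaDB.

```python
from itertools import permutations


def augment_with_cycles(ddg_from_wt):
    """Derive ddG for every ordered pair of variants from ddG values measured
    relative to wild type (assumption: free energy is a state function, so
    ddG(A -> B) = ddG(WT -> B) - ddG(WT -> A))."""
    states = dict(ddg_from_wt)
    states[frozenset()] = 0.0  # the empty mutation set stands for wild type
    derived = {}
    for start, end in permutations(states, 2):
        derived[(start, end)] = states[end] - states[start]
    return derived


# Hypothetical measurements (kcal/mol) for one protein, relative to wild type.
single = frozenset({"A12G"})
double = frozenset({"A12G", "L45F"})
example = {single: 1.0, double: 2.5}

pairs = augment_with_cycles(example)
for (start, end), ddg in pairs.items():
    print(sorted(start), "->", sorted(end), round(ddg, 2))
```

Under these assumptions, two measured entries expand to six directed pairs, including the single-to-double step and the reverse of each measured mutation. Whether the authors' procedure matches this construction cannot be determined from the abstract alone.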