Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.
Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia.
Nucleic Acids Res. 2021 Jul 2;49(W1):W417-W424. doi: 10.1093/nar/gkab273.
Protein-protein interactions play a crucial role in all cellular functions and biological processes, and mutations that disrupt them are enriched in many diseases. While a number of computational methods have been proposed to assess the effects of variants on protein-protein binding affinity, they are generally limited to single point mutations and have been shown to perform poorly on independent test sets. Here, we present mmCSM-PPI, a scalable and effective machine learning model for accurately assessing changes in protein-protein binding affinity caused by single and multiple missense mutations. We expanded our well-established graph-based signatures to capture physicochemical and geometrical properties of multiple wild-type residue environments and integrated them with substitution scores and dynamics terms from normal mode analysis. mmCSM-PPI achieved a Pearson's correlation of up to 0.75 (RMSE = 1.64 kcal/mol) under 10-fold cross-validation and 0.70 (RMSE = 2.06 kcal/mol) on a non-redundant blind test, outperforming existing methods. Our method is freely available as a user-friendly web server and API at http://biosig.unimelb.edu.au/mmcsm_ppi.
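The abstract reports performance as Pearson's correlation and RMSE between predicted and experimental changes in binding affinity. As a minimal sketch of how these two metrics are computed, the Python below uses only the standard library. The function names and the sample values are hypothetical, invented for illustration; they are not taken from mmCSM-PPI or its datasets.

```python
import math


def pearson_r(observed, predicted):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical binding affinity changes in kcal/mol (made-up numbers,
# not results from the paper).
experimental = [0.5, -1.2, 2.3, 0.0, 1.8]
predicted = [0.7, -0.9, 1.6, 0.4, 1.5]

print(f"Pearson r = {pearson_r(experimental, predicted):.2f}")
print(f"RMSE = {rmse(experimental, predicted):.2f} kcal/mol")
```

Pearson's correlation measures how well predictions track the linear trend of the measurements, while RMSE measures the typical size of the error in the same units (here kcal/mol), so the two are usually reported together.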