Xu Xiaotong, Bonvin Alexandre M J J
Department of Chemistry, Faculty of Science, Computational Structural Biology Group, Bijvoet Centre for Biomolecular Research, Utrecht 3584 CS, The Netherlands.
Bioinform Adv. 2024 Jan 5;4(1):vbad191. doi: 10.1093/bioadv/vbad191. eCollection 2024.
Protein-protein interactions (PPIs) play critical roles in numerous cellular processes. Modelling the 3D structures of the corresponding protein complexes can yield valuable insights, providing, e.g., starting points for drug and protein design. One challenge in the modelling process, however, is identifying near-native models in the large pool of generated models. To this end, we previously developed DeepRank-GNN, a graph neural network that integrates structural and sequence information to enable effective pattern learning at PPI interfaces. Its main features are related to Position Specific Scoring Matrices (PSSMs), which are computationally expensive to generate, and this significantly limits the algorithm's usability.
Here we introduce DeepRank-GNN-esm, which includes protein language model embeddings from the ESM-2 model as additional features. We show that the ESM-2 embeddings can replace the PSSM features with no loss in performance, or even with improved performance, on two PPI-related tasks: scoring docking poses and detecting crystal artifacts. This new DeepRank version thus bypasses the need to generate PSSMs, greatly improving the usability of the software and opening new application opportunities for systems for which PSSM profiles cannot be obtained or are irrelevant (e.g. antibody-antigen complexes).
DeepRank-GNN-esm is freely available from https://github.com/DeepRank/DeepRank-GNN-esm.
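As an illustration only, the idea of using per-residue language model embeddings as node features of an interface graph can be sketched as follows. This is not the DeepRank-GNN-esm code or API: the function names, the toy sequences, the interface residue list, and the 8-dimensional random vectors standing in for ESM-2 embeddings are all assumptions made for the sketch (real ESM-2 embeddings are much larger and come from the trained model).

```python
import random
from typing import Dict, List, Tuple

ResidueId = Tuple[str, int]  # (chain ID, 1-based residue number)

EMBED_DIM = 8  # hypothetical size, chosen only to keep the example small


def placeholder_embed(sequence: str, seed: int, dim: int = EMBED_DIM) -> List[List[float]]:
    """Stand-in for a protein language model: returns one vector per residue.

    A real pipeline would obtain these vectors from ESM-2; random numbers are
    used here so the sketch runs without any model download.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in sequence]


def interface_node_features(
    chains: Dict[str, str], interface: List[ResidueId]
) -> Dict[ResidueId, List[float]]:
    """Attach a per-residue embedding to each interface residue (graph node)."""
    # Embed each chain once, from sequence alone (no alignment or PSSM needed).
    per_chain = {
        chain_id: placeholder_embed(seq, seed=i)
        for i, (chain_id, seq) in enumerate(sorted(chains.items()))
    }
    features: Dict[ResidueId, List[float]] = {}
    for chain_id, resnum in interface:
        # Residue numbers are assumed to be 1-based and contiguous.
        features[(chain_id, resnum)] = per_chain[chain_id][resnum - 1]
    return features


# Toy two-chain complex with three hypothetical interface residues.
chains = {"A": "MKTAYIAK", "B": "GSHMLE"}
interface = [("A", 3), ("A", 4), ("B", 2)]
features = interface_node_features(chains, interface)
```

The point of the sketch is that each chain is embedded from its sequence alone, so no multiple sequence alignment or PSSM has to be computed before the node features can be assembled.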