Department of Molecular Genetics, The University of Toronto, Toronto, ON M5S 1A8, Canada.
Donnelly Centre for Cellular and Biomolecular Research, The University of Toronto, Toronto, ON M5S 3E1, Canada.
Bioinformatics. 2024 Nov 1;40(11). doi: 10.1093/bioinformatics/btae636.
Protein-protein interactions are essential for a variety of biological phenomena, including mediating biochemical reactions, cell signaling, and the immune response. Proteins tend to form interfaces that lower the overall energy of the system. Although deep learning techniques have revolutionized structure determination for single polypeptide chains, prediction of protein complexes has still not been perfected. Additionally, determining structures experimentally is highly resource- and time-intensive. An alternative is computational docking, which takes the solved structures of individual proteins and produces candidate interfaces (decoys). Decoys are then scored using mathematical functions, known as scoring functions, that assess the quality of the system. Beyond docking, scoring functions are a critical component in assessing structures produced by many protein generative models. Scoring models are also used as a final filtering step in many generative deep learning models, including those that generate antibody binders and those that perform docking.
In this work, we present improved scoring functions for protein-protein interactions that use state-of-the-art Euclidean graph neural network architectures to assess protein-protein interfaces. These Euclidean docking score models are called EuDockScore and EuDockScore-Ab, the latter being specific to antibody-antigen docking. Finally, we provide EuDockScore-AFM, a model trained on antibody-antigen outputs from AlphaFold-Multimer (AFM), which proves useful for reranking large numbers of AFM outputs.
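As a rough illustration of how a scoring function is used to rerank candidate structures, the following minimal Python sketch scores a list of decoys and keeps the best ones. It is not the EuDockScore API: `Decoy`, `rerank`, and `toy_score` are hypothetical names, the `features` field is a stand-in for real structural input, and the sketch assumes that a higher score means a better interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Decoy:
    """A candidate interface. Hypothetical container, not part of EuDockScore."""
    name: str
    # Placeholder numeric features; a real model would consume atomic coordinates.
    features: Sequence[float]


def rerank(
    decoys: Sequence[Decoy],
    score_fn: Callable[[Decoy], float],
    top_k: int,
) -> List[Tuple[float, Decoy]]:
    """Score every decoy and return the top_k (score, decoy) pairs, best first.

    Assumes higher scores indicate better interfaces.
    """
    scored = [(score_fn(d), d) for d in decoys]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def toy_score(decoy: Decoy) -> float:
    """Stand-in scoring function (mean of the features), for illustration only."""
    return sum(decoy.features) / len(decoy.features)


candidates = [
    Decoy("decoy_a", [0.1, 0.3]),
    Decoy("decoy_b", [0.9, 0.7]),
    Decoy("decoy_c", [0.5, 0.5]),
]
best = rerank(candidates, toy_score, top_k=2)
```

In practice, `toy_score` would be replaced by a trained model, and the same reranking step could be applied to a large pool of docking decoys or AFM outputs.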
The code for these models is available at https://gitlab.com/mcfeemat/eudockscore.