Department of Information Engineering and Mathematics, University of Siena, Via Roma, 56, 53100 Siena, Italy.
Int J Mol Sci. 2024 May 28;25(11):5870. doi: 10.3390/ijms25115870.
Protein-protein interactions (PPIs) are fundamental processes governing cellular functions, and characterizing them is crucial for understanding biological systems at the molecular level. Compared to experimental methods for PPI prediction and interaction-site identification, computational deep learning approaches offer an affordable and efficient alternative. Since a protein structure can be represented as a graph, graph neural networks (GNNs) are a natural deep learning architecture for the task. In this work, PPI prediction is modeled as a node-focused binary classification task, in which a GNN determines whether a given residue is part of the interface. Biological data were obtained from the Protein Data Bank in Europe (PDBe) through the Protein Interfaces, Surfaces, and Assemblies (PISA) service. To gain a deeper understanding of how proteins interact, the data obtained from PISA were assembled into three datasets, consisting of data on whole proteins, pairs of interacting chains, and single chains, respectively. These three datasets correspond to three variants of the problem: identifying interfaces between protein complexes, identifying interfaces between chains of the same protein, and identifying interface regions in general. The results indicate that GNNs can solve each of the three tasks with very good performance.
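The node-focused formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the 8 Å contact cutoff, the single round of mean aggregation, the one-dimensional residue features, and the hand-picked weights are all assumptions chosen only to show how a per-residue interface decision arises from a graph of residues.

```python
import math

def contact_graph(coords, cutoff=8.0):
    """Build adjacency lists: residues i and j are linked when their
    coordinates are closer than `cutoff` (an assumed value, in angstroms)."""
    n = len(coords)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def message_passing(features, adj):
    """One round of message passing: each residue's new state is the mean
    of its own feature vector and those of its neighbours."""
    states = []
    for i, neighbours in enumerate(adj):
        group = [features[i]] + [features[j] for j in neighbours]
        dim = len(features[i])
        states.append([sum(v[d] for v in group) / len(group) for d in range(dim)])
    return states

def interface_probability(state, weights, bias):
    """Logistic output: probability that a residue belongs to the interface."""
    z = sum(w * x for w, x in zip(weights, state)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def classify_residues(coords, features, weights, bias, threshold=0.5):
    """Node-focused binary classification: one label per residue."""
    adj = contact_graph(coords)
    states = message_passing(features, adj)
    probs = [interface_probability(s, weights, bias) for s in states]
    return [p >= threshold for p in probs], probs

# Toy input (hypothetical): four residues on a line, 5 angstroms apart,
# each with a single made-up feature.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
features = [[1.0], [1.0], [0.0], [0.0]]
labels, probs = classify_residues(coords, features, weights=[4.0], bias=-2.0)
print(labels)
```

In a trained GNN the weights would be learned from labelled interface residues and several message-passing layers would typically be stacked; here they are fixed by hand so the example stays self-contained.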