College of Mathematics and Computer Science, Dali University, 671003, Dali, China.
State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan University, 650000, Kunming, China.
BMC Genomics. 2024 May 9;25(1):406. doi: 10.1186/s12864-024-10299-x.
Most proteins exert their functions by interacting with other proteins, making the identification of protein-protein interactions (PPI) crucial for understanding biological activities, pathological mechanisms, and clinical therapies. Effective and reliable computational methods for predicting PPI can significantly reduce the time and labor required by traditional biological experiments. However, two challenges remain: accurately identifying the specific category of each protein-protein interaction, and improving the prediction accuracy of computational methods. To tackle these challenges, we proposed a novel graph neural network method called GNNGL-PPI for multi-category prediction of PPI based on global graphs and local subgraphs. GNNGL-PPI consisted of two main components: a Graph Isomorphism Network (GIN) that extracts global graph features from the PPI network graph, and GIN As Kernel (GIN-AK), which extracts local subgraph features from the subgraphs of protein vertices. Additionally, because samples are unevenly distributed across categories in the benchmark datasets, we introduced an Asymmetric Loss (ASL) function to further enhance the predictive performance of the method. In evaluations on six benchmark test sets formed by three different dataset partitioning algorithms (Random, BFS, DFS), GNNGL-PPI outperformed the state-of-the-art methods for multi-category PPI prediction, as measured by the comprehensive performance evaluation metric F1-measure. Furthermore, interpretability analysis confirmed the effectiveness of GNNGL-PPI as a reliable method for multi-category PPI prediction.
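The abstract does not give the form of the ASL function, so the following is only a minimal NumPy sketch of the commonly used asymmetric loss for imbalanced multi-label classification, not the authors' implementation. It assumes each protein pair carries a multi-hot label vector over interaction categories. The function name `asymmetric_loss` and the hyperparameter values (`gamma_pos`, `gamma_neg`, `margin`) are illustrative assumptions, not values reported here.

```python
import numpy as np


def asymmetric_loss(logits, targets, gamma_pos=1.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Sketch of an asymmetric loss for multi-label classification.

    logits, targets: arrays of shape (n_samples, n_categories);
    targets holds 0/1 labels. All hyperparameter defaults are
    assumptions chosen for illustration.
    """
    logits = np.asarray(logits, dtype=float)
    targets = np.asarray(targets, dtype=float)

    # Per-category probabilities from a sigmoid.
    p = 1.0 / (1.0 + np.exp(-logits))

    # Probability shifting: negatives with p <= margin contribute nothing.
    p_shifted = np.clip(p - margin, 0.0, None)

    # Positive term, focused by gamma_pos.
    loss_pos = targets * (1.0 - p) ** gamma_pos * np.log(np.clip(p, eps, None))

    # Negative term, focused more strongly by gamma_neg so that the many
    # easy negatives are down-weighted.
    loss_neg = (1.0 - targets) * p_shifted ** gamma_neg * np.log(
        np.clip(1.0 - p_shifted, eps, None)
    )

    # Sum over categories, average over samples.
    return -(loss_pos + loss_neg).sum(axis=-1).mean()


# Hypothetical example: one protein pair, three interaction categories.
example_logits = np.array([[2.0, -3.0, 0.5]])
example_targets = np.array([[1.0, 0.0, 0.0]])
example_loss = asymmetric_loss(example_logits, example_targets)
```

With `gamma_pos = gamma_neg = 0` and `margin = 0`, this expression reduces to ordinary binary cross-entropy. Choosing `gamma_neg > gamma_pos` is what makes the loss asymmetric: abundant easy negatives are suppressed more strongly than the rarer positives.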