College of Mathematics and Computer Science, Dali University, Dali, 671003, China.
Yunnan Key Laboratory of Screening and Research on Anti-pathogenic Plant Resources from Western Yunnan, Dali University, Dali, 671000, China.
Comput Biol Med. 2024 Jan;168:107683. doi: 10.1016/j.compbiomed.2023.107683. Epub 2023 Nov 14.
Accurately identifying protein-protein interaction sites (PPIS) at the molecular level is essential for annotating protein function and understanding the mechanisms underlying various diseases. Numerous computational methods for predicting PPIS have emerged, and they have reduced the labor and time costs of traditional experimental methods; however, their predictive accuracy has yet to reach the desired level. In this context, we proposed a graph-based computational model called GHGPR-PPIS. The model combines a graph convolutional network using heat kernel (GraphHeat) with Generalized PageRank techniques (together, GHGPR) to predict PPIS. Building on the GHGPR framework, we also devised an edge self-attention feature processing block, which further improved the performance of the model. Experimental results showed that GHGPR-PPIS outperformed all competing state-of-the-art models on the benchmark test set. On two independent test sets and a specific protein chain, GHGPR-PPIS also consistently showed better generalization performance and practical applicability than the comparison model, AGAT-PPIS. Finally, using the t-SNE dimensionality reduction algorithm and cluster visualization, we analyzed the interpretability of GHGPR-PPIS by comparing the outputs from different stages of the model.
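To make the two named ingredients concrete, the following is a minimal illustrative sketch, not the authors' implementation. It computes a heat kernel exp(-sL) from the normalized Laplacian of a small graph, then applies Generalized PageRank style propagation, Z = sum_k gamma_k P^k H. The abstract does not say how GHGPR-PPIS combines the two, so using the heat kernel as the propagation matrix P is an assumption made here for illustration. The toy 4-node path graph, the scale s, the weights gamma_k, and all function names are likewise hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    """Return the n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def normalized_laplacian(adj):
    """L = I - D^{-1/2} A D^{-1/2}; assumes an undirected graph with no isolated nodes."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [[(1.0 if i == j else 0.0) - adj[i][j] / math.sqrt(deg[i] * deg[j])
             for j in range(n)] for i in range(n)]


def heat_kernel(lap, s, terms=20):
    """Approximate exp(-s * L) with a truncated Taylor series."""
    n = len(lap)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # term_k = term_{k-1} @ L * (-s / k), i.e. (-s L)^k / k!
        term = matmul(term, lap)
        term = [[-s * x / k for x in row] for row in term]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    return result


def generalized_pagerank(prop, feats, gammas):
    """Z = sum_k gammas[k] * prop^k @ feats (Generalized PageRank style propagation)."""
    hidden = feats
    out = [[gammas[0] * x for x in row] for row in feats]
    for g in gammas[1:]:
        hidden = matmul(prop, hidden)  # one more propagation step
        out = [[o + g * h for o, h in zip(o_row, h_row)]
               for o_row, h_row in zip(out, hidden)]
    return out


# Toy example (hypothetical): 4 residues connected in a path, 1 feature each.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
feats = [[1.0], [0.0], [0.0], [0.0]]  # signal placed on residue 0 only
gammas = [0.5, 0.3, 0.2]              # fixed here; learnable in GPR-style models

lap = normalized_laplacian(adj)
kernel = heat_kernel(lap, s=1.0)      # assumption: heat kernel used as propagation matrix
z = generalized_pagerank(kernel, feats, gammas)
print(z)
```

In this sketch the signal placed on residue 0 diffuses to its neighbors, with the scale s controlling how far heat spreads per step and the gamma weights controlling how much each propagation depth contributes. In a trained model the gamma weights would be learned rather than fixed.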