Zhang Qing, Hu You-Hang, Zhou Yu, Hu Jun, Zhou Xiao-Gen, Zhang Biao
College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China.
Center for AI and Computational Biology, Suzhou Institute of Systems Medicine, Suzhou, 215123, China.
Anal Biochem. 2025 Oct;705:115929. doi: 10.1016/j.ab.2025.115929. Epub 2025 Jun 28.
Protein-protein interactions (PPIs) play a pivotal role in numerous biological processes. Accurate identification of the amino acid residues involved in these interactions is essential for understanding the functional mechanisms of proteins. To effectively integrate both structure and sequence information, we propose a new interaction site predictor, TargetPPI, which leverages bidirectional long short-term memory networks (Bi-LSTM), convolutional neural networks (CNN), and Edge Aggregation through Graph Attention layers with Node Similarity (EGR-NS) neural networks. In TargetPPI, CNN and Bi-LSTM are first employed to extract the global and local feature information, respectively. The combined global and local features are then used as node embeddings in the graph derived from the protein structure. We also extract six discriminative structural features as edge features in the graph. Additionally, a mean ensemble strategy integrates multiple prediction models with diverse model parameters into the final model, resulting in more accurate PPI site prediction. Benchmark results on seven independent testing datasets demonstrate that, compared to most of the state-of-the-art methods, TargetPPI achieves higher accuracy, precision, and Matthews Correlation Coefficient (MCC) values, averaging 84.3%, 57.6%, and 0.383, respectively. The source code of TargetPPI is freely available at https://github.com/bukkeshuo/TargetPPI.
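The mean ensemble step and the reported metrics can be illustrated with a minimal sketch. It is not taken from the TargetPPI repository: the function names (`mean_ensemble`, `evaluate`), the 0.5 decision threshold, and the toy probabilities are all assumptions made for illustration. It averages per-residue interaction probabilities from several models, thresholds the average, and computes accuracy, precision, and MCC from the resulting confusion counts.

```python
from math import sqrt
from typing import Dict, List, Sequence


def mean_ensemble(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-residue probabilities across models.

    Each inner sequence holds one model's scores for the same residues.
    (Hypothetical helper; the paper's own implementation may differ.)
    """
    if not model_probs:
        raise ValueError("need at least one model")
    n_residues = len(model_probs[0])
    if any(len(p) != n_residues for p in model_probs):
        raise ValueError("all models must score the same residues")
    n_models = len(model_probs)
    return [sum(p[i] for p in model_probs) / n_models for i in range(n_residues)]


def evaluate(labels: Sequence[int], probs: Sequence[float],
             threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, precision, and MCC at an assumed decision threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0 if undefined
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "mcc": mcc}


# Toy data (made up): 1 = interface residue, 0 = non-interface residue.
labels = [1, 0, 0, 1, 0, 0]
model_outputs = [
    [0.9, 0.2, 0.6, 0.4, 0.1, 0.3],  # model A
    [0.8, 0.1, 0.7, 0.6, 0.2, 0.7],  # model B
    [0.7, 0.3, 0.5, 0.8, 0.3, 0.2],  # model C
]
ensemble_probs = mean_ensemble(model_outputs)
metrics = evaluate(labels, ensemble_probs)
print(metrics)
```

In the toy data, models A and B each misclassify at least one residue that the averaged scores get right, which is the usual motivation for averaging models trained with different parameters. MCC is reported alongside accuracy because interface residues are typically a minority class, so accuracy alone can look high even for a weak predictor.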