Huaihe Hospital of Henan University, Kaifeng 475004, China; School of Computer and Information Engineering, Henan University, Kaifeng 475004, China.
School of Computer and Information Engineering, Henan University, Kaifeng 475004, China.
Comput Biol Med. 2024 Jul;177:108669. doi: 10.1016/j.compbiomed.2024.108669. Epub 2024 May 29.
Experimentally confirming complex interaction networks among proteins is time-consuming and laborious. This study addresses the prediction of protein-protein interactions (PPIs) using graph neural networks (GNNs). A novel multilevel prediction model for PPIs, named DSSGNN-PPI (Double Structure and Sequence GNN for PPIs), is designed. First, a distance graph between amino acid residues is constructed. The distance graph is then fed into an underlying graph attention network module. This module efficiently learns vector representations that encode the three-dimensional structure of proteins while aggregating key local patterns and overall topological information, yielding graph embeddings that adequately represent both local and global structural features. In addition, embedding representations that reflect sequence properties are obtained. The two kinds of features are fused to construct high-level protein complex networks, which are fed into the designed gated graph attention network to extract complex topological patterns. Combining heterogeneous multi-source information from the downstream structure graph and the upstream sequence models comprehensively enhances the understanding of PPIs. A series of evaluation results validates the effectiveness of the DSSGNN-PPI framework in improving the prediction of multi-type interactions among proteins. The multilevel representation learning and information fusion strategies provide a new and effective solution paradigm for structural biology problems. The source code for DSSGNN-PPI is hosted on GitHub and is available at https://github.com/cstudy1/DSSGNN-PPI.
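The first step of the pipeline described in the abstract, constructing a distance graph between amino acid residues, might be sketched as follows. This is an illustration only and not the authors' implementation: the function name `residue_distance_graph`, the use of one representative coordinate per residue (for example the C-alpha atom), and the 8 Å contact cutoff are all assumptions that the abstract does not state.

```python
import numpy as np


def residue_distance_graph(coords, cutoff=8.0):
    """Build a residue-level distance graph (hypothetical helper).

    coords : (N, 3) array-like, one 3D coordinate per residue
             (assumed here to be C-alpha positions, in angstroms).
    cutoff : assumed contact threshold in angstroms; two residues are
             connected when their distance is below this value.

    Returns the (N, N) distance matrix, the boolean adjacency matrix,
    and an (E, 2) array of directed edges (i, j).
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise difference vectors via broadcasting: shape (N, N, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Connect residues closer than the cutoff, excluding self-loops.
    adj = (dist < cutoff) & ~np.eye(len(coords), dtype=bool)
    edges = np.argwhere(adj)
    return dist, adj, edges


# Toy example: three residues spaced 3.8 angstroms apart on a line,
# plus one distant residue that ends up isolated.
coords = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0], [20.0, 0.0, 0.0]]
dist, adj, edges = residue_distance_graph(coords)
print(adj.astype(int))
print(edges)
```

The resulting edge list, together with per-residue feature vectors, is the kind of input a graph attention layer would consume. How DSSGNN-PPI actually defines nodes, edges, and thresholds should be checked against the repository linked in the abstract.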