College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China; Hunan Engineering Research Center of Advanced Embedded Computing and Intelligent Medical Systems, Hunan University, Changsha, 410082, China.
Comput Biol Med. 2024 Jun;176:108543. doi: 10.1016/j.compbiomed.2024.108543. Epub 2024 May 3.
Proteins play a vital role in various biological processes and achieve their functions through protein-protein interactions (PPIs). Thus, accurate identification of PPI sites is essential. Traditional biological methods for identifying PPIs are costly, labor-intensive, and time-consuming. Computational methods for predicting PPI sites offer a promising alternative. Most known deep learning (DL) methods employ layer-wise multi-scale CNNs to extract features from protein sequences. However, these methods usually neglect the spatial positions and hierarchical information embedded within protein sequences, which are crucial for PPI site prediction. In this paper, we propose MR2CPPIS, a novel sequence-based DL model that uses a multi-scale Res2Net with a coordinate attention mechanism to exploit multi-scale features and enhance PPI site prediction capability. We leverage the multi-scale Res2Net to expand the receptive field of each network layer, thus capturing multi-scale information of protein sequences at a granular level. To further explore the local contextual features of each target residue, we employ a coordinate attention block to characterize the precise spatial position information, enabling the network to effectively extract long-range dependencies. We evaluate MR2CPPIS on three public benchmark datasets (Dset 72, Dset 186, and PDBset 164), achieving state-of-the-art performance. The source code is available at https://github.com/YyinGong/MR2CPPIS.
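The way a Res2Net-style block widens the receptive field within a single layer can be illustrated with a minimal sketch. This is not the MR2CPPIS implementation: it assumes the standard Res2Net recurrence (y_1 = x_1, y_2 = K(x_2), y_i = K(x_i + y_{i-1})), replaces the learned convolutions with a fixed width-3 moving average, and omits the coordinate attention block. The names `smooth3`, `res2net_split`, and `support_width` are hypothetical and chosen only for illustration.

```python
from typing import List

Seq = List[float]  # one feature channel along the residue axis


def smooth3(x: Seq) -> Seq:
    """Width-3 moving average with zero padding.

    Assumption: stands in for a learned 3-tap 1D convolution.
    """
    padded = [0.0] + x + [0.0]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0 for i in range(len(x))]


def res2net_split(groups: List[Seq]) -> List[Seq]:
    """Res2Net-style hierarchical pass over channel groups.

    y_1 = x_1, y_2 = K(x_2), y_i = K(x_i + y_{i-1}) for i > 2,
    where K is the width-3 filter above.
    """
    outputs = [list(groups[0])]  # first group passes through unchanged
    prev = None
    for x in groups[1:]:
        # From the third group on, add the previous group's output first.
        inp = x if prev is None else [a + b for a, b in zip(x, prev)]
        prev = smooth3(inp)
        outputs.append(prev)
    return outputs


def support_width(x: Seq) -> int:
    """Number of positions spanned by the non-zero part of x."""
    idx = [i for i, v in enumerate(x) if abs(v) > 1e-12]
    return idx[-1] - idx[0] + 1 if idx else 0


# A single "residue" activated at the center of a length-15 sequence.
length, center = 15, 7
impulse = [1.0 if i == center else 0.0 for i in range(length)]

outs = res2net_split([list(impulse) for _ in range(4)])
widths = [support_width(y) for y in outs]
print(widths)  # [1, 3, 5, 7]
```

Each successive group sees a wider window around the target residue (1, 3, 5, then 7 positions), so concatenating the groups yields features at several scales from one block. In a trained model the filters would be learned and the concatenated groups would typically be fused by a pointwise convolution.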