School of Computer Science and Engineering, Central South University, Changsha 410083, People's Republic of China.
Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada.
Bioinformatics. 2020 Feb 15;36(4):1114-1120. doi: 10.1093/bioinformatics/btz699.
Protein-protein interactions (PPIs) play important roles in many biological processes. Conventional biological experiments for identifying PPI sites are costly and time-consuming, so many computational approaches have been proposed to predict them. Existing computational methods usually rely on local contextual features alone. However, global features of the protein sequence are also critical for PPI site prediction.
We propose DeepPPISP, a new end-to-end deep learning framework that combines local contextual and global sequence features for PPI site prediction. For local contextual features, we follow previous studies and use a sliding window to capture the features of the neighbors of a target amino acid. For global sequence features, a text convolutional neural network extracts features from the whole protein sequence. The local contextual and global sequence features are then combined to predict PPI sites. With this combination, DeepPPISP achieves state-of-the-art performance and outperforms the competing methods. To investigate whether global sequence features help the model, we remove or change individual components of DeepPPISP. Detailed analyses show that global sequence features play an important role in DeepPPISP.
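The way the two feature types are combined can be illustrated with a minimal sketch. It is not the DeepPPISP implementation: the function names, the window half-width, the kernel shapes, and the hand-set weights are all assumptions made for illustration, and the real model learns its convolution weights and passes the combined vector to further trainable layers. The sketch only shows a zero-padded sliding window around one residue, a text-CNN-style pass over the whole sequence (convolution, ReLU, max-over-time pooling), and the concatenation of the two.

```python
from typing import List, Sequence

Vector = List[float]


def local_window(features: Sequence[Vector], i: int, half_width: int) -> Vector:
    """Concatenate per-residue features in a window centred on residue i.

    Positions that fall outside the sequence are zero-padded, so every
    residue yields a vector of the same length.
    """
    dim = len(features[0])
    out: Vector = []
    for j in range(i - half_width, i + half_width + 1):
        out.extend(features[j] if 0 <= j < len(features) else [0.0] * dim)
    return out


def global_textcnn(features: Sequence[Vector],
                   kernels: Sequence[Sequence[Vector]]) -> Vector:
    """Text-CNN-style global features over the whole sequence.

    Each kernel is a list of k weight rows (one row per covered residue).
    The kernel slides along the sequence; the responses go through ReLU
    and max-over-time pooling, giving one value per kernel.
    """
    pooled: Vector = []
    n = len(features)
    for kernel in kernels:
        k = len(kernel)
        best = 0.0  # starting at 0 makes the running max equal to ReLU + max-pool
        for start in range(n - k + 1):
            response = sum(
                w * x
                for w_row, x_row in zip(kernel, features[start:start + k])
                for w, x in zip(w_row, x_row)
            )
            best = max(best, response)
        pooled.append(best)
    return pooled


def residue_input(features: Sequence[Vector], i: int, half_width: int,
                  kernels: Sequence[Sequence[Vector]]) -> Vector:
    """Combined input for residue i: local window followed by global features."""
    return local_window(features, i, half_width) + global_textcnn(features, kernels)


# Toy example: 3 residues with 2 features each (hypothetical values).
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_kernels = [
    [[1.0, 1.0]],              # width-1 kernel
    [[1.0, 0.0], [0.0, 1.0]],  # width-2 kernel
]
combined = residue_input(toy_features, 0, 1, toy_kernels)
```

The global part is identical for every residue of a protein, so in practice it would be computed once per sequence and reused; only the local window changes from residue to residue.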
The DeepPPISP web server is available at http://bioinformatics.csu.edu.cn/PPISP/. The source code can be obtained from https://github.com/CSUBioGroup/DeepPPISP.
Supplementary data are available at Bioinformatics online.