College of Computer Science and Engineering, Northeastern University, Shenyang, China.
Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, China.
Biomed Res Int. 2020 Jun 13;2020:5072520. doi: 10.1155/2020/5072520. eCollection 2020.
Protein-protein interactions (PPIs) are important for almost all cellular processes, including metabolic cycles, DNA transcription and replication, and signaling cascades. Experimental methods for identifying PPIs are typically time-consuming and expensive, so computational approaches for predicting PPIs are needed. In this paper, we propose an improved machine learning model for PPI prediction. To account for the factors that affect PPI prediction, we propose a feature extraction and fusion method that increases the variety of features considered. In addition, because the order in which the two proteins are input can affect the prediction, we propose a "Y-type" Bi-RNN model and train the network with both forward and backward training. To limit the extra training time that the additional forward or backward pass incurs, we propose a weight-sharing policy that minimizes the number of trainable parameters. Experimental results show that the proposed method achieves an accuracy of 99.57%, recall of 99.36%, sensitivity of 99.76%, precision of 99.74%, MCC of 99.14%, and AUC of 99.56% on the benchmark dataset.
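The weight-sharing idea behind a "Y-type" network can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the layer types, feature encoding, or merge operation, so the sketch assumes a plain tanh RNN, one-hot residue encoding, a single parameter set reused for both proteins and both reading directions, and an element-wise sum as the merge step. The class `SharedBiRNNEncoder`, the function `interaction_score`, the output weights, and the two short sequences are all hypothetical, and the weights are random rather than trained.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue):
    """Encode one residue as a 20-dimensional one-hot vector."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


class SharedBiRNNEncoder:
    """Tiny tanh RNN read in both directions with ONE set of weights.

    Assumption: sharing across both directions and both protein branches;
    the abstract only states that a weight-sharing policy is used.
    """

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # Input-to-hidden and hidden-to-hidden weights (random, untrained).
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(input_size)]
                     for _ in range(hidden_size)]
        self.w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
                      for _ in range(hidden_size)]

    def _run(self, vectors):
        """Run the recurrence over a list of input vectors; return the final state."""
        h = [0.0] * self.hidden_size
        for x in vectors:
            h = [
                math.tanh(
                    sum(w * xi for w, xi in zip(self.w_in[j], x))
                    + sum(w * hi for w, hi in zip(self.w_rec[j], h))
                )
                for j in range(self.hidden_size)
            ]
        return h

    def encode(self, sequence):
        """Concatenate the final states of the forward and backward reads."""
        vectors = [one_hot(r) for r in sequence]
        return self._run(vectors) + self._run(vectors[::-1])

    def parameter_count(self):
        return (sum(len(row) for row in self.w_in)
                + sum(len(row) for row in self.w_rec))


def interaction_score(encoder, protein_a, protein_b, out_weights):
    """Two branches (one shared encoder) merge into a single output stem."""
    enc_a = encoder.encode(protein_a)
    enc_b = encoder.encode(protein_b)
    # Assumed merge: element-wise sum, which is symmetric in the two inputs.
    merged = [a + b for a, b in zip(enc_a, enc_b)]
    logit = sum(w * m for w, m in zip(out_weights, merged))
    return 1.0 / (1.0 + math.exp(-logit))


encoder = SharedBiRNNEncoder(input_size=20, hidden_size=4)
out_weights = [0.1 * (i + 1) for i in range(8)]  # hypothetical output layer
score_ab = interaction_score(encoder, "MKTAYIAK", "GSHMLE", out_weights)
score_ba = interaction_score(encoder, "GSHMLE", "MKTAYIAK", out_weights)
print(score_ab == score_ba, encoder.parameter_count())
```

Because both branches call the same encoder, adding the second protein or the reverse read adds no parameters in this sketch, and the symmetric merge makes the score independent of input order. A model that instead concatenates the two encodings would not be order-invariant by construction, which is presumably why training on both input orders matters in the setting the abstract describes.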