College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China.
BMC Genomics. 2022 Jun 27;23(1):474. doi: 10.1186/s12864-022-08687-2.
Protein-protein interactions (PPIs) govern how intracellular molecules carry out tasks such as transcriptional regulation, information transduction, and drug signalling. Traditional wet-lab experiments for obtaining PPI information are costly and time-consuming.
In this paper, we propose SDNN-PPI, a PPI prediction method based on self-attention and deep learning. The method uses amino acid composition (AAC), conjoint triad (CT), and auto covariance (AC) to extract global and local features of protein sequences, and uses self-attention to enhance the feature extraction of a deep neural network (DNN) so that PPIs are predicted more effectively. To verify the generalization ability of SDNN-PPI, we evaluate the model with 5-fold cross-validation on the intraspecific interaction datasets of Saccharomyces cerevisiae (core subset) and human, where the accuracy reaches 95.48% and 98.94%, respectively. On the interspecific interaction datasets of human-Bacillus anthracis and human-Yersinia pestis, the accuracy is 93.15% and 88.33%, respectively. On the independent datasets of Caenorhabditis elegans, Escherichia coli, Homo sapiens, and Mus musculus, the prediction accuracy is 100% in every case, which is higher than that of previous PPI prediction methods. To further assess the strengths and weaknesses of the model, we use it to predict PPIs in a one-core network and a crossover network, and the results show that the model correctly predicts the interaction pairs in both networks.
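As a rough illustration of two of the sequence encodings named above, the following Python sketch computes an AAC vector (20 residue frequencies) and a CT vector (343 triad frequencies over seven residue classes). The function names `aac` and `conjoint_triad` are hypothetical, the seven-class grouping is the one commonly used for the conjoint triad descriptor, and the plain frequency normalization is an assumption; the abstract does not specify the exact normalization or implementation used in SDNN-PPI.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order for the AAC vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Seven residue classes commonly used for the conjoint triad descriptor
# (assumption: the abstract does not list the grouping it uses).
CT_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {
    aa: idx for idx, group in enumerate(CT_CLASSES) for aa in group
}


def aac(seq):
    """Return the 20-dim amino acid composition (residue frequencies)."""
    residues = [aa for aa in seq.upper() if aa in RESIDUE_TO_CLASS]
    counts = Counter(residues)
    n = len(residues)
    return [counts[aa] / n if n else 0.0 for aa in AMINO_ACIDS]


def conjoint_triad(seq):
    """Return the 343-dim conjoint triad vector (triad frequencies).

    Each residue is mapped to one of seven classes, and every window of
    three consecutive classes is counted. Non-standard residues are skipped.
    """
    classes = [RESIDUE_TO_CLASS[aa] for aa in seq.upper() if aa in RESIDUE_TO_CLASS]
    counts = Counter(zip(classes, classes[1:], classes[2:]))
    total = max(len(classes) - 2, 0)
    # Simple frequency normalization (an assumption made for this sketch).
    return [
        counts[triad] / total if total else 0.0
        for triad in product(range(7), repeat=3)
    ]
```

In a pairwise setting, the vectors of the two proteins in a candidate pair would typically be concatenated before being passed to a classifier; AC is omitted here because it depends on physicochemical property scales that the abstract does not give.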
In this paper, the AAC, CT, and AC methods are used to encode protein sequences, and the SDNN-PPI method, based on a self-attention deep neural network, is proposed to predict PPIs. Satisfactory results are obtained on both intraspecific and interspecific datasets, and good performance is also achieved in cross-species prediction. The model also correctly predicts the protein interactions, related to cell and tumor information, contained in the one-core network and the crossover network. The SDNN-PPI method proposed in this paper not only explores the mechanism of protein-protein interaction, but also provides new ideas for drug design and disease prevention.
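For readers unfamiliar with self-attention, the following NumPy sketch shows generic scaled dot-product self-attention over a small set of feature vectors. It is only an illustration of the general mechanism: the function names, the projection matrices `w_q`, `w_k`, `w_v`, and the input shapes are assumptions, and the abstract does not describe how SDNN-PPI wires attention into its DNN.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention.

    x: array of shape (n_items, d_in), one feature vector per row.
    w_q, w_k, w_v: projection matrices of shape (d_in, d_k), (d_in, d_k),
    and (d_in, d_v) respectively (hypothetical parameters for this sketch).
    Returns the attended features (n_items, d_v) and the attention
    weights (n_items, n_items).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


# Usage with random data: 3 feature vectors of dimension 8.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
attended, weights = self_attention(x, w_q, w_k, w_v)
```

Each row of `weights` sums to 1, so every output row is a weighted average of the value vectors, which is how attention lets a network emphasize the more informative features.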