Czibula Gabriela, Albu Alexandra-Ioana, Bocicor Maria Iuliana, Chira Camelia
Department of Computer Science, Babeş-Bolyai University, 400084 Cluj-Napoca, Romania.
Entropy (Basel). 2021 May 21;23(6):643. doi: 10.3390/e23060643.
Proteins are essential molecules that must perform their roles correctly for living organisms to remain healthy. Most proteins operate in complexes, and the way they interact has a pivotal influence on the proper functioning of these organisms. In this study, we address the problem of protein-protein interaction prediction and propose and investigate a method based on an ensemble of autoencoders. Our approach, named AutoPPI, uses two autoencoders, one for each type of interaction (positive and negative), and we propose three types of neural network architectures for the autoencoders. Experiments were performed on several data sets comprising proteins from four different species. The results indicate good performance of the proposed model, with accuracy and AUC values above 0.97 in all cases. The best-performing model relies on a Siamese architecture in both the encoder and the decoder, which captures common features in protein pairs. Comparisons with other machine learning techniques applied to the same problem show that AutoPPI outperforms most of its contenders on the considered data sets.
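The two-autoencoder strategy can be illustrated with a minimal sketch. This is not the AutoPPI implementation, and several things here are assumptions not stated in the abstract: the decision rule (a pair is labeled positive when the positive-class autoencoder reconstructs it with lower error than the negative-class one), the closed-form linear autoencoder (equivalent to PCA) used in place of the neural architectures, and the synthetic feature vectors standing in for real protein-pair encodings. All class and function names are hypothetical.

```python
import numpy as np


class LinearAutoencoder:
    """Linear autoencoder fitted in closed form via SVD (equivalent to PCA).

    Assumption: a simple stand-in for the neural autoencoders in the abstract.
    """

    def __init__(self, n_latent):
        self.n_latent = n_latent

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        # Rows of vt are principal directions, ordered by explained variance.
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[: self.n_latent]
        return self

    def reconstruction_error(self, X):
        centered = X - self.mean_
        latent = centered @ self.components_.T   # encode
        decoded = latent @ self.components_      # decode
        # Mean squared reconstruction error per sample.
        return np.mean((centered - decoded) ** 2, axis=1)


def make_pairs(rng, basis, n, noise=0.05):
    """Hypothetical synthetic 'protein pair' features lying near a subspace."""
    coeffs = rng.normal(size=(n, basis.shape[0]))
    return coeffs @ basis + noise * rng.normal(size=(n, basis.shape[1]))


def predict_interaction(ae_pos, ae_neg, X):
    """Assumed rule: label 1 when the positive-class autoencoder reconstructs better."""
    better_pos = ae_pos.reconstruction_error(X) < ae_neg.reconstruction_error(X)
    return better_pos.astype(int)


rng = np.random.default_rng(0)
n_features, n_latent = 20, 3
pos_basis = rng.normal(size=(n_latent, n_features))
neg_basis = rng.normal(size=(n_latent, n_features))

# One autoencoder per interaction type, each trained only on its own class.
ae_pos = LinearAutoencoder(n_latent).fit(make_pairs(rng, pos_basis, 200))
ae_neg = LinearAutoencoder(n_latent).fit(make_pairs(rng, neg_basis, 200))

X_test = np.vstack([make_pairs(rng, pos_basis, 50), make_pairs(rng, neg_basis, 50)])
y_test = np.array([1] * 50 + [0] * 50)
y_pred = predict_interaction(ae_pos, ae_neg, X_test)
accuracy = float((y_pred == y_test).mean())
print(f"accuracy on synthetic pairs: {accuracy:.2f}")
```

Because each model only learns to reconstruct its own class, the comparison of reconstruction errors acts as the classifier. The synthetic classes here are deliberately easy to separate, so the printed accuracy says nothing about performance on real protein data.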