Proteome Sci. 2013 Nov 7;11(Suppl 1):S21. doi: 10.1186/1477-5956-11-S1-S21.
Many computational approaches have been developed to detect protein complexes from protein-protein interaction (PPI) networks. However, these PPI networks are usually built from high-throughput experiments, and the unreliable interactions they contain make this task very challenging.
In this study, we propose a Genetic-Algorithm Fuzzy Naïve Bayes (GAFNB) filter that classifies candidate subgraphs as protein complexes or non-complexes. The filter explicitly accounts for the unreliable interactions present in protein complexes. We first obtain candidate protein complexes using existing popular methods. Each candidate protein complex is represented by 29 graph features and 266 features based on biological properties. The GAFNB model is then applied to classify the candidate complexes as positive or negative.
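To make the feature representation concrete, the following is a minimal sketch of how a few graph features might be computed for one candidate subgraph. The function name `graph_features`, the choice of features, and the use of edge reliability scores in [0, 1] are assumptions for illustration only; the abstract does not list the 29 graph features or specify how reliability is encoded.

```python
def graph_features(nodes, weighted_edges):
    """Return a few illustrative graph features for one candidate subgraph.

    nodes: iterable of protein identifiers in the candidate complex.
    weighted_edges: dict mapping frozenset({u, v}) to a reliability score
        in [0, 1] (a hypothetical encoding of interaction reliability).
    """
    nodes = set(nodes)
    # Keep only proper edges whose two endpoints both lie inside the candidate.
    inside = {e: w for e, w in weighted_edges.items()
              if len(e) == 2 and e <= nodes}
    n = len(nodes)
    m = len(inside)
    possible = n * (n - 1) / 2  # number of node pairs in the candidate
    density = m / possible if possible else 0.0
    # Weighted density sums reliabilities instead of counting edges, so
    # unreliable interactions contribute less than reliable ones.
    weighted_density = sum(inside.values()) / possible if possible else 0.0
    # Each edge contributes to the degree of both of its endpoints.
    mean_degree = 2 * m / n if n else 0.0
    return {
        "n_nodes": n,
        "n_edges": m,
        "density": density,
        "weighted_density": weighted_density,
        "mean_degree": mean_degree,
    }


# Toy PPI network with hypothetical proteins and reliability scores.
edges = {
    frozenset({"A", "B"}): 0.9,
    frozenset({"B", "C"}): 0.8,
    frozenset({"A", "C"}): 0.4,
    frozenset({"C", "E"}): 0.7,  # E is outside the candidate below
}
features = graph_features({"A", "B", "C"}, edges)
```

In this toy example the candidate is fully connected, so its unweighted density is 1, while its weighted density is lower because one interaction has a low reliability score. A feature vector of this kind would be the input to a classifier such as the one described above.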
Our evaluation indicates that protein complex identification algorithms combined with GAFNB filtering outperform the original algorithms. To evaluate the GAFNB model itself, we also compared its performance with that of Naïve Bayes (NB). The results show that GAFNB performed better than NB, which indicates that a fuzzy model is more suitable when unreliability is present.
We conclude that filtering candidate protein complexes with the GAFNB model can improve the effectiveness of protein complex identification, and that it is necessary to account for unreliability in this task.