BMC Bioinformatics. 2014;15 Suppl 2(Suppl 2):S6. doi: 10.1186/1471-2105-15-S2-S6. Epub 2014 Jan 24.
Protein complexes play important roles in biological systems such as gene regulatory networks and metabolic pathways. Most methods for predicting protein complexes try to find complexes of size greater than three. However, it is known that smaller protein complexes account for a large fraction of all complexes in several species. In our previous work, we developed a method for predicting heterodimeric protein complexes that uses several feature space mappings and the domain composition kernel, and that outperforms existing methods.
We propose methods for predicting heterotrimeric protein complexes that extend the techniques of our previous work, based on the idea that most heterotrimeric protein complexes are unlikely to share a protein with one another. We make use of the discriminant function of support vector machines (SVMs) and design novel feature space mappings for the second phase. As the second classifier, we examine SVMs and relevance vector machines (RVMs). We perform computational experiments with 10-fold cross-validation. The results suggest that our proposed two-phase methods and SVM with the extended features outperform the existing method NWE, which was reported to outperform other existing methods such as MCL, MCODE, DPClus, CMC, COACH, RRW, and PPSampler for prediction of heterotrimeric protein complexes.
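The abstract does not spell out the second-phase feature space mappings, so the following is only a minimal sketch of how first-phase discriminant values could be turned into second-phase inputs under the stated idea that true heterotrimers rarely share a protein. The function name `second_phase_features`, the three features it returns, and the `floor` default are all hypothetical and are not taken from the paper.

```python
def second_phase_features(scores, floor=-1.0):
    """Build illustrative second-phase features from first-phase scores.

    scores: dict mapping a frozenset of three protein IDs (a candidate
            heterotrimer) to its first-phase SVM discriminant value.
    floor:  value used when a candidate has no overlapping rival
            (an assumption made for this sketch).

    Returns a dict mapping each candidate to a tuple:
        (own score, best score among rivals sharing a protein, rival count)
    """
    features = {}
    for candidate, own_score in scores.items():
        # Rivals are the other candidates that share at least one protein.
        rival_scores = [
            score
            for other, score in scores.items()
            if other != candidate and candidate & other
        ]
        best_rival = max(rival_scores) if rival_scores else floor
        features[candidate] = (own_score, best_rival, len(rival_scores))
    return features


# Hypothetical first-phase discriminant values for three candidates.
scores = {
    frozenset({"A", "B", "C"}): 1.2,
    frozenset({"A", "D", "E"}): 0.4,
    frozenset({"F", "G", "H"}): -0.3,
}
features = second_phase_features(scores)
```

A second classifier (an SVM or an RVM in the abstract) would then be trained on vectors like these, so that a candidate whose overlapping rival scores higher can be down-weighted.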
We propose two-phase prediction methods that combine the extended features, the domain composition kernel, and SVMs and RVMs. The two-phase method with the extended features and the domain composition kernel, using an SVM as the second classifier, is particularly useful for prediction of heterotrimeric protein complexes.