School of Computer Science, Fudan University, Shanghai 200433, China.
Bioinformatics. 2011 Jul 1;27(13):i159-66. doi: 10.1093/bioinformatics/btr212.
Protein complexes are central to understanding cellular organization and function. Affinity purification followed by mass spectrometry (AP-MS) provides an effective high-throughput screen that directly measures co-complex relationships among multiple proteins, but its results suffer from both false positives and false negatives. To predict complexes computationally from AP-MS data, most existing approaches either require additional knowledge from known complexes (supervised learning) or have numerous parameters to tune.
In this article, we propose a novel unsupervised approach that does not rely on knowledge of existing complexes. Our method computes a probabilistic affinity between two proteins, which we call the co-complexed score (C2S). Specifically, C2S is the log-likelihood ratio of two proteins being co-complexed versus being drawn together at random. We then predict protein complexes by applying a hierarchical clustering algorithm to the C2S score matrix.
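The two-stage idea (score every protein pair with a log ratio, then cluster the score matrix hierarchically) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not give the C2S formula, so `pair_scores` uses a simple stand-in (the log of observed co-purification counts over the count expected if the two proteins appeared independently, with a pseudocount), and `cluster` uses plain average-linkage merging with a hypothetical `threshold` cut-off. The abstract does not say what the method's single parameter is, so `threshold` should not be read as that parameter. The toy purification data are invented.

```python
import math
from itertools import combinations


def pair_scores(purifications, pseudocount=0.5):
    """Stand-in pair score (NOT the published C2S formula).

    Returns log((observed + pseudocount) / (expected + pseudocount)), where
    `observed` is the number of purifications containing both proteins and
    `expected` is that number under independence.
    """
    n = len(purifications)
    proteins = sorted(set().union(*purifications))
    count = {p: sum(p in run for run in purifications) for p in proteins}
    scores = {}
    for a, b in combinations(proteins, 2):
        observed = sum(a in run and b in run for run in purifications)
        expected = count[a] * count[b] / n
        scores[(a, b)] = math.log((observed + pseudocount) / (expected + pseudocount))
    return proteins, scores


def lookup(scores, a, b):
    """Scores are stored once per unordered pair; accept either order."""
    return scores[(a, b)] if (a, b) in scores else scores[(b, a)]


def cluster(proteins, scores, threshold=0.0):
    """Average-linkage agglomerative clustering on the score matrix.

    Repeatedly merges the two clusters with the highest mean pair score and
    stops once that mean is no longer above `threshold` (assumed cut-off).
    """
    clusters = [[p] for p in proteins]
    while len(clusters) > 1:
        best, best_pair = -math.inf, None
        for i, j in combinations(range(len(clusters)), 2):
            links = [lookup(scores, a, b) for a in clusters[i] for b in clusters[j]]
            mean = sum(links) / len(links)
            if mean > best:
                best, best_pair = mean, (i, j)
        if best <= threshold:
            break
        i, j = best_pair  # i < j, so deleting index j leaves index i valid
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    # Report only multi-protein groups as predicted complexes.
    return [sorted(c) for c in clusters if len(c) > 1]


# Invented toy data: each set is one purification (bait plus preys).
purifications = [
    {"A", "B", "C"}, {"A", "B"}, {"A", "B", "C"},
    {"D", "E"}, {"D", "E"}, {"A", "D", "E"},
]
proteins, scores = pair_scores(purifications)
complexes = cluster(proteins, scores)
print(complexes)
```

On real AP-MS data the score would need to model bait and prey roles and the noise in the measurements, which this stand-in ignores. A library routine for hierarchical clustering would also be a more practical choice than the quadratic loop shown here.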
Compared with existing approaches, our approach is computationally efficient and easy to implement. It has only one parameter to set, and the value of that parameter has little effect on the results. It can be applied to any species for which AP-MS data are available. Despite its simplicity, it is competitive with or superior to state-of-the-art supervised and unsupervised predictions in many respects.