Integrating domain similarity to improve protein complexes identification in TAP-MS data.

Publication information

Proteome Sci. 2013 Nov 7;11(Suppl 1):S2. doi: 10.1186/1477-5956-11-S1-S2.

Abstract

BACKGROUND

Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamics of cellular organisation. However, protein interaction data generated by high-throughput experiments such as yeast two-hybrid (Y2H) and tandem affinity purification/mass spectrometry (TAP-MS) contain a significant number of false positives and false negatives. In recent years there has been a growing trend to incorporate diverse domain knowledge to support large-scale analysis of PPI networks.

METHODS

This paper presents a new algorithm that incorporates Gene Ontology (GO)-based semantic similarities to detect protein complexes in PPI networks generated by TAP-MS. To take the co-complex relations in TAP-MS data into account, TAP-MS PPI networks are modelled as a bipartite graph in which bait proteins form one set of nodes and prey proteins form the other. Similarities between pairs of bait proteins are computed from both topological features and GO-driven semantic similarities. Bait proteins are then grouped into clusters based on their pairwise similarities, producing a set of 'seed' clusters. An expansion process is applied to each 'seed' cluster to recruit prey proteins that are significantly associated with the same set of bait proteins, which yields the complete protein complexes.
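The pipeline described above (bipartite bait/prey graph, combined bait similarity, seed clusters, prey expansion) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so every concrete choice here is an assumption. Topological similarity is taken as the Jaccard overlap of shared preys, GO semantic similarity is replaced by a Jaccard overlap of annotation sets, the two are mixed with a hypothetical weight `alpha`, seeds are formed by linking baits above a hypothetical `threshold`, and "significantly associated" is approximated by a hypothetical `min_fraction` cutoff rather than a statistical test. The protein names and annotation labels are toy values.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bait_similarity(b1, b2, bait_to_preys, go_terms, alpha=0.5):
    """Weighted mix of shared-prey overlap and annotation overlap (both assumed)."""
    topo = jaccard(bait_to_preys[b1], bait_to_preys[b2])
    sem = jaccard(go_terms.get(b1, set()), go_terms.get(b2, set()))
    return alpha * topo + (1 - alpha) * sem


def seed_clusters(bait_to_preys, go_terms, threshold=0.4, alpha=0.5):
    """Group baits by linking every pair whose similarity reaches the threshold."""
    parent = {b: b for b in bait_to_preys}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for b1, b2 in combinations(sorted(bait_to_preys), 2):
        if bait_similarity(b1, b2, bait_to_preys, go_terms, alpha) >= threshold:
            parent[find(b1)] = find(b2)

    clusters = {}
    for b in bait_to_preys:
        clusters.setdefault(find(b), set()).add(b)
    return list(clusters.values())


def expand(seed, bait_to_preys, min_fraction=0.6):
    """Add preys that were pulled down by at least min_fraction of the seed's baits."""
    counts = {}
    for bait in seed:
        for prey in bait_to_preys[bait]:
            counts[prey] = counts.get(prey, 0) + 1
    recruited = {p for p, c in counts.items() if c / len(seed) >= min_fraction}
    return set(seed) | recruited


def detect_complexes(bait_to_preys, go_terms, threshold=0.4, alpha=0.5,
                     min_fraction=0.6):
    """Run seeding followed by expansion and return one set per complex."""
    seeds = seed_clusters(bait_to_preys, go_terms, threshold, alpha)
    return [expand(seed, bait_to_preys, min_fraction) for seed in seeds]


if __name__ == "__main__":
    # Toy bipartite graph: each bait maps to the preys it pulled down.
    toy_graph = {
        "B1": {"P1", "P2", "P3"},
        "B2": {"P1", "P2", "P4"},
        "B3": {"P5", "P6"},
    }
    # Toy annotation sets standing in for GO terms.
    toy_annotations = {"B1": {"t1", "t2"}, "B2": {"t1", "t2"}, "B3": {"t3"}}
    for complex_ in detect_complexes(toy_graph, toy_annotations):
        print(sorted(complex_))
```

In this toy input, B1 and B2 share two of four preys and all annotations, so they fall into one seed, and only the preys seen by both baits are recruited. A real analysis would substitute a proper GO semantic similarity measure and a significance test for the expansion step.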

RESULTS

The proposed algorithm was applied to real TAP-MS PPI networks, and fifteen quality measures were used to evaluate the generated protein complexes. Experimental results show that the proposed algorithm greatly improves the accuracy of complex identification and outperforms several state-of-the-art clustering algorithms. Moreover, by incorporating semantic similarity, the proposed algorithm is more robust to noise in the networks.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e95e/3907791/dabea69785ef/1477-5956-11-S1-S2-1.jpg
