Integrating domain similarity to improve protein complexes identification in TAP-MS data.

Publication information

Proteome Sci. 2013 Nov 7;11(Suppl 1):S2. doi: 10.1186/1477-5956-11-S1-S2.

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3907791/

Abstract

BACKGROUND

Detecting protein complexes in protein-protein interaction (PPI) networks plays an important role in improving our understanding of the dynamics of cellular organisation. However, protein interaction data generated by high-throughput experiments such as yeast two-hybrid (Y2H) and tandem affinity purification/mass spectrometry (TAP-MS) contain a significant number of false positives and false negatives. In recent years there has been a growing trend to incorporate diverse domain knowledge to support large-scale analysis of PPI networks.

METHODS

This paper presents a new algorithm that incorporates Gene Ontology (GO) based semantic similarities to detect protein complexes in PPI networks generated by TAP-MS. To take co-complex relations in TAP-MS data into account, TAP-MS PPI networks are modelled as a bipartite graph in which bait proteins form one set of nodes and prey proteins form the other. Similarities between pairs of bait proteins are computed from both topological features and GO-driven semantic similarities. Bait proteins are then grouped into clusters based on their pairwise similarities, producing a set of 'seed' clusters. An expansion process is applied to each 'seed' cluster to recruit prey proteins that are significantly associated with the same set of bait proteins, yielding the complete protein complexes.
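
The pipeline described above (bipartite bait-prey model, combined bait similarity, seed clustering, prey expansion) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the measures used, so the Jaccard index as the topological similarity, the weighting parameter `alpha`, the threshold-based single-linkage grouping, and the `min_fraction` recruitment rule (a stand-in for the paper's significance criterion) are all assumptions made for illustration. The protein names and GO similarity scores are hypothetical toy values.

```python
from itertools import combinations

# Hypothetical toy purifications: bait -> set of preys it pulled down.
# This mapping is the bipartite graph (baits on one side, preys on the other).
purifications = {
    "B1": {"P1", "P2", "P3"},
    "B2": {"P1", "P2", "P4"},
    "B3": {"P5", "P6"},
    "B4": {"P5", "P6", "P7"},
}

# Hypothetical GO semantic similarity scores in [0, 1] for bait pairs.
go_similarity = {
    frozenset({"B1", "B2"}): 0.9,
    frozenset({"B3", "B4"}): 0.8,
}


def topological_similarity(a, b, graph):
    """Jaccard index of the prey sets of two baits (an assumed measure)."""
    union = graph[a] | graph[b]
    return len(graph[a] & graph[b]) / len(union) if union else 0.0


def combined_similarity(a, b, graph, go_sim, alpha=0.5):
    """Weighted mix of topological and GO similarity (alpha is assumed)."""
    topo = topological_similarity(a, b, graph)
    sem = go_sim.get(frozenset({a, b}), 0.0)
    return alpha * topo + (1 - alpha) * sem


def seed_clusters(graph, go_sim, threshold=0.5):
    """Group baits whose combined similarity reaches the threshold.

    Uses connected components of the thresholded similarity graph
    (single-linkage grouping) as a simple stand-in clustering step.
    """
    parent = {b: b for b in graph}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sorted(graph), 2):
        if combined_similarity(a, b, graph, go_sim) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for b in graph:
        groups.setdefault(find(b), set()).add(b)
    return sorted(groups.values(), key=sorted)


def expand(seed, graph, min_fraction=0.75):
    """Add preys pulled down by at least min_fraction of the seed's baits."""
    counts = {}
    for bait in seed:
        for prey in graph[bait]:
            counts[prey] = counts.get(prey, 0) + 1
    recruited = {p for p, c in counts.items() if c / len(seed) >= min_fraction}
    return set(seed) | recruited


seeds = seed_clusters(purifications, go_similarity)
complexes = [expand(seed, purifications) for seed in seeds]
```

On this toy input, baits B1 and B2 form one seed and B3 and B4 another; each seed then recruits only the preys shared by both of its baits. A real analysis would replace the recruitment rule with a statistical test of association and take GO similarities from an annotation-based measure.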

RESULTS

The proposed algorithm has been applied to real TAP-MS PPI networks, and fifteen quality measures have been used to evaluate the generated protein complexes. Experimental results show that the proposed algorithm greatly improves the accuracy of complex identification and outperforms several state-of-the-art clustering algorithms. Moreover, by incorporating semantic similarity, the proposed algorithm is more robust to noise in the networks.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e95e/3907791/dabea69785ef/1477-5956-11-S1-S2-1.jpg
