Which clustering algorithm is better for predicting protein complexes?

Authors

Moschopoulos Charalampos N, Pavlopoulos Georgios A, Iacucci Ernesto, Aerts Jan, Likothanassis Spiridon, Schneider Reinhard, Kossida Sophia

Affiliation

Bioinformatics & Medical Informatics Team, Biomedical Research Foundation, Academy of Athens, Soranou Efessiou 4, 11527 Athens, Greece.

Publication

BMC Res Notes. 2011 Dec 20;4:549. doi: 10.1186/1756-0500-4-549.

DOI: 10.1186/1756-0500-4-549
PMID: 22185599
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3267700/
Abstract

BACKGROUND

Protein-protein interactions (PPIs) play a key role in determining the outcome of most cellular processes. Correctly identifying and characterizing protein interactions, and the networks they form, is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull-down assays and tandem affinity purification are used to detect protein interactions in an organism. Today, relatively new high-throughput methods such as yeast two-hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks.

RESULTS

In this paper we evaluated four clustering algorithms on six interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by the yeast two-hybrid (Y2H) and tandem affinity purification (TAP) methods. The predicted clusters, the so-called protein complexes, were then compared and benchmarked against known complexes stored in published databases.
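To make the idea of parameterizing a clustering algorithm concrete, here is a minimal pure-Python sketch of Markov clustering (MCL), which alternates expansion (matrix squaring) and inflation (element-wise powering followed by column normalization). It is an illustration only, not the implementation or the settings used in the study; the `inflation`, `iterations`, and `threshold` values and the toy graph are assumptions chosen for the example.

```python
def normalise_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain square-matrix product, adequate for small toy graphs."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, iterations=50, threshold=1e-6):
    """Return clusters (sorted lists of node indices) for a 0/1 adjacency matrix."""
    n = len(adjacency)
    # Add self-loops, then normalise columns into transition probabilities.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalise_columns(m)
    for _ in range(iterations):
        m = matmul(m, m)                                   # expansion
        m = [[v ** inflation for v in row] for row in m]   # inflation
        m = normalise_columns(m)
    # Each non-empty row of the converged matrix lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > threshold)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy network (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1
print(mcl(adj, inflation=2.0))
```

Raising `inflation` generally produces more, smaller clusters, which is one reason results can vary with parameterization.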

CONCLUSIONS

While results may vary with parameterization, the MCL and RNSC algorithms appear more promising and more accurate at predicting PPI complexes. They also predict more complexes, in absolute numbers, than the other reviewed algorithms. The spectral clustering algorithm, on the other hand, achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of geometrical accuracy, and it generates fewer valid clusters than any other reviewed algorithm. The article presents several metrics for evaluating the accuracy of such predictions. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm.
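The abstract does not define geometrical accuracy. A minimal sketch follows, assuming the definition commonly used when benchmarking complex prediction: the geometric mean of sensitivity (Sn) and positive predictive value (PPV), both computed from a complex-by-cluster overlap table. The function name and the toy protein identifiers are hypothetical.

```python
from math import sqrt


def geometric_accuracy(complexes, clusters):
    """Return (Sn, PPV, Acc) for lists of sets of protein identifiers.

    Assumed definitions (not taken from the abstract):
      T[i][j] = number of proteins shared by complex i and cluster j
      Sn  = sum_i max_j T[i][j] / sum_i |complex_i|
      PPV = sum_j max_i T[i][j] / sum_j sum_i T[i][j]
      Acc = sqrt(Sn * PPV)
    """
    table = [[len(cx & cl) for cl in clusters] for cx in complexes]
    n_complexes, n_clusters = len(complexes), len(clusters)

    total_complex_size = sum(len(cx) for cx in complexes)
    sn = (sum(max(row) for row in table) / total_complex_size
          if total_complex_size else 0.0)

    col_totals = [sum(table[i][j] for i in range(n_complexes))
                  for j in range(n_clusters)]
    col_best = [max(table[i][j] for i in range(n_complexes))
                for j in range(n_clusters)]
    ppv = sum(col_best) / sum(col_totals) if sum(col_totals) else 0.0

    return sn, ppv, sqrt(sn * ppv)


# Toy example with made-up identifiers.
known = [{"A", "B", "C"}, {"D", "E"}]
predicted = [{"A", "B"}, {"C", "D", "E"}]
sn, ppv, acc = geometric_accuracy(known, predicted)
print(round(sn, 3), round(ppv, 3), round(acc, 3))  # 0.8 0.8 0.8
```

Because Acc combines both terms, an algorithm that returns few but precise clusters can score a high valid prediction rate and still rank lower on this measure.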


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9184/3267700/e4ea8a4f3b95/1756-0500-4-549-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9184/3267700/e55761162903/1756-0500-4-549-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9184/3267700/b6657640837f/1756-0500-4-549-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9184/3267700/596ad415753b/1756-0500-4-549-4.jpg
