
SPICi: a fast clustering algorithm for large biological networks.

Author information

Lewis-Sigler Institute for Integrative Genomics and Department of Computer Science, Princeton University, Princeton, NJ 08544, USA.

Publication information

Bioinformatics. 2010 Apr 15;26(8):1105-11. doi: 10.1093/bioinformatics/btq078. Epub 2010 Feb 24.

DOI: 10.1093/bioinformatics/btq078
PMID: 20185405
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2853685/

Abstract

MOTIVATION

Clustering algorithms play an important role in the analysis of biological networks, and can be used to uncover functional modules and obtain hints about cellular organization. While most available clustering algorithms work well on biological networks of moderate size, such as the yeast protein physical interaction network, they either fail or are too slow in practice for larger networks, such as functional networks for higher eukaryotes. Since an increasing number of larger biological networks are being determined, the limitations of current clustering approaches curtail the types of biological network analyses that can be performed.

RESULTS

We present a fast local network clustering algorithm, SPICi. SPICi runs in time O(V log V + E) and space O(E), where V and E are the number of vertices and edges in the network, respectively. We evaluate SPICi's performance on several existing protein interaction networks of varying size, and compare SPICi to nine previous approaches for clustering biological networks. We show that SPICi is typically several orders of magnitude faster than previous approaches and is the only one that can successfully cluster all test networks within a very short time. We demonstrate that SPICi has state-of-the-art performance with respect to the quality of the clusters it uncovers, as judged by its ability to recapitulate protein complexes and functional modules. Finally, we demonstrate the power of our fast network clustering algorithm by applying SPICi across hundreds of large context-specific human networks, and identifying modules specific for single conditions.
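
The abstract does not spell out how a local clustering algorithm proceeds, so the following Python sketch may help fix the general idea. It is a generic greedy seed-and-grow procedure, not the SPICi implementation: the function name `local_cluster`, the `min_density` threshold, and the seed ordering by weighted degree are all assumptions made for illustration, and the sketch makes no attempt to reach the O(V log V + E) bound stated above.

```python
from collections import defaultdict


def local_cluster(edges, min_density=0.5):
    """Greedy seed-and-grow clustering on a weighted undirected graph.

    edges: iterable of (u, v, weight) tuples, weight in (0, 1].
    min_density: hypothetical threshold; a cluster stops growing when
    adding the best candidate would push its weighted density below it.
    Returns a list of clusters, each a sorted list of node names.
    """
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w

    unassigned = set(adj)
    clusters = []
    # Try seeds in order of decreasing weighted degree (ties broken by name).
    order = sorted(adj, key=lambda n: (-sum(adj[n].values()), n))
    for seed in order:
        if seed not in unassigned:
            continue
        cluster = {seed}
        unassigned.discard(seed)
        internal = 0.0  # total edge weight inside the current cluster
        # support[n]: total edge weight from candidate n into the cluster
        support = {n: w for n, w in adj[seed].items() if n in unassigned}
        while support:
            # Candidate with the strongest connection to the cluster.
            best = min(support, key=lambda n: (-support[n], n))
            size = len(cluster) + 1
            density = (internal + support[best]) / (size * (size - 1) / 2)
            if density < min_density:
                break
            internal += support.pop(best)
            cluster.add(best)
            unassigned.discard(best)
            # Newly added node contributes support to its free neighbours.
            for n, w in adj[best].items():
                if n in unassigned:
                    support[n] = support.get(n, 0.0) + w
        clusters.append(sorted(cluster))
    return clusters


# Two tight triangles joined by one weak edge (toy data, not from the paper).
toy_edges = [
    ("a", "b", 0.9), ("b", "c", 0.9), ("a", "c", 0.9),
    ("c", "d", 0.1),
    ("d", "e", 0.8), ("e", "f", 0.8), ("d", "f", 0.8),
]
print(local_cluster(toy_edges))
```

On the toy graph the weak edge between `c` and `d` is not enough to merge the two triangles, so each is returned as its own cluster. Because every decision uses only a seed's neighbourhood, procedures of this kind avoid global matrix operations, which is one plausible reason local methods scale to large networks.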

AVAILABILITY

Source code is available under the GNU Public License at http://compbio.cs.princeton.edu/spici.
