Resolving the structure of interactomes with hierarchical agglomerative clustering.

Author affiliation

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA.

Publication information

BMC Bioinformatics. 2011 Feb 15;12 Suppl 1(Suppl 1):S44. doi: 10.1186/1471-2105-12-S1-S44.

DOI:10.1186/1471-2105-12-S1-S44
PMID:21342576
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3044301/
Abstract

BACKGROUND

Graphs provide a natural framework for visualizing and analyzing networks of many types, including biological networks. Network clustering is a valuable approach for summarizing the structure in large networks, for predicting unobserved interactions, and for predicting functional annotations. Many current clustering algorithms suffer from a common set of limitations: poor resolution of top-level clusters; over-splitting of bottom-level clusters; requirements to pre-define the number of clusters prior to analysis; and an inability to jointly cluster over multiple interaction types.

RESULTS

A new algorithm, Hierarchical Agglomerative Clustering (HAC), is developed for fast clustering of heterogeneous interaction networks. This algorithm uses maximum likelihood to drive the inference of a hierarchical stochastic block model for network structure. Bayesian model selection provides a principled method for collapsing the fine-structure within the smallest groups, and for identifying the top-level groups within a network. Model scores are additive over independent interaction types, providing a direct route for simultaneous analysis of multiple interaction types. In addition to inferring network structure, this algorithm generates link predictions that with cross-validation provide a quantitative assessment of performance for real-world examples.
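The agglomerative, likelihood-driven idea described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes an undirected, unweighted graph, a plain Bernoulli block model, and a brute-force search over merges, and it omits the Bayesian model selection step entirely. The function names (`block_loglik`, `partition_loglik`, `greedy_agglomerate`) are hypothetical.

```python
import math
from itertools import combinations

def block_loglik(edges, pairs):
    """Maximized Bernoulli log-likelihood of one block: `edges` links among `pairs` node pairs."""
    if pairs == 0 or edges == 0 or edges == pairs:
        return 0.0  # a block that is empty or fully connected is fit perfectly
    p = edges / pairs
    return edges * math.log(p) + (pairs - edges) * math.log(1 - p)

def partition_loglik(groups, adj):
    """Sum block log-likelihoods over all within-group and between-group blocks."""
    total = 0.0
    for i, a in enumerate(groups):
        e = sum(1 for u, v in combinations(sorted(a), 2) if v in adj[u])
        total += block_loglik(e, len(a) * (len(a) - 1) // 2)
        for b in groups[i + 1:]:
            e = sum(1 for u in a for v in b if v in adj[u])
            total += block_loglik(e, len(a) * len(b))
    return total

def greedy_agglomerate(n, edge_list):
    """Start from singletons; repeatedly apply the merge that keeps the likelihood highest.

    Returns the merge history as (group_a, group_b, log_likelihood_after_merge).
    Brute force (every candidate merge is rescored from scratch), so only for tiny graphs.
    """
    adj = {u: set() for u in range(n)}
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    groups = [frozenset([u]) for u in range(n)]
    merges = []
    while len(groups) > 1:
        best = None
        for i, j in combinations(range(len(groups)), 2):
            trial = [g for k, g in enumerate(groups) if k not in (i, j)]
            trial.append(groups[i] | groups[j])
            score = partition_loglik(trial, adj)
            if best is None or score > best[0]:
                best = (score, i, j, trial)
        score, i, j, trial = best
        merges.append((set(groups[i]), set(groups[j]), score))
        groups = trial
    return merges

# Two triangles joined by a single edge (2-3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
for a, b, score in greedy_agglomerate(6, edges):
    print(sorted(a), "+", sorted(b), f"loglik={score:.3f}")
```

On this toy graph the two triangles are assembled first and are joined only in the final merge. Because the maximized likelihood never increases when groups are merged, likelihood alone cannot say where to stop; in the abstract that role belongs to Bayesian model selection, which this sketch does not attempt.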

CONCLUSIONS

When applied to genome-scale data sets representing several organisms and interaction types, HAC provides the overall best performance in link prediction when compared with other clustering methods and with model-free graph diffusion kernels. Investigation of performance on genome-scale yeast protein interactions reveals roughly 100 top-level clusters, with a long-tailed distribution of cluster sizes. These are in turn partitioned into 1000 fine-level clusters containing 5 proteins on average, again with a long-tailed size distribution. Top-level clusters correspond to broad biological processes, whereas fine-level clusters correspond to discrete complexes. Surprisingly, link prediction based on joint clustering of physical and genetic interactions performs worse than predictions based on individual data sets, suggesting a lack of synergy in current high-throughput data.
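As a rough illustration of how link prediction from a clustering might be scored on held-out interactions, the sketch below ranks node pairs by the training-set edge density of the block they fall in and summarizes the ranking with an AUC. The abstract does not state the metric or scoring rule used, so both are assumptions here, and the function names are hypothetical.

```python
def block_density_scores(groups, train_edges):
    """Return score(u, v): edge density, in the training graph, of the block containing (u, v)."""
    label = {u: k for k, g in enumerate(groups) for u in g}
    sizes = [len(g) for g in groups]
    counts = {}
    for u, v in train_edges:
        key = tuple(sorted((label[u], label[v])))
        counts[key] = counts.get(key, 0) + 1

    def score(u, v):
        r, s = sorted((label[u], label[v]))
        # Number of node pairs in the block: within-group or between-group.
        pairs = sizes[r] * (sizes[r] - 1) // 2 if r == s else sizes[r] * sizes[s]
        return counts.get((r, s), 0) / pairs if pairs else 0.0

    return score

def auc(pos_scores, neg_scores):
    """Probability that a held-out edge outscores a non-edge (ties count as half)."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Two triangles joined by edge 2-3; edge (0, 1) is held out of training.
groups = [{0, 1, 2}, {3, 4, 5}]
train = [(0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
score = block_density_scores(groups, train)
held_out_edges = [score(0, 1)]
non_edges = [score(0, 4), score(1, 5)]
print(auc(held_out_edges, non_edges))
```

In a cross-validation setting, this would be repeated over several held-out folds, with the clustering re-inferred from the training edges of each fold rather than fixed in advance as it is here.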

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d3aa/3044301/0f4b23fe49dd/1471-2105-12-S1-S44-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d3aa/3044301/586456ed3555/1471-2105-12-S1-S44-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d3aa/3044301/a7f808db2d32/1471-2105-12-S1-S44-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d3aa/3044301/477fd9e00eba/1471-2105-12-S1-S44-4.jpg
