
Robust, scalable, and informative clustering for diverse biological networks.

Affiliations

Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY, USA.

Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, USA.

Publication information

Genome Biol. 2023 Oct 12;24(1):228. doi: 10.1186/s13059-023-03062-0.

DOI:10.1186/s13059-023-03062-0
PMID:37828545
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10571258/
Abstract

Clustering molecular data into informative groups is a primary step in extracting robust conclusions from big data. However, due to foundational issues in how they are defined and detected, such clusters are not always reliable, leading to unstable conclusions. We compare popular clustering algorithms across thousands of synthetic and real biological datasets, including a new consensus clustering algorithm, SpeakEasy2: Champagne. These tests identify trends in performance, show no single method is universally optimal, and allow us to examine factors behind variation in performance. Multiple metrics indicate SpeakEasy2 generally provides robust, scalable, and informative clusters for a range of applications.
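
The abstract does not name the metrics used, so the following is only an illustrative sketch of one common way to quantify how "robust" a clustering is: run the method several times and measure how well the resulting partitions agree with each other. It uses the adjusted Rand index (ARI) as the agreement score. The function names `adjusted_rand_index` and `mean_pairwise_ari` and the toy label lists are hypothetical, and nothing here reproduces SpeakEasy2 or the paper's benchmark.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items.

    1.0 means the partitions are identical up to relabeling; values near 0
    mean agreement is about what chance would produce.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same items")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items")

    # Cells of the contingency table: how many items share each label pair.
    cells = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case: both partitions are trivial (all one cluster,
        # or all singletons), so they agree perfectly.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def mean_pairwise_ari(partitions):
    """Average ARI over all pairs of partitions; a simple stability score."""
    scores = [adjusted_rand_index(p, q) for p, q in combinations(partitions, 2)]
    return sum(scores) / len(scores)


# Toy example (made-up labels): three runs of some clustering method
# on the same six nodes. The second run is the first with labels swapped.
runs = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
stability = mean_pairwise_ari(runs)
print(f"mean pairwise ARI across runs: {stability:.3f}")
```

A score close to 1 would suggest the method returns nearly the same clusters on every run; lower scores indicate the kind of instability the abstract warns about. ARI is only one of several possible agreement measures and is used here as an assumption for illustration.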


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1387/10571258/dcf32eb861f9/13059_2023_3062_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1387/10571258/7311cc17de97/13059_2023_3062_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1387/10571258/562c01930753/13059_2023_3062_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1387/10571258/a33271183278/13059_2023_3062_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1387/10571258/2c0a2f64e66a/13059_2023_3062_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1387/10571258/007466d6c66e/13059_2023_3062_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1387/10571258/f346c69be637/13059_2023_3062_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1387/10571258/3b05104cbc0e/13059_2023_3062_Fig8_HTML.jpg
