Comparative benchmarking of single-cell clustering algorithms for transcriptomic and proteomic data.

Authors

Yin Yu-Hang, Wang Fang, Li Wei, Liu Qiaoming, Zhou Shengming, Zhou Murong, Jiang Zhongjun, Yu Dong-Jun, Wang Guohua

Affiliations

College of Life Science, Northeast Forestry University, Harbin, 150040, China.

College of Computer and Control Engineering, Northeast Forestry University, Harbin, 150040, China.

Publication

Genome Biol. 2025 Sep 3;26(1):265. doi: 10.1186/s13059-025-03719-y.


DOI: 10.1186/s13059-025-03719-y
PMID: 40903792

Abstract

BACKGROUND: Differences in data distribution, feature dimensions, and quality between single-cell modalities pose challenges for clustering. Although clustering algorithms have been developed for single-cell transcriptomic or proteomic data, their performance across different omics data types and integration scenarios remains poorly investigated, which limits both method selection and future method development.

RESULTS: In this study, we conduct a systematic, comparative benchmark analysis of 28 computational algorithms on 10 paired transcriptomic and proteomic datasets, evaluating their performance in terms of clustering metrics, peak memory, and running time. We also discuss the impact of highly variable genes (HVGs) and cell type granularity on clustering performance. Additionally, the robustness of these clustering methods on the two omics types is evaluated using 30 simulated datasets. Furthermore, to explore the benefits of integrating omics information for clustering tasks, we integrate single-cell transcriptomic and proteomic data using 7 state-of-the-art integration methods and assess the performance of existing single-omics clustering schemes on the integrated features.

CONCLUSIONS: Our findings reveal modality-specific strengths and limitations, highlight the complementary nature of existing methods, and provide actionable insights to guide the selection of appropriate clustering approaches for specific scenarios. Overall, for top performance across the two omics types, consider scAIDE, scDCC, and FlowSOM, with FlowSOM also offering excellent robustness. For users prioritizing memory efficiency, scDCC and scDeepCluster are recommended; for users prioritizing time efficiency, TSCAN, SHARP, and MarkovHC are recommended; and community detection-based methods offer a balance between the two.
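The three evaluation axes named in the abstract (clustering quality, running time, peak memory) can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not say which clustering metrics were used, so the adjusted Rand index here is an assumption, and `benchmark`, `threshold_cluster`, and the toy data are hypothetical names made up for illustration. Note that `tracemalloc` only tracks memory allocated by Python itself, so it would likely undercount methods backed by native libraries.

```python
import time
import tracemalloc
from collections import Counter
from math import comb


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(truth) != len(pred):
        raise ValueError("label lists must have the same length")
    n = len(truth)
    # Pair counts from the contingency table and from its row/column margins.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    total = comb(n, 2)
    if total == 0:
        return 1.0
    expected = sum_rows * sum_cols / total
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (sum_cells - expected) / (max_index - expected)


def benchmark(cluster_fn, data, truth):
    """Run one clustering function; record ARI, wall time, and peak memory."""
    tracemalloc.start()
    start = time.perf_counter()
    pred = cluster_fn(data)
    seconds = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "ari": adjusted_rand_index(truth, pred),
        "seconds": seconds,
        "peak_bytes": peak_bytes,
    }


def threshold_cluster(values, cut=0.5):
    """Toy stand-in for a real clustering method: split 1-D values at `cut`."""
    return [int(v > cut) for v in values]


# Hypothetical toy data: six "cells" with one feature and known cell types.
toy_data = [0.1, 0.2, 0.15, 0.9, 0.8, 0.95]
toy_truth = ["A", "A", "A", "B", "B", "B"]
result = benchmark(threshold_cluster, toy_data, toy_truth)
```

In a real comparison, `cluster_fn` would wrap each method under test and the same harness would be repeated per dataset and modality, so that accuracy, time, and memory are measured under identical conditions.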

