
Benchmark and Parameter Sensitivity Analysis of Single-Cell RNA Sequencing Clustering Methods.

Authors

Krzak Monika, Raykov Yordan, Boukouvalas Alexis, Cutillo Luisa, Angelini Claudia

Affiliations

Institute for Applied Mathematics "Mauro Picone", Naples, Italy.

Department of Mathematics, Aston University, Birmingham, United Kingdom.

Publication

Front Genet. 2019 Dec 11;10:1253. doi: 10.3389/fgene.2019.01253. eCollection 2019.

DOI: 10.3389/fgene.2019.01253
PMID: 31921297
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6918801/
Abstract

Single-cell RNA-seq (scRNAseq) is a powerful tool to study heterogeneity of cells. Recently, several clustering-based methods have been proposed to identify distinct cell populations. These methods are based on different statistical models and usually require several additional steps, such as preprocessing or dimension reduction, before the clustering algorithm is applied. Individual steps are often controlled by method-specific parameters, permitting the method to be used in different modes on the same datasets, depending on the user's choices. The large number of possibilities that these methods provide can intimidate non-expert users, since the available choices are not always clearly documented. In addition, to date, no large studies have investigated the role and the impact that these choices can have in different experimental contexts. This work aims to provide new insights into the advantages and drawbacks of scRNAseq clustering methods and to describe the range of possibilities offered to users. In particular, we provide an extensive evaluation of several methods with respect to different modes of usage and parameter settings by applying them to real and simulated datasets that vary in terms of dimensionality, number of cell populations, or levels of noise. Remarkably, the results presented here show that the great variability in the performance of the models is largely attributable to the choice of user-specific parameter settings. We describe several tendencies in performance attributed to the modes of usage and the different types of datasets, and identify which methods are strongly affected by data dimensionality in terms of computational time. Finally, we highlight some open challenges in scRNAseq data clustering, such as those related to the identification of the number of clusters.
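
The abstract's central claim, that clustering performance depends strongly on user-chosen parameter settings, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' pipeline: the abstract does not name an evaluation metric, so the adjusted Rand index is used here as one common choice, and `gap_clustering` with its `max_gap` parameter is a hypothetical 1-D stand-in for a real scRNAseq clustering method.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index between two labelings of the same cells."""
    total = comb(len(truth), 2)
    if total == 0:
        return 1.0  # fewer than two cells: labelings trivially agree
    # Pairs of cells grouped together in both, in truth only, in pred only.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / total
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (sum_cells - expected) / (max_index - expected)


def gap_clustering(values, max_gap):
    """Hypothetical toy method: sort 1-D values and start a new cluster
    whenever two neighbouring values differ by more than max_gap."""
    order = sorted(range(len(values)), key=values.__getitem__)
    labels = [0] * len(values)
    cluster = 0
    for prev, cur in zip(order, order[1:]):
        if values[cur] - values[prev] > max_gap:
            cluster += 1
        labels[cur] = cluster
    return labels


# Made-up data: three well-separated "cell populations" of three cells each.
values = [0.0, 0.2, 0.4, 3.0, 3.2, 3.4, 6.0, 6.2, 6.4]
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]

# Sweep the single user-facing parameter and score each setting.
scores = {
    max_gap: adjusted_rand_index(truth, gap_clustering(values, max_gap))
    for max_gap in (0.1, 1.0, 5.0)
}
print(scores)
```

On this made-up data the same method scores 0.0 when `max_gap` is too small (every cell becomes its own cluster), 1.0 at an intermediate setting, and 0.0 again when `max_gap` is too large (all cells merge). This is the kind of parameter-driven variability the abstract describes, reduced to a single parameter.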

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6164/6918801/1e2963bdeed7/fgene-10-01253-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6164/6918801/26afc512be5d/fgene-10-01253-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6164/6918801/6ddc8840813b/fgene-10-01253-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6164/6918801/5bc9cd0b183a/fgene-10-01253-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6164/6918801/3a8808444d64/fgene-10-01253-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6164/6918801/cbcc45cf88e7/fgene-10-01253-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6164/6918801/8a02ca08657c/fgene-10-01253-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6164/6918801/86f6976eca73/fgene-10-01253-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6164/6918801/92ea9d87c76c/fgene-10-01253-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6164/6918801/3a3a473e3046/fgene-10-01253-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6164/6918801/2bd990929f92/fgene-10-01253-g011.jpg
