A benchmark study of simulation methods for single-cell RNA sequencing data.

Affiliations

Charles Perkins Centre, The University of Sydney, Sydney, Australia.

School of Mathematics and Statistics, The University of Sydney, Sydney, Australia.

Publication information

Nat Commun. 2021 Nov 25;12(1):6911. doi: 10.1038/s41467-021-27130-w.

DOI:10.1038/s41467-021-27130-w
PMID:34824223
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8617278/
Abstract

Single-cell RNA-seq (scRNA-seq) data simulation is critical for evaluating computational methods for analysing scRNA-seq data especially when ground truth is experimentally unattainable. The reliability of evaluation depends on the ability of simulation methods to capture properties of experimental data. However, while many scRNA-seq data simulation methods have been proposed, a systematic evaluation of these methods is lacking. We develop a comprehensive evaluation framework, SimBench, including a kernel density estimation measure to benchmark 12 simulation methods through 35 scRNA-seq experimental datasets. We evaluate the simulation methods on a panel of data properties, ability to maintain biological signals, scalability and applicability. Our benchmark uncovers performance differences among the methods and highlights the varying difficulties in simulating data characteristics. Furthermore, we identify several limitations including maintaining heterogeneity of distribution. These results, together with the framework and datasets made publicly available as R packages, will guide simulation methods selection and their future development.
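The abstract mentions a kernel density estimation measure for comparing simulated and experimental data. The Python sketch below illustrates the general idea only: it is not SimBench's API (SimBench is distributed as an R package) and not necessarily the statistic the authors use. It assumes a single data property (per-cell log library size), toy count matrices drawn from NumPy random generators, and an integrated squared difference between two Gaussian kernel density estimates as the discrepancy score. The function names `kde_discrepancy` and `log_library_size` are hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde


def kde_discrepancy(real_values, sim_values, grid_size=512):
    """Integrated squared difference between two 1-D kernel density estimates.

    A score of 0 means the two estimated densities coincide on the grid;
    larger scores mean the simulated property departs more from the real one.
    This is an illustrative score, not the statistic used by SimBench.
    """
    real_values = np.asarray(real_values, dtype=float)
    sim_values = np.asarray(sim_values, dtype=float)

    # Evaluate both densities on a shared grid that covers both samples.
    lo = min(real_values.min(), sim_values.min())
    hi = max(real_values.max(), sim_values.max())
    pad = 0.1 * (hi - lo)
    grid = np.linspace(lo - pad, hi + pad, grid_size)

    f_real = gaussian_kde(real_values)(grid)
    f_sim = gaussian_kde(sim_values)(grid)

    # Riemann-sum approximation of the integral of (f_real - f_sim)^2.
    dx = grid[1] - grid[0]
    return float(np.sum((f_real - f_sim) ** 2) * dx)


def log_library_size(counts):
    """Per-cell log10 total count for a genes x cells count matrix."""
    return np.log10(counts.sum(axis=0) + 1)


rng = np.random.default_rng(0)

# Toy stand-ins for count matrices (genes x cells); not real scRNA-seq data.
real = rng.negative_binomial(n=2, p=0.1, size=(200, 300))
good_sim = rng.negative_binomial(n=2, p=0.1, size=(200, 300))  # same model
poor_sim = rng.poisson(lam=5.0, size=(200, 300))               # different model

score_good = kde_discrepancy(log_library_size(real), log_library_size(good_sim))
score_poor = kde_discrepancy(log_library_size(real), log_library_size(poor_sim))

print(f"well-matched simulator: {score_good:.4f}")
print(f"mismatched simulator:  {score_poor:.4f}")
```

In a benchmark of this kind, the same comparison would be repeated over a panel of data properties and over many datasets, and the per-property scores would then be ranked across simulation methods.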

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9464/8617278/669f374a0a34/41467_2021_27130_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9464/8617278/7ad114d26633/41467_2021_27130_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9464/8617278/f038d796dc0c/41467_2021_27130_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9464/8617278/3e1ac83ae47b/41467_2021_27130_Fig4_HTML.jpg
