Benchmarking integration of single-cell differential expression.

Affiliations

Department of Biological Sciences, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea.

Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA.

Publication information

Nat Commun. 2023 Mar 21;14(1):1570. doi: 10.1038/s41467-023-37126-3.

DOI:10.1038/s41467-023-37126-3
PMID:36944632
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10030080/
Abstract

Integration of single-cell RNA sequencing data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential expression analysis of single-cell data with multiple batches. We show that batch effects, sequencing depth and data sparsity substantially impact their performances. Notably, we find that the use of batch-corrected data rarely improves the analysis for sparse data, whereas batch covariate modeling improves the analysis for substantial batch effects. We show that for low depth data, single-cell techniques based on zero-inflation model deteriorate the performance, whereas the analysis of uncorrected data using limmatrend, Wilcoxon test and fixed effects model performs well. We suggest several high-performance methods under different conditions based on various simulation and real data analyses. Additionally, we demonstrate that differential expression analysis for a specific cell type outperforms that of large-scale bulk sample data in prioritizing disease-related genes.
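The contrast the abstract draws between analyzing uncorrected data naively and modeling batch as a covariate can be illustrated with a minimal sketch. This is not one of the paper's 46 workflows: the data are made-up log-expression values for a single hypothetical gene, and the function names `naive_effect` and `batch_adjusted_effect` are illustrative assumptions. The adjusted estimate is the ordinary least-squares coefficient for condition in a model with batch fixed effects, computed by centering both condition and expression within each batch.

```python
from collections import defaultdict
from statistics import mean

# Toy data (assumed, not from the paper): (batch, condition, log-expression).
# Batch B has a higher baseline and contains most of the treated cells,
# so batch and condition are partially confounded.
cells = [
    ("A", 0, 1.0), ("A", 0, 1.2), ("A", 0, 0.8), ("A", 0, 1.0), ("A", 1, 1.5),
    ("B", 0, 3.0), ("B", 1, 3.5), ("B", 1, 3.7), ("B", 1, 3.3), ("B", 1, 3.5),
]

def naive_effect(cells):
    """Difference in mean expression, treated minus control, ignoring batch."""
    treated = [y for _, x, y in cells if x == 1]
    control = [y for _, x, y in cells if x == 0]
    return mean(treated) - mean(control)

def batch_adjusted_effect(cells):
    """OLS condition coefficient with batch fixed effects.

    Centering condition and expression within each batch and regressing
    the centered values gives the same coefficient as fitting
    expression ~ condition + batch. Assumes condition varies within at
    least one batch; otherwise the denominator is zero.
    """
    by_batch = defaultdict(list)
    for batch, x, y in cells:
        by_batch[batch].append((x, y))
    num = den = 0.0
    for members in by_batch.values():
        x_bar = mean(x for x, _ in members)
        y_bar = mean(y for _, y in members)
        for x, y in members:
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den

print(round(naive_effect(cells), 3))           # 1.7
print(round(batch_adjusted_effect(cells), 3))  # 0.5
```

In this toy case the within-batch treatment effect is 0.5 in both batches, but the naive comparison reports 1.7 because it absorbs the batch baseline shift. The covariate model recovers 0.5 without altering the expression values themselves.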


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd9e/10030899/0e7ed2f8fb32/41467_2023_37126_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd9e/10030899/a5f295013a4a/41467_2023_37126_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd9e/10030899/0d84f26f6a68/41467_2023_37126_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd9e/10030899/f4e7f474a445/41467_2023_37126_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd9e/10030899/6f79912c1750/41467_2023_37126_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd9e/10030899/39a9347751e0/41467_2023_37126_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd9e/10030899/58e819f34ddc/41467_2023_37126_Fig7_HTML.jpg
