A benchmark of batch-effect correction methods for single-cell RNA sequencing data.

Affiliation

Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), 8A Biomedical Grove, Immunos Building, Level 3, Singapore, 138648, Singapore.

Publication information

Genome Biol. 2020 Jan 16;21(1):12. doi: 10.1186/s13059-019-1850-9.

DOI:10.1186/s13059-019-1850-9

PMID:31948481

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6964114/

Abstract

BACKGROUND

Large-scale single-cell transcriptomic datasets generated using different technologies contain batch-specific systematic variations that present a challenge to batch-effect removal and data integration. With continued growth expected in scRNA-seq data, achieving effective batch integration with available computational resources is crucial. Here, we perform an in-depth benchmark study on available batch correction methods to determine the most suitable method for batch-effect removal.

RESULTS

We compare 14 methods in terms of computational runtime, the ability to handle large datasets, and batch-effect correction efficacy while preserving cell type purity. Five scenarios are designed for the study: identical cell types with different technologies, non-identical cell types, multiple batches, big data, and simulated data. Performance is evaluated using four benchmarking metrics: the k-nearest-neighbor batch-effect test (kBET), local inverse Simpson's index (LISI), average silhouette width (ASW), and adjusted Rand index (ARI). We also investigate the use of batch-corrected data to study differential gene expression.
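To make two of these metrics concrete, the following is a minimal sketch in plain Python, not the benchmark's own code. It computes the adjusted Rand index between two labelings and an unweighted inverse Simpson's index over a set of labels. The function names and the toy labels are illustrative assumptions. The published LISI weights neighbors by distance, which this simplified version does not do.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: agreement is trivial
    # Pairs of cells that share a label in both labelings, in A only, in B only.
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = same_a * same_b / total_pairs
    maximum = (same_a + same_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (same_both - expected) / (maximum - expected)


def inverse_simpson(labels):
    """Unweighted inverse Simpson's index: effective number of distinct labels."""
    n = len(labels)
    return 1.0 / sum((c / n) ** 2 for c in Counter(labels).values())


# Toy data (assumed for illustration): six cells, two cell types, two batches.
cell_types = ["T", "T", "T", "B", "B", "B"]
batches = ["a", "b", "a", "b", "a", "b"]
clusters = [0, 0, 0, 1, 1, 1]  # clustering after a hypothetical correction

ari_cell_type = adjusted_rand_index(cell_types, clusters)  # high is desirable
ari_batch = adjusted_rand_index(batches, clusters)         # near zero is desirable
mixing = inverse_simpson(["a", "a", "b", "b"])             # batch labels in one neighborhood
```

In this toy case the clusters reproduce the cell types exactly, so the cell-type ARI is 1, while the batch ARI is close to zero because the batches are spread across both clusters. An inverse Simpson's index near the number of batches suggests that a neighborhood is well mixed.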

CONCLUSION

Based on our results, Harmony, LIGER, and Seurat 3 are the recommended methods for batch integration. Due to its significantly shorter runtime, Harmony is recommended as the first method to try, with the other methods as viable alternatives.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f80/6964114/08c7191348ee/13059_2019_1850_Fig1_HTML.jpg
