A benchmark of batch-effect correction methods for single-cell RNA sequencing data.

Affiliation

Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), 8A Biomedical Grove, Immunos Building, Level 3, Singapore, 138648, Singapore.

Publication information

Genome Biol. 2020 Jan 16;21(1):12. doi: 10.1186/s13059-019-1850-9.

Abstract

BACKGROUND

Large-scale single-cell transcriptomic datasets generated using different technologies contain batch-specific systematic variations that present a challenge to batch-effect removal and data integration. With continued growth expected in scRNA-seq data, achieving effective batch integration with available computational resources is crucial. Here, we perform an in-depth benchmark study on available batch correction methods to determine the most suitable method for batch-effect removal.

RESULTS

We compare 14 methods in terms of computational runtime, the ability to handle large datasets, and batch-effect correction efficacy while preserving cell type purity. Five scenarios are designed for the study: identical cell types with different technologies, non-identical cell types, multiple batches, big data, and simulated data. Performance is evaluated using four benchmarking metrics: the k-nearest-neighbor batch-effect test (kBET), the local inverse Simpson's index (LISI), the average silhouette width (ASW), and the adjusted Rand index (ARI). We also investigate the use of batch-corrected data to study differential gene expression.
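As a rough illustration of one of the four metrics, the sketch below computes the adjusted Rand index (ARI) in plain Python. The toy labels, the variable names, and the choice to score the same clustering once against cell-type labels and once against batch labels are assumptions made for illustration; the abstract does not describe the exact evaluation procedure.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to compare; treat as trivially identical

    # Count pairs of cells that share a contingency cell, a label in A, a label in B.
    pairs_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = pairs_a * pairs_b / comb(n, 2)  # agreement expected by chance
    maximum = (pairs_a + pairs_b) / 2
    if maximum == expected:  # degenerate case, e.g. one cluster in both labelings
        return 1.0
    return (pairs_both - expected) / (maximum - expected)


# Hypothetical toy data: six cells, two cell types, two batches.
cell_type = ["T", "T", "T", "B", "B", "B"]
batch = ["b1", "b2", "b1", "b2", "b1", "b2"]
clusters = [0, 0, 0, 1, 1, 1]  # clustering after an idealized correction

ari_cell_type = adjusted_rand_index(cell_type, clusters)  # higher means purer cell types
ari_batch = adjusted_rand_index(batch, clusters)          # near zero means batches are mixed
```

In this toy case the clustering reproduces the cell types exactly, so the cell-type ARI is 1, while the batch ARI falls slightly below zero, which is the pattern one would hope to see after a successful correction.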

CONCLUSION

Based on our results, Harmony, LIGER, and Seurat 3 are the recommended methods for batch integration. Due to its significantly shorter runtime, Harmony is recommended as the first method to try, with the other methods as viable alternatives.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f80/6964114/08c7191348ee/13059_2019_1850_Fig1_HTML.jpg
