Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA.
Bioinformatics. 2020 May 1;36(10):3115-3123. doi: 10.1093/bioinformatics/btaa097.
Batch effects are a frequent challenge in deep sequencing data analysis and can lead to misleading conclusions. Existing methods do not correct batch effects satisfactorily, especially in single-cell RNA sequencing (RNA-seq) data.
We present scBatch, a numerical algorithm for batch-effect correction on bulk and single-cell RNA-seq data with emphasis on improving both clustering and gene differential expression analysis. scBatch is not restricted by assumptions on the mechanism of batch-effect generation. As shown in simulations and real data analyses, scBatch outperforms benchmark batch-effect correction methods.
The R package is available at github.com/tengfei-emory/scBatch. The code to generate results and figures in this article is available at github.com/tengfei-emory/scBatch-paper-scripts.
Supplementary data are available at Bioinformatics online.