Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Precision Cardiology Laboratory (PCL), Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Nat Methods. 2023 Sep;20(9):1323-1335. doi: 10.1038/s41592-023-01943-7. Epub 2023 Aug 7.
Droplet-based single-cell assays, including single-cell RNA sequencing (scRNA-seq), single-nucleus RNA sequencing (snRNA-seq) and cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), generate considerable background noise counts, the hallmark of which is nonzero counts in cell-free droplets and off-target gene expression in unexpected cell types. Such systematic background noise can lead to batch effects and spurious differential gene expression results. Here we develop a deep generative model based on the phenomenology of noise generation in droplet-based assays. The proposed model accurately distinguishes cell-containing droplets from cell-free droplets, learns the background noise profile and provides noise-free quantification in an end-to-end fashion. We implement this approach in the scalable and robust open-source software package CellBender. Analysis of simulated data demonstrates that CellBender operates near the theoretically optimal denoising limit. Extensive evaluations using real datasets and experimental benchmarks highlight enhanced concordance between droplet-based single-cell data and established gene expression patterns, while the learned background noise profile provides evidence of degraded or uncaptured cell types.
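The observation that cell-free droplets carry nonzero counts can be illustrated with a small simulation. The following is a minimal sketch, not the CellBender model: CellBender infers cell-containing droplets and the background profile jointly with a deep generative model, whereas this sketch uses a fixed total-count cutoff and simple pooling. The function name `estimate_ambient_profile`, the cutoff `empty_max_umis`, and all simulated numbers are assumptions introduced here for illustration.

```python
import numpy as np


def estimate_ambient_profile(counts, empty_max_umis):
    """Estimate a per-gene background profile from presumed cell-free droplets.

    counts: (n_droplets, n_genes) array of UMI counts.
    empty_max_umis: hypothetical cutoff; droplets whose total count lies in
    [1, empty_max_umis] are treated as cell-free.
    Returns the normalized pooled profile and the boolean empty-droplet mask.
    """
    counts = np.asarray(counts)
    totals = counts.sum(axis=1)
    empty = (totals > 0) & (totals <= empty_max_umis)
    if not empty.any():
        raise ValueError("no droplets fall under the empty-droplet cutoff")
    pooled = counts[empty].sum(axis=0).astype(float)
    return pooled / pooled.sum(), empty


# Toy simulation (all numbers are illustrative, not taken from the paper).
rng = np.random.default_rng(0)
n_genes = 50
true_ambient = rng.dirichlet(np.ones(n_genes))   # background noise profile
cell_profile = rng.dirichlet(np.ones(n_genes))   # endogenous expression profile

# 500 cell-free droplets, each holding 20 background UMIs.
empty_counts = rng.multinomial(20, true_ambient, size=500)
# 50 cell-containing droplets: 2000 endogenous UMIs plus 20 background UMIs.
cell_counts = (rng.multinomial(2000, cell_profile, size=50)
               + rng.multinomial(20, true_ambient, size=50))

counts = np.vstack([empty_counts, cell_counts])
ambient_hat, is_empty = estimate_ambient_profile(counts, empty_max_umis=100)
```

In this toy setting the pooled profile from the low-count droplets tracks the simulated background closely. A hard cutoff of this kind is fragile on real data, where the boundary between cell-containing and cell-free droplets is not sharp, which is part of the motivation for learning both in one model.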