Centre of Biomedical Systems and Informatics, International Campus, ZJU-UoE Institute, Zhejiang University School of Medicine, Zhejiang University, Haining, Zhejiang, 314400, China.
Department of Statistics and Data Science, University of California, Los Angeles, CA, 90095, USA.
Genome Biol. 2024 May 23;25(1):136. doi: 10.1186/s13059-024-03284-w.
In droplet-based single-cell and single-nucleus RNA-seq assays, systematic contamination by ambient RNA molecules biases the quantification of gene expression levels. Existing methods correct the contamination for all genes globally. However, correction efficacy has not been specifically evaluated across varying contamination levels. Here, we show that DecontX and CellBender under-correct highly contaminating genes, while SoupX and scAR over-correct lowly contaminating and non-contaminating genes. We develop scCDC as the first method to detect the contamination-causing genes and correct the expression levels of only these genes, some of which are cell-type markers. Compared with existing decontamination methods, scCDC excels in decontaminating highly contaminating genes while avoiding over-correction of other genes.