Department of Statistics, University of California, Los Angeles, CA 90095-1554, USA.
Department of Statistics, University of California, Los Angeles, CA 90095-1554, USA; Department of Human Genetics, University of California, Los Angeles, CA 90095-7088, USA; Department of Computational Medicine, University of California, Los Angeles, CA 90095-1766, USA.
Cell Syst. 2021 Feb 17;12(2):176-194.e6. doi: 10.1016/j.cels.2020.11.008. Epub 2020 Dec 17.
In single-cell RNA sequencing (scRNA-seq), doublets form when two cells are encapsulated into one reaction volume. The existence of doublets, which appear to be, but are not, real cells, is a key confounder in scRNA-seq data analysis. Computational methods have been developed to detect doublets in scRNA-seq data; however, the scRNA-seq field lacks a comprehensive benchmarking of these methods, making it difficult for researchers to choose an appropriate method for specific analyses. We conducted a systematic benchmark study of nine cutting-edge computational doublet-detection methods. Our study included 16 real datasets, which contained experimentally annotated doublets, and 112 realistic synthetic datasets. We compared doublet-detection methods regarding detection accuracy under various experimental settings, impacts on downstream analyses, and computational efficiencies. Our results show that existing methods exhibited diverse performance and distinct advantages in different aspects. Overall, the DoubletFinder method has the best detection accuracy, and the cxds method has the highest computational efficiency. A record of this paper's transparent peer review process is included in the Supplemental Information.