Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA.
Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA; Centre de Mathématiques Appliquées, École polytechnique, Palaiseau 91120, France.
Cell Syst. 2019 Apr 24;8(4):281-291.e9. doi: 10.1016/j.cels.2018.11.005. Epub 2019 Apr 3.
Single-cell RNA-sequencing has become a widely used, powerful approach for studying cell populations. However, these methods often generate multiplet artifacts, where two or more cells receive the same barcode, resulting in a hybrid transcriptome. In most experiments, multiplets account for several percent of transcriptomes and can confound downstream data analysis. Here, we present Single-Cell Remover of Doublets (Scrublet), a framework for predicting the impact of multiplets in a given analysis and identifying problematic multiplets. Scrublet avoids the need for expert knowledge or cell clustering by simulating multiplets from the data and building a nearest neighbor classifier. To demonstrate the utility of this approach, we test Scrublet on several datasets that include independent knowledge of cell multiplets. Scrublet is freely available for download at github.com/AllonKleinLab/scrublet.
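The core idea described above (simulate multiplets from the observed data, then score each observed transcriptome by how many of its nearest neighbors are simulated) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the Scrublet implementation: the function names, the log-normalization, the brute-force Euclidean neighbor search, and the parameters `sim_ratio` and `k` are all hypothetical choices made for this sketch, and the published tool may differ in normalization, dimensionality reduction, and scoring.

```python
import numpy as np

def simulate_doublets(counts, n_doublets, rng):
    """Create synthetic doublets by summing the counts of random cell pairs."""
    n_cells = counts.shape[0]
    pairs = rng.integers(0, n_cells, size=(n_doublets, 2))
    return counts[pairs[:, 0]] + counts[pairs[:, 1]]

def normalize(counts):
    """Scale each cell to 10,000 total counts, then log-transform (assumed scheme)."""
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1  # avoid division by zero for empty rows
    return np.log1p(counts / totals * 1e4)

def knn_doublet_scores(counts, sim_ratio=2.0, k=15, seed=0):
    """Score each observed cell by the fraction of its k nearest neighbors
    (among observed cells plus simulated doublets) that are simulated."""
    rng = np.random.default_rng(seed)
    n_obs = counts.shape[0]
    n_sim = int(sim_ratio * n_obs)

    sim = simulate_doublets(counts, n_sim, rng)
    combined = normalize(np.vstack([counts, sim]).astype(float))
    is_sim = np.concatenate([np.zeros(n_obs, dtype=bool),
                             np.ones(n_sim, dtype=bool)])

    # Brute-force squared Euclidean distances from observed cells to all cells.
    obs = combined[:n_obs]
    d2 = ((obs ** 2).sum(axis=1)[:, None]
          - 2.0 * obs @ combined.T
          + (combined ** 2).sum(axis=1)[None, :])
    d2[np.arange(n_obs), np.arange(n_obs)] = np.inf  # a cell is not its own neighbor

    # Indices of the k nearest neighbors of each observed cell (unordered).
    nn = np.argpartition(d2, k - 1, axis=1)[:, :k]
    return is_sim[nn].mean(axis=1)
```

Called as `knn_doublet_scores(counts)` on a cells-by-genes count matrix, the function returns one score in [0, 1] per observed cell; higher scores mean the cell sits in a region dense with simulated doublets. Choosing a threshold on these scores is left out of the sketch. The brute-force distance matrix is only practical for small inputs.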