Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, USA.
Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, USA; Chan Zuckerberg Biohub, University of California, San Francisco, San Francisco, CA, USA; Center for Cellular Construction, University of California, San Francisco, San Francisco, CA, USA.
Cell Syst. 2019 Apr 24;8(4):329-337.e4. doi: 10.1016/j.cels.2019.03.003. Epub 2019 Apr 3.
Single-cell RNA sequencing (scRNA-seq) data are commonly affected by technical artifacts known as "doublets," which limit cell throughput and lead to spurious biological conclusions. Here, we present a computational doublet detection tool, DoubletFinder, that identifies doublets using only gene expression data. DoubletFinder predicts doublets according to each real cell's proximity in gene expression space to artificial doublets created by averaging the transcriptional profile of randomly chosen cell pairs. We first use scRNA-seq datasets where the identity of doublets is known to show that DoubletFinder identifies doublets formed from transcriptionally distinct cells. When these doublets are removed, the identification of differentially expressed genes is enhanced. Second, we provide a method for estimating DoubletFinder input parameters, allowing its application across scRNA-seq datasets with diverse distributions of cell types. Lastly, we present "best practices" for DoubletFinder applications and illustrate that DoubletFinder is insensitive to an experimentally validated kidney cell type with "hybrid" expression features.
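The core idea described in the abstract (averaging random cell pairs into artificial doublets, then scoring each real cell by its proximity to them) can be illustrated with a minimal Python sketch. This is not the DoubletFinder implementation: the function name `artificial_neighbor_fraction`, the parameters `n_artificial` and `k`, and the use of plain Euclidean distance on the input matrix are all assumptions made for illustration, and any normalization or dimensionality reduction that a real pipeline would apply is omitted.

```python
import numpy as np


def artificial_neighbor_fraction(expr, n_artificial, k, seed=0):
    """Score each real cell by the share of artificial doublets among its
    k nearest neighbors (hypothetical helper, illustrative only).

    expr: (n_cells, n_features) array of real-cell expression profiles.
    Returns an array of length n_cells with values in [0, 1].
    """
    expr = np.asarray(expr, dtype=float)
    rng = np.random.default_rng(seed)
    n_real = expr.shape[0]

    # Pick random pairs of distinct cells: the second index is the first
    # shifted by a non-zero offset, so a cell is never paired with itself.
    first = rng.integers(0, n_real, size=n_artificial)
    offset = rng.integers(1, n_real, size=n_artificial)
    second = (first + offset) % n_real

    # Artificial doublets are the average of the two profiles in each pair.
    artificial = (expr[first] + expr[second]) / 2.0

    # Merge real and artificial profiles; real cells occupy rows 0..n_real-1.
    merged = np.vstack([expr, artificial])

    # Squared Euclidean distance from every real cell to every merged row.
    sq_dist = (
        (expr ** 2).sum(axis=1)[:, None]
        + (merged ** 2).sum(axis=1)[None, :]
        - 2.0 * expr @ merged.T
    )
    # A cell must not count as its own neighbor.
    sq_dist[np.arange(n_real), np.arange(n_real)] = np.inf

    # Indices of the k nearest neighbors of each real cell.
    nearest = np.argsort(sq_dist, axis=1)[:, :k]

    # Rows with index >= n_real are artificial doublets.
    return (nearest >= n_real).mean(axis=1)
```

Under these assumptions, a real cell whose profile sits between two transcriptionally distinct populations will tend to have many artificial doublets among its neighbors and therefore a high score, while doublets formed from similar cells are not expected to stand out. Both `n_artificial` and `k` influence the scores, which is why a principled way of choosing such input parameters matters across datasets with different cell type compositions.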