Calico Life Sciences LLC, South San Francisco, CA, USA.
Cell Syst. 2020 Jul 22;11(1):95-101.e5. doi: 10.1016/j.cels.2020.05.010. Epub 2020 Jun 26.
Single-cell RNA sequencing (scRNA-seq) measurements of gene expression enable an unprecedented high-resolution view into cellular state. However, current methods often result in two or more cells that share the same cell-identifying barcode; these "doublets" violate the fundamental premise of single-cell technology and can lead to incorrect inferences. Here, we describe Solo, a semi-supervised deep learning approach that identifies doublets with greater accuracy than existing methods. Solo embeds cells unsupervised using a variational autoencoder and then appends a feed-forward neural network layer to the encoder to form a supervised classifier. We train this classifier to distinguish simulated doublets from the observed data. Solo can be applied in combination with experimental doublet detection methods to further purify scRNA-seq data to true single cells. It is freely available from https://github.com/calico/solo. A record of this paper's transparent peer review process is included in the Supplemental Information.
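The training setup described above, in which a classifier learns to separate simulated doublets from observed cells, can be illustrated with a minimal sketch. The abstract does not say how doublets are simulated; summing the raw counts of two randomly chosen observed cells is an assumption made here for illustration. The function names `simulate_doublets` and `build_training_set` are hypothetical and are not taken from the Solo code base. The variational autoencoder and the classifier are omitted; the sketch only builds the labeled data that such a classifier would be trained on.

```python
import numpy as np


def simulate_doublets(counts, n_doublets, rng):
    """Create synthetic doublets by summing the counts of random cell pairs.

    Assumption: a doublet's expression profile is approximated by the sum
    of two observed profiles. The abstract does not specify this.
    """
    n_cells = counts.shape[0]
    # Draw two distinct parent cells for each simulated doublet.
    parents = np.array(
        [rng.choice(n_cells, size=2, replace=False) for _ in range(n_doublets)]
    )
    doublets = counts[parents[:, 0]] + counts[parents[:, 1]]
    return doublets, parents


def build_training_set(counts, n_doublets, seed=0):
    """Stack observed cells (label 0) and simulated doublets (label 1)."""
    rng = np.random.default_rng(seed)
    doublets, parents = simulate_doublets(counts, n_doublets, rng)
    features = np.vstack([counts, doublets])
    labels = np.concatenate(
        [np.zeros(counts.shape[0], dtype=int), np.ones(n_doublets, dtype=int)]
    )
    return features, labels, parents


# Toy count matrix standing in for real scRNA-seq data: 100 cells x 50 genes.
toy_rng = np.random.default_rng(42)
observed = toy_rng.poisson(lam=2.0, size=(100, 50))

X, y, parents = build_training_set(observed, n_doublets=30)
```

Note that the observed cells carry label 0 even though some of them may be real doublets, which is why the abstract calls the approach semi-supervised. In the method as described, the classifier would operate on the encoder's embedding of `X` rather than on the raw counts used here.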