Department of Computer Science, Stanford University, Stanford, CA, USA.
Department of Biomedical Informatics, Harvard University, Boston, MA, USA.
Nat Methods. 2020 Dec;17(12):1200-1206. doi: 10.1038/s41592-020-00979-3. Epub 2020 Oct 19.
Although tremendous effort has been put into cell-type annotation, identification of previously uncharacterized cell types in heterogeneous single-cell RNA-seq data remains a challenge. Here we present MARS, a meta-learning approach for identifying and annotating known as well as new cell types. MARS overcomes the heterogeneity of cell types by transferring latent cell representations across multiple datasets. MARS uses deep learning to learn a cell embedding function as well as a set of landmarks in the cell embedding space. The method has a unique ability to discover cell types that have never been seen before and annotate experiments that are as yet unannotated. We apply MARS to a large mouse cell atlas and show its ability to accurately identify cell types, even when it has never seen them before. Further, MARS automatically generates interpretable names for new cell types by probabilistically defining a cell type in the embedding space.
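The abstract describes learning a cell embedding together with landmarks in the embedding space, and defining a cell type probabilistically in that space. The following is a minimal sketch of that landmark idea only, not the MARS implementation: it assumes the embeddings and landmarks are already available as NumPy arrays, and it converts squared Euclidean distances into soft assignments with a softmax. The function names, the `temperature` parameter, and the toy data are hypothetical and chosen for illustration.

```python
import numpy as np


def landmark_probabilities(embeddings, landmarks, temperature=1.0):
    """Soft-assign each embedded cell to a landmark (illustrative assumption).

    embeddings: array of shape (n_cells, dim)
    landmarks:  array of shape (n_landmarks, dim)
    Returns an (n_cells, n_landmarks) array whose rows sum to 1.
    """
    # Pairwise squared Euclidean distances via broadcasting.
    diffs = embeddings[:, None, :] - landmarks[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=-1)

    # Closer landmarks receive higher probability (softmax over -distance).
    logits = -sq_dist / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def assign_cells(embeddings, landmarks):
    """Return the most probable landmark index per cell and the probabilities."""
    probs = landmark_probabilities(embeddings, landmarks)
    return probs.argmax(axis=1), probs


# Toy data: two hypothetical landmarks and three embedded cells in 2-D.
landmarks = np.array([[0.0, 0.0], [5.0, 5.0]])
cells = np.array([[0.2, -0.1], [4.8, 5.3], [0.5, 0.4]])
labels, probs = assign_cells(cells, landmarks)
```

Here `labels` holds the index of the nearest landmark for each cell, and each row of `probs` gives a soft membership over landmarks. In a real pipeline the embedding function and the landmarks would be learned jointly, as the abstract states, rather than fixed by hand as in this toy example.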