Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Howard Hughes Medical Institute, Koch Institute of Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA.
Nat Commun. 2021 May 5;12(1):2554. doi: 10.1038/s41467-021-22851-4.
Single-cell RNA-seq (scRNA-seq) is invaluable for studying biological systems. Dimensionality reduction is a crucial step in interpreting the relationships between cells in scRNA-seq data. However, current dimensionality reduction methods are often confounded by multiple simultaneous sources of technical and biological variability, result in "crowding" of cells in the center of the latent space, or inadequately capture temporal relationships. Here, we introduce scPhere, a scalable deep generative model that embeds cells into low-dimensional hyperspherical or hyperbolic spaces to accurately represent scRNA-seq data. scPhere addresses multi-level, complex batch factors, facilitates the interactive visualization of large datasets, resolves cell crowding, and uncovers temporal trajectories. We demonstrate scPhere on nine large datasets from complex tissues of human patients or developing animals. Our results show how scPhere facilitates the interpretation of scRNA-seq data by generating batch-invariant embeddings to map data from new individuals, identifying cell types affected by biological variables, inferring cells' spatial positions in pre-defined biological specimens, and highlighting complex cellular relations.
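To make the idea of a hyperspherical embedding concrete, the following is a minimal geometric sketch, not scPhere's model. It assumes hypothetical 3-D latent vectors drawn at random, scales each one to unit length so that it lies on the surface of a sphere, and measures the great-circle distance between points. The function names `project_to_sphere` and `geodesic_distance` are illustrative and do not come from the scPhere software.

```python
import numpy as np


def project_to_sphere(x):
    """Scale each row of x to unit Euclidean norm so it lies on the unit hypersphere."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    return x / np.clip(norms, 1e-12, None)


def geodesic_distance(u, v):
    """Great-circle (angular) distance in radians between unit vectors u and v."""
    cos_angle = np.clip(np.sum(u * v, axis=-1), -1.0, 1.0)
    return np.arccos(cos_angle)


# Hypothetical latent vectors for 5 cells (random values, for illustration only).
rng = np.random.default_rng(0)
latent = rng.normal(size=(5, 3))
on_sphere = project_to_sphere(latent)

# Distance between the first two cells, measured along the sphere's surface.
d01 = geodesic_distance(on_sphere[0], on_sphere[1])
```

Because every point sits at the same distance from the origin, there is no central region of the space for cells to pile into, which gives one intuition for how a spherical latent space can avoid the "crowding" described above.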