Department of Bioengineering, University of California, San Diego, San Diego, CA, USA.
Moores Cancer Center, University of California, San Diego, San Diego, CA, USA; School of Medicine, University of California, San Diego, San Diego, CA, USA.
Cell Syst. 2018 Dec 26;7(6):656-666.e4. doi: 10.1016/j.cels.2018.10.015. Epub 2018 Dec 5.
High-throughput single-cell gene expression profiling has enabled the definition of new cell types and developmental trajectories. Visualizing these datasets is crucial to biological interpretation. A popular method is t-distributed stochastic neighbor embedding (t-SNE), which visualizes local patterns well but distorts global structure, such as distances between clusters. We developed similarity weighted nonnegative embedding (SWNE), which enhances interpretation by embedding the genes and factors that separate cell states on the visualization alongside the cells, and which maintains fidelity to both local and global structure for developmental trajectories as well as discrete cell types. SWNE uses nonnegative matrix factorization to decompose the gene expression matrix into biologically relevant factors; embeds the cells, genes, and factors in a 2D visualization; and uses a similarity matrix to smooth the embeddings. We demonstrate SWNE on single-cell RNA-seq data from hematopoietic progenitors and human brain cells. SWNE is available as an R package at github.com/yanwu2014/swne.
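The three steps named in the abstract (factorize, embed, smooth) can be illustrated with a minimal NumPy sketch. This is not the API of the swne R package, and it should not be read as the authors' implementation: every function name and parameter below (`nmf`, `classical_mds`, `swne_like_embedding`, `alpha`, `n_neighbors`) is hypothetical. The sketch assumes multiplicative-update NMF, places the factors with classical multidimensional scaling, positions cells and genes as weighted averages of the factor coordinates, and smooths cell positions with a row-normalized nearest-neighbor similarity matrix. Each of these choices is an assumption made for illustration.

```python
import numpy as np


def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate nonnegative X (genes x cells) as W @ H via multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], k)) + eps  # gene loadings, genes x k
    H = rng.random((k, X.shape[1])) + eps  # cell scores, k x cells
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def classical_mds(D, dim=2):
    """Embed points with pairwise distance matrix D into `dim` dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered squared distances
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:dim]         # largest eigenvalues first
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))


def swne_like_embedding(X, k=3, alpha=1.0, n_neighbors=5, eps=1e-9):
    """Hypothetical sketch: returns 2D coordinates for factors, cells, and genes."""
    W, H = nmf(X, k)

    # 1. Place the factors in 2D from distances between their unit-normalized cell scores.
    Hn = H / (np.linalg.norm(H, axis=1, keepdims=True) + eps)
    factor_dist = np.linalg.norm(Hn[:, None, :] - Hn[None, :, :], axis=2)
    factor_xy = classical_mds(factor_dist)

    # 2. Place each cell at the weighted average of the factor positions.
    cell_w = H.T ** alpha
    cell_w /= cell_w.sum(axis=1, keepdims=True) + eps
    cell_xy = cell_w @ factor_xy

    # 3. Place each gene the same way, using its loadings in W.
    gene_w = W ** alpha
    gene_w /= gene_w.sum(axis=1, keepdims=True) + eps
    gene_xy = gene_w @ factor_xy

    # 4. Smooth cell positions with a row-normalized nearest-neighbor similarity matrix
    #    (each cell's neighbor set includes the cell itself).
    cell_dist = np.linalg.norm(H.T[:, None, :] - H.T[None, :, :], axis=2)
    nn = np.argsort(cell_dist, axis=1)[:, :n_neighbors]
    S = np.zeros_like(cell_dist)
    np.put_along_axis(S, nn, 1.0, axis=1)
    S /= S.sum(axis=1, keepdims=True)

    return factor_xy, S @ cell_xy, gene_xy
```

Calling `swne_like_embedding(X, k=3)` on a nonnegative genes-by-cells matrix `X` returns three coordinate arrays that can be drawn on one scatter plot, which is the sense in which cells, genes, and factors share a single visualization. Because cells are averages of the factor positions, two clusters that load on distant factors stay far apart on the plot, which is one plausible reading of how global structure is retained.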