Pu Juhua, Wang Bingchen, Liu Xingwu, Chen Lingxi, Li Shuai Cheng
State Key Laboratory of Software Development Environment, Beihang University, Beijing, China.
Beihang Hangzhou Innovation Institute Yuhang, Xixi Octagon City, Yuhang District, Hangzhou 310023, China.
Brief Bioinform. 2023 Mar 19;24(2). doi: 10.1093/bib/bbad026.
Advances in single-cell RNA sequencing (scRNA-seq) have shed light on cell-specific transcriptomic studies of cell development, complex diseases and cancers. Nevertheless, scRNA-seq techniques suffer from 'dropout' events, and imputation tools have been proposed to address the resulting sparsity. Here, rather than imputing, we propose a tool, SMURF, that extracts low-dimensional embeddings of cells and genes using matrix factorization, with a mixture of Poisson-Gamma divergence as the objective, while preserving self-consistency. With the obtained cell embeddings, SMURF shows feasible cell subpopulation discovery on replicated in silico datasets and on eight wet-lab scRNA-seq datasets with ground-truth cell types. Furthermore, SMURF can reduce the cell embedding to a one-dimensional oval space to recover the time course of the cell cycle. SMURF can also serve as an imputation tool: in the in silico assessment, SMURF shows the most robust gene expression recovery, with low root mean square error and high Pearson correlation. Moreover, SMURF recovers the gene distribution for the WM989 Drop-seq data. SMURF is available at https://github.com/deepomicslab/SMURF.
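As a rough illustration of the general idea of factorizing a sparse count matrix into gene and cell embeddings, the sketch below fits a plain Poisson (generalized Kullback-Leibler) non-negative matrix factorization with the classical multiplicative updates. It is not SMURF's algorithm: the Poisson-Gamma mixture objective and the self-consistency constraint described above are not implemented here, and the names `poisson_nmf` and `kl_divergence`, the rank `k`, and the simulated data are all assumptions made for this example.

```python
import numpy as np


def kl_divergence(X, X_hat, eps=1e-10):
    """Generalized KL divergence D(X || X_hat), the Poisson deviance up to a constant."""
    return float(np.sum(X * np.log((X + eps) / (X_hat + eps)) - X + X_hat))


def poisson_nmf(X, k=2, n_iter=200, seed=0, eps=1e-10):
    """Factorize a non-negative genes-by-cells matrix X as G @ C.

    G (genes x k) holds gene embeddings and C (k x cells) holds cell embeddings.
    Uses the multiplicative updates for the KL objective; a stand-in for, not a
    reimplementation of, the objective used by SMURF.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_cells = X.shape
    G = rng.random((n_genes, k)) + eps
    C = rng.random((k, n_cells)) + eps
    for _ in range(n_iter):
        R = X / (G @ C + eps)                      # ratio of observed to fitted counts
        G *= (R @ C.T) / (C.sum(axis=1) + eps)     # update gene embeddings
        R = X / (G @ C + eps)
        C *= (G.T @ R) / (G.sum(axis=0)[:, None] + eps)  # update cell embeddings
    return G, C


if __name__ == "__main__":
    # Simulated data (hypothetical): low-rank Poisson rates with 30% of entries zeroed out
    # to mimic dropout.
    rng = np.random.default_rng(1)
    rate = rng.gamma(2.0, 1.0, (30, 3)) @ rng.gamma(2.0, 1.0, (3, 20))
    counts = rng.poisson(rate).astype(float)
    observed = counts * (rng.random(counts.shape) > 0.3)

    G, C = poisson_nmf(observed, k=3)
    recovered = G @ C

    # The two recovery metrics named in the abstract, computed against the true rates.
    rmse = np.sqrt(np.mean((recovered - rate) ** 2))
    pearson = np.corrcoef(recovered.ravel(), rate.ravel())[0, 1]
    print(f"RMSE = {rmse:.3f}, Pearson r = {pearson:.3f}")
```

In this sketch the columns of `C` play the role of cell embeddings that could be passed to a clustering step, and the product `G @ C` plays the role of an imputed expression matrix.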