IEEE/ACM Trans Comput Biol Bioinform. 2024 Sep-Oct;21(5):1492-1503. doi: 10.1109/TCBB.2024.3405731. Epub 2024 Oct 9.
Single-cell RNA sequencing (scRNA-seq) is a powerful technique for analyzing gene expression at the level of individual cells, allowing cellular heterogeneity and subpopulations to be identified. However, it suffers from technical limitations that result in sparse and heterogeneous data. Here, we propose scVSC, an unsupervised clustering algorithm built on deep representation neural networks. The method incorporates variational inference into the subspace model, which imposes regularization constraints on the latent space and helps prevent overfitting. In a series of experiments across multiple datasets, scVSC outperforms existing state-of-the-art unsupervised and semi-supervised clustering tools in both clustering accuracy and running efficiency. Moreover, the study indicates that scVSC can visually reveal the state of trajectory differentiation, accurately identify differentially expressed genes, and discover biologically critical pathways.
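The abstract does not spell out the scVSC objective, so the following is only an illustrative sketch of how a variational regularizer might be combined with a self-expressive subspace term. It assumes a diagonal Gaussian posterior with a standard normal prior, and it evaluates the self-expression error on the latent means rather than on sampled codes. The function names, the `beta` weight, and the form of the loss are hypothetical and are not taken from the paper.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """Mean over cells of KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    per_cell = [
        0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu_i, lv_i))
        for mu_i, lv_i in zip(mu, log_var)
    ]
    return sum(per_cell) / len(per_cell)


def self_expression_error(z, coef):
    """Per-cell average squared error of rebuilding each latent vector
    as a linear combination of the OTHER cells' latent vectors."""
    n, d = len(z), len(z[0])
    total = 0.0
    for i in range(n):
        for k in range(d):
            # Skip j == i: a cell may not be used to represent itself.
            recon = sum(coef[i][j] * z[j][k] for j in range(n) if j != i)
            total += (z[i][k] - recon) ** 2
    return total / n


def regularized_loss(mu, log_var, coef, beta=1.0):
    """Self-expression error on the latent means plus a beta-weighted KL penalty.

    Assumption: this is a generic objective for illustration, not the
    loss used by scVSC.
    """
    return self_expression_error(mu, coef) + beta * kl_to_standard_normal(mu, log_var)
```

In this sketch the KL term pulls each cell's latent distribution toward the prior, which is one common way a variational model constrains its latent space. The coefficient matrix `coef` plays the role of the subspace affinity from which clusters would be derived, for example by spectral clustering.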