School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China.
Comput Biol Chem. 2021 Feb;90:107415. doi: 10.1016/j.compbiolchem.2020.107415. Epub 2020 Nov 18.
Accurate clustering of cells from single-cell RNA sequencing (scRNA-seq) data is an essential step in biological analyses such as putative cell type identification. However, scRNA-seq data are high-dimensional and highly sparse, which makes traditional clustering methods less effective at capturing the similarity between cells. Because the genetic network fundamentally defines the functions of a cell, and deep learning shows strong advantages in network representation learning, we propose ScGSLC, a novel scRNA-seq clustering framework based on graph similarity learning. ScGSLC integrates scRNA-seq data and a protein-protein interaction network into a graph. It then employs a graph convolutional network to embed the graphs and clusters the cells by the computed similarity between graphs. Unsupervised clustering results on nine public data sets demonstrate that ScGSLC outperforms state-of-the-art methods.
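The pipeline described in the abstract (build one graph per cell from expression values and a protein-protein interaction network, embed each graph with a graph convolution, then cluster cells by graph similarity) can be illustrated with a minimal NumPy sketch. This is not the ScGSLC implementation: the function names, the toy network, the toy expression matrix, the choice of node features, and the use of cosine similarity with average-linkage clustering are all assumptions made for illustration, and the layer weights are random and untrained, whereas ScGSLC learns its similarity model.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric GCN normalization: D^-1/2 (A + I) D^-1/2."""
    a = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def embed_cell(expr, a_hat, weights):
    """One graph-convolution layer plus sum pooling for one cell graph.

    Assumption of this sketch: node features are one-hot gene identities
    scaled by the cell's expression values, i.e. X = diag(expr).
    """
    h = np.maximum(a_hat @ np.diag(expr) @ weights, 0.0)  # ReLU(A_hat X W)
    return h.sum(axis=0)                                  # graph-level readout

def cosine_similarity_matrix(emb):
    """Pairwise cosine similarity between graph embeddings."""
    norms = np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    unit = emb / norms
    return unit @ unit.T

def average_linkage(sim, n_clusters):
    """Greedy agglomerative clustering driven by a similarity matrix."""
    clusters = [[i] for i in range(sim.shape[0])]
    while len(clusters) > n_clusters:
        best, pair = -np.inf, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = sim[np.ix_(clusters[i], clusters[j])].mean()
                if s > best:
                    best, pair = s, (i, j)
        i, j = pair
        clusters[i] += clusters.pop(j)      # merge the most similar pair
    labels = np.empty(sim.shape[0], dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

# Toy PPI network over 6 genes: two modules, {0, 1, 2} and {3, 4, 5}.
ppi = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (3, 4), (4, 5)]:
    ppi[u, v] = ppi[v, u] = 1.0

# Toy expression matrix (cells x genes); values are invented.
expr = np.array([
    [1.0, 2.0, 1.0, 0.0, 0.0, 0.0],
    [2.0, 4.0, 2.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 3.0, 1.0, 2.0],
    [0.0, 0.0, 0.0, 6.0, 2.0, 4.0],
])

rng = np.random.default_rng(0)
weights = rng.normal(size=(6, 8))           # untrained, for illustration only

a_hat = normalize_adjacency(ppi)
emb = np.stack([embed_cell(x, a_hat, weights) for x in expr])
sim = cosine_similarity_matrix(emb)
labels = average_linkage(sim, n_clusters=2)
print(labels)
```

In this toy setup, cells that express the same interaction module produce embeddings that point in the same direction, so they receive high pairwise similarity and are merged first. A trained model would replace the random weights and could replace the cosine score with a learned graph similarity.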