IEEE Trans Neural Netw Learn Syst. 2024 Feb;35(2):2473-2483. doi: 10.1109/TNNLS.2022.3190289. Epub 2024 Feb 5.
Single-cell RNA sequencing (scRNA-seq) technology is well known for providing a cell-level view that helps capture cellular heterogeneity. This capability has advanced the field of genomics by enabling fine-grained differentiation of cell types. However, the properties of single-cell datasets, such as frequent dropout events, noise, and high dimensionality, remain a research challenge in the single-cell field. To use single-cell data more efficiently and to better explore the heterogeneity among cells, this article proposes scGAC, a consensus-guided model based on graph autoencoders (GAEs). The data are first preprocessed into multiple top-level feature datasets. Feature learning is then performed with GAEs to generate new feature matrices, followed by similarity learning based on distance fusion methods. The learned similarity matrices are fed back to the GAEs to guide their feature learning process. Finally, these steps are iterated to integrate the final consistent similarity matrix, which is then used for related downstream analyses. The scGAC model can accurately identify critical features and effectively preserve the internal structure of the data, which can further improve the accuracy of cell type identification.
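The iterative loop described in the abstract (multiple feature views, per-view feature learning, distance-based similarity fusion, feedback of the fused similarity into the next round) can be illustrated with a minimal NumPy sketch. This is not the scGAC implementation. The abstract gives no architectural details, so the sketch makes several assumptions: the "top-level feature datasets" are taken to be the highest-variance genes at a few cut-offs, the GAE encoder is replaced by a much simpler stand-in (graph propagation followed by a truncated SVD), and fusion is a plain average of Gaussian-kernel similarities. All function names and parameters are hypothetical.

```python
import numpy as np

def top_variance_views(X, sizes):
    """Assumed preprocessing: one view per size, keeping the highest-variance genes."""
    order = np.argsort(X.var(axis=0))[::-1]
    return [X[:, order[:k]] for k in sizes]

def pairwise_sq_dists(Z):
    """Squared Euclidean distances between the rows of Z."""
    sq = (Z ** 2).sum(axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    return np.maximum(D, 0.0)  # clip small negative values from round-off

def similarity_from_embedding(Z):
    """Gaussian-kernel similarity; bandwidth is the median positive distance."""
    D = pairwise_sq_dists(Z)
    positive = D[D > 0]
    sigma2 = np.median(positive) if positive.size else 1.0
    return np.exp(-D / sigma2)

def normalize_graph(S):
    """Symmetric normalization D^{-1/2} S D^{-1/2} of a similarity matrix."""
    inv_sqrt = 1.0 / np.sqrt(np.maximum(S.sum(axis=1), 1e-12))
    return S * inv_sqrt[:, None] * inv_sqrt[None, :]

def encode(view, S, dim):
    """Stand-in for a GAE encoder (NOT a trained autoencoder):
    propagate features over the graph, then project with a truncated SVD."""
    H = normalize_graph(S) @ view
    H = H - H.mean(axis=0)
    U, s, _ = np.linalg.svd(H, full_matrices=False)
    return U[:, :dim] * s[:dim]

def consensus_similarity(X, sizes=(50, 100, 200), dim=10, n_iter=5):
    """Alternate per-view feature learning and similarity fusion."""
    views = top_variance_views(X, sizes)
    S = np.eye(X.shape[0])  # first round: no graph information yet
    for _ in range(n_iter):
        embeddings = [encode(v, S, dim) for v in views]
        # Assumed fusion rule: average the per-view similarity matrices.
        S = np.mean([similarity_from_embedding(Z) for Z in embeddings], axis=0)
    return S

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.poisson(1.0, size=(40, 300)).astype(float)  # toy cells-by-genes counts
    S = consensus_similarity(np.log1p(X))
    print(S.shape)
```

The returned matrix could then be passed to a clustering method that accepts a precomputed affinity, such as spectral clustering, for cell type identification. A faithful reproduction would replace `encode` with trained GAEs and use the fusion rule defined in the paper.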