College of Computer Science and Technology, Harbin Engineering University, Harbin, 150001, Heilongjiang, China.
Comput Biol Med. 2024 Sep;179:108921. doi: 10.1016/j.compbiomed.2024.108921. Epub 2024 Jul 25.
Single-cell RNA sequencing (scRNA-seq) profiles gene expression in individual cells, capturing the characteristics of each cell and enabling research at the cellular level. However, scRNA-seq data pose several challenges for analysis: dimensionality reduction of massive datasets, technical noise, and the visualization of cell-type clustering. In this paper, we propose scDMG, a new single-cell data analysis model based on a denoising autoencoder and multi-type graph neural networks, which learns cell-cell topology information and a latent representation of scRNA-seq data. scDMG introduces the zero-inflated negative binomial (ZINB) model into a denoising autoencoder (DAE) to reduce the dimensionality of the raw data and denoise it. It then integrates multiple types of graph neural networks as the encoder to further train on the preprocessed data, which helps it handle various types of scRNA-seq datasets, address dropout events, and produce a preliminary classification of the cells. Applying the t-SNE and PCA algorithms to the trained data and then invoking the Louvain algorithm further improves dimensionality reduction and clustering. Compared with other mainstream scRNA-seq clustering algorithms, scDMG achieves better results than state-of-the-art methods on various clustering performance metrics, along with better scalability and shorter runtime.
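To make the role of the ZINB model concrete, the following is a minimal sketch of the zero-inflated negative binomial log-likelihood that a ZINB-based autoencoder typically minimizes as its reconstruction loss. The abstract does not give scDMG's exact parameterization, so the mean/dispersion form used here and the names `nb_log_pmf`, `zinb_log_pmf`, `zinb_nll`, `mu`, `theta`, and `pi` are assumptions for illustration, not the authors' implementation.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial with
    mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under a zero-inflated negative binomial.

    pi (0 <= pi < 1) is the probability that an observed zero is a
    dropout rather than a draw from the negative binomial component.
    """
    nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero is either a dropout or a genuine zero count.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    # A positive count can only come from the negative binomial part.
    return math.log(1.0 - pi) + nb


def zinb_nll(counts, mu, theta, pi):
    """Negative log-likelihood of a list of counts sharing one set of
    parameters (hypothetical helper; a real model predicts mu, theta,
    and pi per cell and per gene)."""
    return -sum(zinb_log_pmf(x, mu, theta, pi) for x in counts)


if __name__ == "__main__":
    # Toy counts with many zeros, as is typical of scRNA-seq data.
    toy_counts = [0, 0, 0, 1, 0, 3, 0, 7, 0, 2]
    print(zinb_nll(toy_counts, mu=2.0, theta=1.5, pi=0.3))
```

In an autoencoder setting, the decoder would output `mu`, `theta`, and `pi` for every entry of the expression matrix, and the summed negative log-likelihood would be minimized by gradient descent; the pure-Python version above only illustrates the formula.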