Yuan Lin, Sun Shengguo, Jiang Yufeng, Zhang Qinhu, Ye Lan, Zheng Chun-Hou, Huang De-Shuang
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China.
Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), 3501 Daxue Road, 250353, Shandong, China.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae662.
Cell type annotation is a critical step in analyzing single-cell RNA sequencing (scRNA-seq) data. Many deep learning (DL)-based methods have been proposed to annotate cell types in scRNA-seq data and have achieved impressive results. However, these methods have several limitations. First, they do not fully exploit cell-to-cell differential features. Second, they are built on shallow features and lack flexibility in integrating high-order features in the data. Finally, the low-dimensional gene features may lead to overfitting in neural networks. To overcome these limitations, we propose scRGCL (cell type annotation of single-cell RNA-seq data using residual graph convolutional neural network with contrastive learning), a novel DL-based model for cell type annotation of scRNA-seq data. scRGCL consists mainly of three components: a residual graph convolutional neural network, contrastive learning, and weight freezing. The residual graph convolutional neural network extracts complex high-order features from the data. Contrastive learning helps the model learn meaningful cell-to-cell differential features. Weight freezing helps avoid overfitting and helps the model discover the impact of specific gene expression on cell type annotation. To verify the effectiveness of scRGCL, we compared its performance with six methods (three shallow learning algorithms and three state-of-the-art DL-based methods) on eight single-cell benchmark datasets from two species (seven human and one mouse). Experimental results show not only that scRGCL outperforms the competing methods but also that it generalizes well for cell type annotation. scRGCL is available at https://github.com/nathanyl/scRGCL.
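The two central ideas named in the abstract, a graph convolution with a residual (skip) connection and a contrastive objective over cell embeddings, can be illustrated with a minimal NumPy sketch. This is not the scRGCL implementation: the function names, the ring-shaped cell graph, the random-masking augmentation, and the use of the generic NT-Xent contrastive loss are all assumptions made for illustration, and weight freezing is not covered. The repository linked above holds the authors' actual code.

```python
import numpy as np


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, as in a standard GCN."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def residual_gcn_layer(h, a_norm, w):
    """One graph convolution with a skip connection: H + ReLU(A_norm @ H @ W)."""
    return h + np.maximum(a_norm @ h @ w, 0.0)


def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss; row i of z1 and row i of z2 form a positive pair."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    norms = np.maximum(np.linalg.norm(z, axis=1, keepdims=True), 1e-12)
    z = z / norms                                # cosine similarity via unit vectors
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)               # a sample is never its own negative
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), positives].mean()


# Toy data (hypothetical): 6 cells, 4 gene features, cells linked in a ring.
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))
adj = np.zeros((6, 6))
for i in range(6):
    j = (i + 1) % 6
    adj[i, j] = adj[j, i] = 1.0
a_norm = normalize_adjacency(adj)

# Two augmented views of the same cells, made by randomly masking features.
view1 = x * (rng.random(x.shape) > 0.2)
view2 = x * (rng.random(x.shape) > 0.2)

# Shared weights embed both views; the loss pulls matching cells together.
w = rng.normal(scale=0.1, size=(4, 4))
z1 = residual_gcn_layer(view1, a_norm, w)
z2 = residual_gcn_layer(view2, a_norm, w)
loss = nt_xent_loss(z1, z2)
print("contrastive loss:", loss)
```

The skip connection requires the layer's input and output widths to match, which is why `w` is square here. A real model would stack several such layers and train `w` by gradient descent in an autodiff framework.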