Dong Xishuang, Chowdhury Shanta, Victor Uboho, Li Xiangfang, Qian Lijun
IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):1492-1505. doi: 10.1109/TCBB.2022.3173587. Epub 2023 Apr 3.
Cell type identification from single-cell transcriptomic data is a common goal of single-cell RNA sequencing (scRNAseq) data analysis. Deep neural networks have been employed to identify cell types from scRNAseq data with high performance. However, training such identification models requires a large number of individual cells with accurate and unbiased type annotations. Unfortunately, labeling scRNAseq data is cumbersome and time-consuming because it involves manual inspection of marker genes. To overcome this challenge, we propose a semi-supervised learning model, "SemiRNet", that uses unlabeled scRNAseq cells together with a limited number of labeled scRNAseq cells to identify cell types. The proposed model is based on recurrent convolutional neural networks (RCNN) and includes a shared network, a supervised network, and an unsupervised network. The proposed model is evaluated on two large-scale single-cell transcriptomic datasets. The results show that the proposed model achieves encouraging performance by learning from a very limited number of labeled scRNAseq cells together with a large number of unlabeled scRNAseq cells.
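The three-part layout named in the abstract (a shared network feeding a supervised and an unsupervised network) can be illustrated with a minimal sketch. The abstract does not state the layer types, the unsupervised objective, or how the two losses are combined, so everything below is an assumption made for illustration: a single dense layer stands in for the RCNN-based shared network, the unsupervised branch is assumed to be a reconstruction loss, and the names `shared`, `supervised_loss`, `unsupervised_loss`, `semi_supervised_loss`, and `weight` are hypothetical. This is not the SemiRNet implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: genes per cell, hidden units, number of cell types.
n_genes, n_hidden, n_types = 50, 16, 4

# Randomly initialised, untrained parameters for the three parts.
W_shared = rng.normal(0, 0.1, (n_genes, n_hidden))  # shared network (dense stand-in, not an RCNN)
W_sup = rng.normal(0, 0.1, (n_hidden, n_types))     # supervised network: cell-type classifier
W_unsup = rng.normal(0, 0.1, (n_hidden, n_genes))   # unsupervised network: assumed decoder


def shared(x):
    """Map expression profiles to a representation used by both branches."""
    return np.tanh(x @ W_shared)


def supervised_loss(x, y):
    """Mean cross-entropy on labeled cells."""
    logits = shared(x) @ W_sup
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()


def unsupervised_loss(x):
    """Mean squared reconstruction error (an assumed objective); needs no labels."""
    recon = shared(x) @ W_unsup
    return ((recon - x) ** 2).mean()


def semi_supervised_loss(x_lab, y_lab, x_unlab, weight=1.0):
    """Supervised loss on labeled cells plus weighted unsupervised loss on all cells."""
    x_all = np.vstack([x_lab, x_unlab])
    return supervised_loss(x_lab, y_lab) + weight * unsupervised_loss(x_all)


# Synthetic data: few labeled cells, many unlabeled cells.
x_lab = rng.normal(size=(8, n_genes))
y_lab = rng.integers(0, n_types, size=8)
x_unlab = rng.normal(size=(64, n_genes))

loss = semi_supervised_loss(x_lab, y_lab, x_unlab)
print(f"combined loss: {loss:.4f}")
```

In this sketch the unlabeled cells influence the shared parameters only through the unsupervised term, which is the general way a semi-supervised model can benefit from cells without annotations; setting `weight=0.0` reduces it to purely supervised training on the labeled cells.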