College of Computer Science and Software Engineering, Shenzhen University, Guangdong 518057, China.
College of Future Technology, HKUST(GZ), Guangdong 510641, China.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae555.
The growing volume of single-cell RNA sequencing (scRNA-seq) data enables researchers to explore cellular heterogeneity and gene expression profiles, offering a high-resolution view of the transcriptome at the single-cell level. However, dropout events, which are common in scRNA-seq data, remain a challenge for downstream analysis. Although a number of methods have been developed to recover single-cell expression profiles, their performance may be limited because they do not fully exploit the inherent relations between genes. To address this issue, we propose scDTL, a deep transfer learning based approach for scRNA-seq data imputation that harnesses bulk RNA-sequencing information. We first employ a denoising autoencoder trained on bulk RNA-seq data as the initial imputation model, and then leverage a domain adaptation framework that transfers the knowledge learned by the bulk imputation model to the scRNA-seq learning task. In addition, scDTL runs a 1D U-Net denoising model in parallel to provide gene representations of varying granularity, capturing both coarse and fine features of the scRNA-seq data. Finally, we use a cross-channel attention mechanism to fuse the features learned by the transferred bulk imputation model and the U-Net model. In the evaluation, we conduct extensive experiments to demonstrate that scDTL can outperform other state-of-the-art methods in quantitative comparisons and downstream analyses.
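As an illustration only, the following minimal Python sketch shows two generic ingredients that the pipeline described above relies on: corrupting an expression matrix with artificial dropout (the kind of input a denoising model is trained to repair, and a common way to score imputation on held-out entries), and fusing two feature branches with softmax attention weights. It is not the scDTL implementation. The function names `simulate_dropout`, `fuse_channels`, and `masked_rmse`, the dropout rate, the toy matrix, and the use of a mean activation as the channel score (standing in for a learned scoring layer) are all assumptions made for this example.

```python
import math
import random


def simulate_dropout(matrix, rate, rng):
    """Zero each nonzero entry with probability `rate` (hypothetical helper).

    Returns the corrupted copy and a boolean mask of the dropped entries.
    """
    corrupted, mask = [], []
    for row in matrix:
        c_row, m_row = [], []
        for value in row:
            dropped = value > 0 and rng.random() < rate
            c_row.append(0.0 if dropped else value)
            m_row.append(dropped)
        corrupted.append(c_row)
        mask.append(m_row)
    return corrupted, mask


def fuse_channels(branch_a, branch_b):
    """Fuse two feature vectors with softmax weights, one weight per branch.

    Assumption: each branch is scored by its mean activation; a trained
    model would learn this score instead.
    """
    scores = [sum(branch_a) / len(branch_a), sum(branch_b) / len(branch_b)]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    w_a, w_b = exps[0] / total, exps[1] / total
    fused = [w_a * a + w_b * b for a, b in zip(branch_a, branch_b)]
    return fused, (w_a, w_b)


def masked_rmse(truth, imputed, mask):
    """RMSE restricted to the entries that were artificially dropped."""
    squared_error, count = 0.0, 0
    for t_row, i_row, m_row in zip(truth, imputed, mask):
        for t, i, m in zip(t_row, i_row, m_row):
            if m:
                squared_error += (t - i) ** 2
                count += 1
    return math.sqrt(squared_error / count) if count else 0.0


# Toy data (made up for the example): 2 cells x 3 genes.
rng = random.Random(0)
truth = [[5.0, 0.0, 3.0], [2.0, 4.0, 0.0]]
corrupted, mask = simulate_dropout(truth, rate=0.5, rng=rng)

# Two hypothetical branch outputs for one cell, e.g. one from a transferred
# bulk model and one from a U-Net style model.
fused, weights = fuse_channels([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

In this sketch the mask records exactly which values were hidden, so an imputation method can be scored with `masked_rmse` against the known ground truth; entries that were already zero are never masked, since they carry no recoverable signal.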