College of Computer Science, Nankai University, Tianjin 300350, China.
Centre for Bioinformatics and Intelligent Medicine, Nankai University, Tianjin 300350, China.
Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac398.
Single-cell sequencing technologies are widely used to uncover evolutionary relationships and differences among cells. Because dropout events can hinder the analysis, many imputation approaches for single-cell RNA-seq data have been proposed. However, these approaches usually suffer from over-smoothing, which may yield only limited improvement, or even a negative effect, in the downstream analysis of single-cell RNA-seq data. To address this problem, we propose SCDD, a novel two-stage diffusion-denoising method for large-scale single-cell RNA-seq imputation. In the first stage, diffusion, we perform an initial imputation: a direct strategy that fills potential dropout sites using the expression of similar cells. In the second stage, a joint model that integrates a graph convolutional neural network with a contractive autoencoder generates superposition states of similar cells, from which we restore the original states and remove the noise introduced by the diffusion. Experimental results indicate that SCDD effectively suppresses over-smoothing and markedly improves downstream analysis of single-cell RNA-seq data, including clustering and trajectory analysis.
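The diffusion stage described in the abstract, filling potential dropout sites from the expression of similar cells, can be illustrated with a minimal sketch. This is not the SCDD implementation: the function name `diffuse_impute`, the use of cosine similarity, the parameter `k`, and the choice to treat every zero entry as a potential dropout site are all assumptions made for illustration.

```python
import numpy as np

def diffuse_impute(X, k=2):
    """Fill zero entries of a cells-by-genes matrix with the mean expression
    of the k most similar cells. Illustrative only; not the SCDD method."""
    X = np.asarray(X, dtype=float)

    # Cosine similarity between cells (assumed similarity measure).
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # avoid division by zero for empty cells
    unit = X / norms
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)   # a cell is never its own neighbor

    # Indices of the k most similar cells for each cell.
    neighbors = np.argsort(-sim, axis=1)[:, :k]

    # Mean expression over each cell's neighbors, shape (cells, genes).
    neighbor_mean = X[neighbors].mean(axis=1)

    # Assumption: every zero is a potential dropout site.
    imputed = X.copy()
    mask = X == 0
    imputed[mask] = neighbor_mean[mask]
    return imputed

# Toy example: 4 cells, 2 genes.
X = np.array([[4, 0],
              [4, 2],
              [0, 5],
              [1, 5]])
out = diffuse_impute(X, k=1)
```

Observed (non-zero) values are left untouched; only zero entries receive values borrowed from neighboring cells. Averaging over neighbors in this way is what can introduce over-smoothing, which the abstract's second stage (the graph convolutional network combined with a contractive autoencoder) is meant to counteract. That stage is not sketched here.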