Huang Yixiang, Jiang Hao, Ching Wai-Ki, Shen Dong
Department of Information and Computing Sciences, School of Mathematics, Renmin University of China, No. 59 Zhongguancun Street, Haidian District, Beijing 100872, China.
Department of Mathematics, The University of Hong Kong, Pokfulam Road, Hong Kong.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf346.
Single-cell transcriptomics characterizes gene expression profiles at the single-cell level, offering an unprecedented opportunity to understand cellular systems. As a fundamental task in single-cell data analysis, cell clustering is central to identifying cellular heterogeneity and therefore affects downstream analyses. Several deep learning methods have been proposed for clustering single-cell RNA sequencing (scRNA-seq) data. However, their large parameter space makes these methods sensitive to parameter settings. To leverage the strong capabilities of deep learning in capturing complex structures in single-cell data while ensuring algorithmic robustness, we propose a contrastive ensemble learning method named scRECL for scRNA-seq data clustering. In our approach, Siamese neural networks are trained under various $k$-nearest-neighbor partitions to obtain low-dimensional embeddings of the scRNA-seq data. Multiplex graphs used in representative element selection help filter out noisy and redundant cells. Contrastive ensemble learning is then performed to obtain an efficient and effective latent embedding, as well as a robust analysis of cellular heterogeneity in scRNA-seq data.
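The abstract does not say how the $k$-nearest-neighbor partitions are constructed or how they feed the Siamese networks. As a rough illustration only, the sketch below assumes Euclidean distance on a small expression matrix and treats each cell and its $k$ nearest neighbors as a candidate positive pair, repeated for several values of $k$. The function names `knn_pairs` and `multi_k_pairs`, the toy data, and the choice of distance are all hypothetical and are not taken from scRECL.

```python
import math


def knn_pairs(cells, k):
    """Return the set of (i, j) pairs where cell j is among the k nearest neighbors of cell i.

    Assumption: Euclidean distance on the raw feature vectors (not confirmed by the abstract).
    """
    pairs = set()
    for i, cell_i in enumerate(cells):
        # Rank every other cell by its distance to cell i.
        ranked = sorted(
            (j for j in range(len(cells)) if j != i),
            key=lambda j: math.dist(cell_i, cells[j]),
        )
        for j in ranked[:k]:
            pairs.add((i, j))
    return pairs


def multi_k_pairs(cells, ks):
    """Build one neighbor-pair set per value of k (hypothetical helper)."""
    return {k: knn_pairs(cells, k) for k in ks}


# Toy data: two well-separated groups of three "cells", each with two "genes".
toy_cells = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
]
pairs_by_k = multi_k_pairs(toy_cells, ks=[1, 2])
```

In a setup like this, each pair set could serve as the positive pairs for one contrastively trained network, with one network per value of $k$ and the resulting embeddings combined afterwards. How scRECL actually combines them, and how the multiplex graphs select representative cells, is not described in the abstract.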