ScCCL: Single-Cell Data Clustering Based on Self-Supervised Contrastive Learning.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2023 May-Jun;20(3):2233-2241. doi: 10.1109/TCBB.2023.3241129. Epub 2023 Jun 5.

Abstract

The growing maturity of single-cell RNA-sequencing (scRNA-seq) technology allows us to explore the heterogeneity of tissues, organisms, and complex diseases at the cellular level. Clustering is a central step in single-cell data analysis. However, the high dimensionality of scRNA-seq data, the ever-increasing number of cells, and unavoidable technical noise make clustering challenging. Motivated by the good performance of contrastive learning in multiple domains, we propose ScCCL, a novel self-supervised contrastive learning method for clustering scRNA-seq data. ScCCL first randomly masks the gene expression of each cell twice and adds a small amount of Gaussian noise, and then uses a momentum encoder structure to extract features from the augmented data. Contrastive learning is then applied in an instance-level contrastive learning module and a cluster-level contrastive learning module. Training yields a representation model that can efficiently extract high-order embeddings of single cells. We evaluated ScCCL on multiple public datasets using two metrics, the adjusted Rand index (ARI) and normalized mutual information (NMI). The results show that ScCCL improves clustering performance over the benchmark algorithms. Notably, because ScCCL does not depend on a specific type of data, it can also be helpful for clustering single-cell multi-omics data.
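The augmentation and momentum-encoder steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `augment` and `momentum_update`, the masking probability, the noise scale, and the momentum coefficient are all assumptions chosen for illustration, since the abstract does not give these values. The sketch also assumes that masking sets an expression value to zero and that noise is added to every entry.

```python
import numpy as np


def augment(x, mask_prob=0.2, noise_std=0.01, rng=None):
    """Return one augmented view of an expression matrix (cells x genes).

    Assumed behaviour: each entry is zeroed with probability `mask_prob`,
    then Gaussian noise with standard deviation `noise_std` is added.
    Both values are hypothetical defaults, not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(x.shape) >= mask_prob          # True where the value is kept
    noise = rng.normal(0.0, noise_std, size=x.shape)
    return x * keep + noise


def momentum_update(online_params, target_params, m=0.99):
    """Exponential moving average update commonly used for momentum encoders.

    target <- m * target + (1 - m) * online, applied parameter by parameter.
    The coefficient `m` is an assumed value.
    """
    return [m * t + (1.0 - m) * o for o, t in zip(online_params, target_params)]


# Toy data: 4 cells x 10 genes of synthetic counts (not real scRNA-seq data).
rng = np.random.default_rng(0)
cells = rng.poisson(2.0, size=(4, 10)).astype(float)

# "Masks the gene expression of each cell twice": two independent views.
view_a = augment(cells, rng=rng)
view_b = augment(cells, rng=rng)

# One momentum step on a single toy parameter vector.
updated = momentum_update([np.ones(3)], [np.zeros(3)], m=0.9)
```

In a contrastive setup of this kind, the two views of the same cell would typically serve as a positive pair, with one view passed through the online encoder and the other through the momentum-updated encoder. How ScCCL pairs the views and encoders is not stated in the abstract.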
