Contrastive self-supervised clustering of scRNA-seq data.

Affiliation

Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles, Brussels, Belgium.

Publication information

BMC Bioinformatics. 2021 May 27;22(1):280. doi: 10.1186/s12859-021-04210-8.

DOI:10.1186/s12859-021-04210-8
PMID:34044773
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8157426/
Abstract

BACKGROUND

Single-cell RNA sequencing (scRNA-seq) has emerged as a main strategy to study transcriptional activity at the cellular level. Clustering analysis is routinely performed on scRNA-seq data to explore, recognize or discover underlying cell identities. The high dimensionality of scRNA-seq data and its significant sparsity, accentuated by frequent dropout events that introduce false zero-count observations, make the clustering analysis computationally challenging. Even though multiple scRNA-seq clustering techniques have been proposed, there is no consensus on the best-performing approach. On a parallel research track, self-supervised contrastive learning recently achieved state-of-the-art results on image clustering and, subsequently, image classification.

RESULTS

We propose contrastive-sc, a new unsupervised learning method for scRNA-seq data that performs cell clustering. The method consists of two consecutive phases: first, an artificial neural network learns an embedding for each cell through a representation training phase. The embedding is then clustered in the second phase with a general clustering algorithm (i.e. KMeans or Leiden community detection). The proposed representation training phase is a new adaptation of the self-supervised contrastive learning framework, initially proposed for image processing, to scRNA-seq data. contrastive-sc has been compared with ten state-of-the-art techniques. A broad experimental study has been conducted on both simulated and real-world datasets, assessing multiple external and internal clustering performance metrics (i.e. ARI, NMI, Silhouette, Calinski scores). Our experimental analysis shows that contrastive-sc compares favorably with state-of-the-art methods on both simulated and real-world datasets.
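To make the representation training phase more concrete, the sketch below shows one common form of a contrastive objective. The abstract does not state which augmentation or loss contrastive-sc uses, so both choices here are assumptions: two views of each cell are created by randomly zeroing gene values (hypothetical helper `mask_genes`), and the views are scored with the NT-Xent loss known from SimCLR-style image frameworks (hypothetical helper `nt_xent_loss`). In a real pipeline the views would first pass through the neural network encoder; here the loss is applied directly to the masked vectors to keep the example short.

```python
import math
import random


def mask_genes(cell, drop_rate, rng):
    """Return an augmented view of one cell with a random subset of genes set to zero."""
    return [0.0 if rng.random() < drop_rate else value for value in cell]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """NT-Xent loss, where view_a[i] and view_b[i] are two views of the same cell."""
    z = list(view_a) + list(view_b)
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same cell
        # Similarities to every other vector; the vector itself is excluded.
        logits = [cosine(z[i], z[j]) / temperature for j in range(2 * n) if j != i]
        pos_logit = cosine(z[i], z[positive]) / temperature
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_norm = top + math.log(sum(math.exp(value - top) for value in logits))
        total += log_norm - pos_logit
    return total / (2 * n)


# Toy usage: 4 cells with 6 "genes" each (assumed data, not from the paper).
rng = random.Random(0)
cells = [[rng.random() for _ in range(6)] for _ in range(4)]
views_a = [mask_genes(cell, 0.2, rng) for cell in cells]
views_b = [mask_genes(cell, 0.2, rng) for cell in cells]
loss = nt_xent_loss(views_a, views_b)
```

Minimizing a loss of this kind pulls the two views of the same cell together and pushes views of different cells apart, which is what lets a general algorithm such as KMeans or Leiden separate the embedding in the second phase.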

CONCLUSION

On average, our method identifies well-defined clusters in close agreement with ground truth annotations. Our method is computationally efficient, being fast to train and having a limited memory footprint. contrastive-sc maintains good performance when only a fraction of input cells is provided and is robust to changes in hyperparameters or network architecture. The decoupling between the creation of the embedding and the clustering phase allows the flexibility to choose a suitable clustering algorithm (i.e. KMeans when the number of expected clusters is known, Leiden otherwise) or to integrate the embedding with other existing techniques.
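Agreement with ground truth annotations is typically quantified with an external metric such as the ARI (adjusted Rand index) named in the Results. As an illustration only, the following sketch computes the ARI from scratch using the standard pair-counting formula; the function name `adjusted_rand_index` and the toy labels are assumptions for this example, and scikit-learn's `adjusted_rand_score` offers an equivalent, well-tested implementation.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """ARI between two flat labelings of the same cells (1.0 means identical partitions)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same cells")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:  # fewer than two cells: nothing to compare
        return 1.0
    # Count pairs of cells grouped together in both, in truth only, and in prediction only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = together_true * together_pred / total_pairs  # agreement expected by chance
    max_index = (together_true + together_pred) / 2
    if max_index == expected:  # degenerate case, e.g. both labelings are one cluster
        return 1.0
    return (together_both - expected) / (max_index - expected)


# Toy usage: cluster ids may be renamed without changing the score.
annotations = ["T cell", "T cell", "B cell", "B cell"]
clusters = [1, 1, 0, 0]
score = adjusted_rand_index(annotations, clusters)
```

Because the ARI depends only on which cells are grouped together, it is unaffected by how the clustering algorithm numbers its clusters, so the same function can score KMeans or Leiden output against cell-type annotations.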