Deep soft K-means clustering with self-training for single-cell RNA sequence data.

Authors

Chen Liang, Wang Weinan, Zhai Yuyao, Deng Minghua

Affiliations

School of Mathematical Sciences, Peking University, Beijing 100871, China.

Mathematical and Statistical Institute, Northeast Normal University, Changchun 130024, China.

Publication information

NAR Genom Bioinform. 2020 May 25;2(2):lqaa039. doi: 10.1093/nargab/lqaa039. eCollection 2020 Jun.

DOI: 10.1093/nargab/lqaa039
PMID: 33575592
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7671315/
Abstract

Single-cell RNA sequencing (scRNA-seq) allows researchers to study cell heterogeneity at the cellular level. A crucial step in analyzing scRNA-seq data is to cluster cells into subpopulations to facilitate downstream analysis. However, frequent dropout events and the increasing size of scRNA-seq datasets make it challenging to cluster such high-dimensional, sparse and massive transcriptional expression profiles. Although some existing deep learning-based clustering algorithms for single cells combine dimensionality reduction with clustering, they either ignore the distance and affinity constraints between similar cells or impose additional assumptions on the latent space, such as a Gaussian mixture distribution, and so fail to learn a cluster-friendly low-dimensional space. In this paper, we therefore use a denoising autoencoder to characterize scRNA-seq data and propose a soft self-training K-means algorithm to cluster the cell population in the learned latent space. The self-training procedure effectively aggregates similar cells and yields a more cluster-friendly latent space. Our method, called 'scziDesk', alternates between data compression, data reconstruction and soft clustering, and the results show excellent compatibility and robustness on both simulated and real data. Moreover, the method scales well with the number of cells on large-scale datasets.
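
To make the idea of soft K-means with self-training more concrete, here is a minimal NumPy sketch. It is not the scziDesk implementation: it omits the denoising autoencoder and works on latent vectors that are assumed to be given. The function names, the softness parameter `alpha`, and the sharpening rule (square the soft assignments, divide by the soft cluster size, renormalize, as used in DEC-style deep clustering) are all illustrative assumptions and do not come from the abstract.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero for a starved cluster


def soft_assign(z, centers, alpha=1.0):
    """Soft K-means responsibilities: row-wise softmax of -alpha * squared distance."""
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    logits = -alpha * d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    q = np.exp(logits)
    return q / q.sum(axis=1, keepdims=True)


def sharpen(q):
    """Assumed self-training target: emphasize confident assignments.

    Squares q, divides by the soft cluster size, then renormalizes each row.
    """
    w = q ** 2 / (q.sum(axis=0, keepdims=True) + EPS)
    return w / w.sum(axis=1, keepdims=True)


def soft_kmeans_self_training(z, k, n_iter=50, alpha=1.0, seed=0):
    """Alternate soft assignment, target sharpening, and weighted center updates.

    z is an (n_cells, n_latent) array of latent vectors (hypothetical input;
    in a full pipeline it would come from an encoder).
    """
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct cells.
    centers = z[rng.choice(len(z), size=k, replace=False)]
    for _ in range(n_iter):
        q = soft_assign(z, centers, alpha)
        p = sharpen(q)
        # Each center becomes the p-weighted mean of all cells.
        centers = (p.T @ z) / (p.sum(axis=0)[:, None] + EPS)
    return soft_assign(z, centers, alpha), centers
```

Calling `soft_kmeans_self_training(z, k=2)` returns the soft assignment matrix and the cluster centers; taking `argmax` over each row of the assignment matrix gives hard labels. In a full deep clustering model the sharpened targets would typically also drive a loss that updates the encoder, which this sketch leaves out.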

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7671315/e28c4a903a70/lqaa039fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7671315/1537137b4d4c/lqaa039fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7671315/039674ac4f74/lqaa039fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7671315/4da29090605b/lqaa039fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7671315/ee1ea755179a/lqaa039fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7671315/f1694d7f81fc/lqaa039fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7671315/6b542a596461/lqaa039fig7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7671315/0e29360fe93e/lqaa039fig8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7671315/61ab9c07534e/lqaa039fig9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7671315/80f80915e8b4/lqaa039fig10.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe8/7671315/b8b45b6ce5d0/lqaa039fig11.jpg
