scTPC: a novel semisupervised deep clustering model for scRNA-seq data.

Affiliations

School of Mathematical Sciences, Shenzhen University, Shenzhen, Guangdong 518000, China.

School of Mathematics, Renmin University of China, Haidian District, Beijing 100872, China.

Publication information

Bioinformatics. 2024 May 2;40(5). doi: 10.1093/bioinformatics/btae293.

DOI:10.1093/bioinformatics/btae293
PMID:38684178
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11091743/
Abstract

MOTIVATION

Continuous advancements in single-cell RNA sequencing (scRNA-seq) technology have enabled researchers to further explore the study of cell heterogeneity, trajectory inference, identification of rare cell types, and neurology. Accurate scRNA-seq data clustering is crucial in single-cell sequencing data analysis. However, the high dimensionality, sparsity, and presence of "false" zero values in the data can pose challenges to clustering. Furthermore, current unsupervised clustering algorithms have not effectively leveraged prior biological knowledge, making cell clustering even more challenging.

RESULTS

This study investigates a semisupervised clustering model called scTPC, which integrates the triplet constraint, pairwise constraint, and cross-entropy constraint based on deep learning. Specifically, the model begins by pretraining a denoising autoencoder based on a zero-inflated negative binomial distribution. Deep clustering is then performed in the learned latent feature space using triplet constraints and pairwise constraints generated from partially labeled cells. Finally, to address imbalanced cell-type datasets, a weighted cross-entropy loss is introduced to optimize the model. Experiments on 10 real scRNA-seq datasets and five simulated datasets demonstrate that scTPC achieves accurate clustering.
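
The loss terms named above can be illustrated with a minimal pure-Python sketch. The abstract does not give the exact formulas used in scTPC, so everything here is an assumption chosen for illustration: the zero-inflated negative binomial (ZINB) likelihood is the standard textbook form, the triplet term is a common hinge loss on squared distances, the pairwise term is a common must-link/cannot-link form on soft cluster assignments, and the class weights are simple inverse frequencies. All function names (`zinb_nll`, `triplet_loss`, `pairwise_loss`, `class_weights`, `weighted_cross_entropy`) are hypothetical and do not come from the scTPC code base.

```python
import math
from collections import Counter


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of one count x under a standard ZINB model.

    mu > 0 is the mean, theta > 0 the dispersion, 0 <= pi < 1 the dropout
    (extra-zero) probability. Illustrative form, not taken from scTPC.
    """
    log_nb = (math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
              + theta * math.log(theta / (theta + mu))
              + x * math.log(mu / (theta + mu)))
    if x == 0:
        # A zero is either a dropout or a genuine NB zero.
        return -math.log(pi + (1.0 - pi) * math.exp(log_nb))
    return -(math.log(1.0 - pi) + log_nb)


def squared_distance(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss (assumed form): the anchor should be closer to the
    same-type cell than to the different-type cell by at least `margin`."""
    return max(0.0, squared_distance(anchor, positive)
               - squared_distance(anchor, negative) + margin)


def pairwise_loss(p_i, p_j, must_link, eps=1e-12):
    """Pairwise constraint on soft cluster assignments (assumed form).

    Must-link pairs are pushed toward the same cluster, cannot-link pairs
    toward different clusters.
    """
    dot = sum(u * v for u, v in zip(p_i, p_j))
    dot = min(max(dot, eps), 1.0 - eps)  # keep the logarithm finite
    return -math.log(dot) if must_link else -math.log(1.0 - dot)


def class_weights(labels):
    """Inverse-frequency weights so that rare cell types count more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


def weighted_cross_entropy(probs, labels, weights, eps=1e-12):
    """Mean cross-entropy over labeled cells, scaled by per-class weights."""
    total = sum(-weights[y] * math.log(max(p[y], eps))
                for p, y in zip(probs, labels))
    return total / len(labels)
```

As a usage example under the same assumptions, `class_weights([0, 0, 0, 1])` gives the minority class a weight of 2.0 and the majority class a weight of about 0.67, so a misclassified rare cell contributes roughly three times as much to `weighted_cross_entropy` as a misclassified common one. In a full model these terms would be combined with the autoencoder reconstruction loss and minimized by gradient descent in a deep learning framework; this sketch only evaluates them on plain lists.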

AVAILABILITY AND IMPLEMENTATION

scTPC is a Python-based algorithm, and the code is available from https://github.com/LF-Yang/Code or https://zenodo.org/records/10951780.
