Soft graph clustering for single-cell RNA sequencing data.

Author information

Xu Ping, Wang Pengfei, Ning Zhiyuan, Xiao Meng, Wu Min, Zhou Yuanchun

Affiliations

Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100083, China.

University of Chinese Academy of Sciences, Beijing, 100864, China.

Publication information

BMC Bioinformatics. 2025 Jul 25;26(1):195. doi: 10.1186/s12859-025-06231-z.

DOI:10.1186/s12859-025-06231-z
PMID:40713495
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12291377/
Abstract

BACKGROUND

Clustering analysis is fundamental to single-cell RNA sequencing (scRNA-seq) data analysis for elucidating cellular heterogeneity and diversity. Recent graph-based scRNA-seq clustering methods, particularly graph neural networks (GNNs), have made significant progress in tackling the high dimensionality, high sparsity, and frequent dropout events that lead to ambiguous cell population boundaries. However, one major challenge for GNN-based methods is their reliance on hard graph constructions derived from similarity matrices. These constructions cause difficulties when applied to scRNA-seq data for two reasons: (i) thresholding simplifies intercellular relationships into binary edges (0 or 1), which restricts the capture of continuous similarity features among cells and leads to significant information loss; (ii) hard graphs contain significant inter-cluster connections, which can confuse GNN methods that rely heavily on graph structure, potentially causing erroneous message propagation and biased clustering outcomes.
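The contrast between a hard and a soft graph can be illustrated with a small sketch. It is not the construction used by scSGC, which the abstract does not specify. The cosine similarity, the k-nearest-neighbour rule, the toy Poisson counts, and the function names `hard_knn_graph` and `soft_graph` are all assumptions chosen for illustration.

```python
import numpy as np

def cosine_similarity(X):
    """Pairwise cosine similarity between the rows (cells) of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)
    return Xn @ Xn.T

def hard_knn_graph(S, k):
    """Binary adjacency: keep the k most similar neighbours of each cell."""
    n = S.shape[0]
    S_no_self = S.copy()
    np.fill_diagonal(S_no_self, -np.inf)           # exclude self-loops
    idx = np.argsort(-S_no_self, axis=1)[:, :k]    # top-k neighbours per row
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(A, A.T)                      # symmetrise

def soft_graph(S):
    """Weighted adjacency: keep the continuous similarities as edge weights."""
    W = np.clip(S, 0.0, None)
    np.fill_diagonal(W, 0.0)
    return W

# Toy count matrix: 6 cells x 20 genes (illustrative data only).
rng = np.random.default_rng(0)
X = rng.poisson(1.0, size=(6, 20)).astype(float)

S = cosine_similarity(X)
A_hard = hard_knn_graph(S, k=2)   # entries are only 0 or 1
W_soft = soft_graph(S)            # entries vary continuously in [0, 1]
print(A_hard)
print(np.round(W_soft, 2))
```

In the hard graph, a neighbour that barely passes the cutoff and a near-identical neighbour get the same edge value of 1. The soft graph keeps that difference as an edge weight.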

RESULTS

To tackle these challenges, we introduce scSGC, a soft graph clustering method for single-cell RNA sequencing data, which aims to characterize continuous similarities among cells more accurately through non-binary edge weights, thereby mitigating the limitations of rigid data structures. The scSGC framework comprises three core components: (i) a zero-inflated negative binomial (ZINB)-based feature autoencoder designed to handle the sparsity and dropout issues in scRNA-seq data; (ii) a dual-channel cut-informed soft graph embedding module, constructed through deep graph-cut information, that captures continuous similarities between cells while preserving the intrinsic structure of scRNA-seq data; and (iii) an optimal transport-based clustering optimization module that achieves optimal delineation of cell populations while maintaining high biological relevance.
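To show what a ZINB-based reconstruction objective looks like, the sketch below evaluates the standard ZINB negative log-likelihood with mean `mu`, dispersion `theta`, and dropout probability `pi`. This parameterisation and the function names are assumptions. The abstract does not give the exact loss used in scSGC, and in an autoencoder these three parameters would be produced by the decoder for each cell and gene.

```python
import math
import numpy as np

# Vectorised log-gamma built from the standard library.
_lgamma = np.vectorize(math.lgamma, otypes=[float])

def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial (mean mu > 0)."""
    return (_lgamma(x + theta) - _lgamma(theta) - _lgamma(x + 1.0)
            + theta * np.log(theta / (theta + mu))
            + x * np.log(mu / (theta + mu)))

def zinb_nll(x, mu, theta, pi):
    """Summed negative log-likelihood of counts x under a ZINB model.

    A zero can come from dropout (probability pi) or from the negative
    binomial itself; a positive count can only come from the latter.
    """
    x = np.asarray(x, dtype=float)
    log_nb = nb_log_pmf(x, mu, theta)
    log_zero = np.log(pi + (1.0 - pi) * np.exp(log_nb))
    log_nonzero = np.log(1.0 - pi) + log_nb
    return -np.where(x == 0, log_zero, log_nonzero).sum()

# Illustrative counts for one gene across eight cells (many zeros).
counts = np.array([0, 0, 3, 0, 1, 0, 7, 0])
print(zinb_nll(counts, mu=2.0, theta=1.5, pi=0.3))
print(zinb_nll(counts, mu=2.0, theta=1.5, pi=0.0))  # plain NB, no dropout term
```

Setting `pi` to zero recovers the plain negative binomial likelihood. That makes it easy to check how much the zero-inflation term changes the fit on sparse counts.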

CONCLUSION

By integrating dual-channel cut-informed soft graph representation learning, a ZINB-based feature autoencoder, and optimal transport-driven clustering optimization, scSGC effectively overcomes the challenges associated with traditional hard graph constructions in GNN methods. Extensive experiments across ten datasets demonstrate that scSGC outperforms 13 state-of-the-art clustering models in clustering accuracy, cell type annotation, and computational efficiency. These results highlight its substantial potential to advance scRNA-seq data analysis and deepen our understanding of cellular heterogeneity.
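The abstract does not describe how the optimal transport step is formulated. One common way to use optimal transport for clustering is to rescale a cell-by-cluster score matrix with Sinkhorn-Knopp iterations so that the clusters receive equal total mass. The sketch below follows that generic recipe. The uniform cluster marginal, the temperature `epsilon`, and the name `sinkhorn_assignments` are assumptions, not details from the paper.

```python
import numpy as np

def sinkhorn_assignments(scores, epsilon=0.5, n_iters=50):
    """Soft cluster assignments with approximately balanced cluster sizes.

    scores: (n_cells, n_clusters) array of cell-to-cluster similarities.
    Returns a matrix whose rows sum to 1.
    """
    n, k = scores.shape
    Q = np.exp((scores - scores.max()) / epsilon)   # positive kernel
    Q /= Q.sum()
    r = np.full(n, 1.0 / n)   # every cell carries the same mass
    c = np.full(k, 1.0 / k)   # assumption: every cluster receives equal mass
    for _ in range(n_iters):
        Q *= (c / Q.sum(axis=0))[None, :]   # match column (cluster) marginals
        Q *= (r / Q.sum(axis=1))[:, None]   # match row (cell) marginals
    return Q / Q.sum(axis=1, keepdims=True)

def softmax(scores, epsilon=0.5):
    """Row-wise softmax with the same temperature, for comparison."""
    z = np.exp((scores - scores.max(axis=1, keepdims=True)) / epsilon)
    return z / z.sum(axis=1, keepdims=True)

# Degenerate toy case: every cell scores highest on cluster 0.
scores = np.zeros((6, 3))
scores[:, 0] = 2.0

P_softmax = softmax(scores)              # all cells pile into cluster 0
P_ot = sinkhorn_assignments(scores)      # mass is spread across clusters
print(np.round(P_softmax, 3))
print(np.round(P_ot, 3))
```

The toy case is deliberately extreme. A plain softmax assigns every cell to the same cluster, while the balanced transport plan removes the shared bias toward cluster 0.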

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9230/12291377/270074d426c6/12859_2025_6231_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9230/12291377/511fa8f8f8ee/12859_2025_6231_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9230/12291377/cd29af2a6212/12859_2025_6231_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9230/12291377/914cd299f54b/12859_2025_6231_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9230/12291377/0617479ace57/12859_2025_6231_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9230/12291377/f79cc33c9f42/12859_2025_6231_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9230/12291377/0c72bac0efe8/12859_2025_6231_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9230/12291377/a1cfcd5a39a4/12859_2025_6231_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9230/12291377/c16d7bebad20/12859_2025_6231_Fig9_HTML.jpg
