scDRMAE: integrating masked autoencoder with residual attention networks to leverage omics feature dependencies for accurate cell clustering.

Author affiliations

Department of Computer Science and Technology, College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China.

Department of Computer Science and Technology, Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China.

Publication information

Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae599.

DOI: 10.1093/bioinformatics/btae599
PMID: 39404795
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11513018/
Abstract

MOTIVATION

Cell clustering is foundational for analyzing the heterogeneity of biological tissues using single-cell sequencing data. With the maturation of single-cell multi-omics sequencing technologies, we can integrate multiple omics data to perform cell clustering, thereby overcoming the limitations of insufficient information from single omics data. Existing methods for cell clustering often only consider the differences in data patterns during the analysis of multi-omics data, but the dependencies between omics features of different cell types also significantly influence cell clustering. Moreover, the high dropout rates in scRNA-seq and scATAC-seq data can impact the performance of cell clustering.

RESULTS

We propose scDRMAE, a cell clustering model based on a masked autoencoder. Through its masking mechanism, scDRMAE learns the relationships between different features and imputes false zeros caused by dropout events. To differentiate the importance of the various omics data in cell clustering, we dynamically adjust the weight of each omics layer through an attention mechanism. Finally, we apply the K-means algorithm to the fused multi-omics data. On 15 commonly used multi-omics datasets, our method outperforms other computational methods on multiple clustering metrics. When datasets exhibit varying degrees of dropout noise, it also performs better and more stably than other methods on multiple metrics. Moreover, by analyzing the cell clusters identified by scDRMAE, we found several biologically significant biomarkers that have been validated, further confirming the effectiveness of scDRMAE in cell clustering from a biological perspective.
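The pipeline described above (mask features, weight each omics layer by attention, fuse, then run K-means) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the fixed attention scores, the toy "RNA" and "ATAC" embeddings, and the deterministic farthest-point initialisation of K-means are all assumptions made for illustration, and the autoencoder training that would learn the embeddings and scores is omitted.

```python
import math
import random


def mask_features(x, mask_ratio, rng):
    """Zero out a random subset of features; return (masked copy, masked indices)."""
    n_mask = int(len(x) * mask_ratio)
    idx = sorted(rng.sample(range(len(x)), n_mask))
    masked = list(x)
    for i in idx:
        masked[i] = 0.0
    return masked, idx


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(embeddings, scores):
    """Weighted sum of per-omics embeddings for one cell.

    embeddings: equal-length vectors, one per omics layer.
    scores: one unnormalised score per omics layer (hypothetical fixed values
            here; a trained model would produce them).
    """
    weights = softmax(scores)
    dim = len(embeddings[0])
    return [sum(w * e[d] for w, e in zip(weights, embeddings)) for d in range(dim)]


def kmeans(points, k, n_iter=10):
    """Lloyd's algorithm with farthest-point initialisation (chosen so the
    toy example is reproducible); returns one cluster label per point."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center for each point.
        for i, p in enumerate(points):
            labels[i] = min(range(k), key=lambda c: math.dist(p, centers[c]))
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy data: two omics embeddings (dimension 2) per cell, two separated groups.
rng = random.Random(0)
cells = []
for center in (0.0, 10.0):
    for _ in range(5):
        rna = [center + rng.uniform(-0.5, 0.5) for _ in range(2)]
        atac = [center + rng.uniform(-0.5, 0.5) for _ in range(2)]
        cells.append((rna, atac))

# Masking step, shown on one cell only (a model would be trained to
# reconstruct the masked entries).
masked_rna, masked_idx = mask_features(cells[0][0], mask_ratio=0.5, rng=rng)

# Fuse the two omics layers per cell, then cluster the fused vectors.
fused = [attention_fuse([rna, atac], scores=[1.0, 0.0]) for rna, atac in cells]
labels = kmeans(fused, k=2)
print(labels)
```

In this sketch the scores [1.0, 0.0] simply give the first omics layer more weight than the second; in a trained model the weights would vary with the data rather than being set by hand.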

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a1f/11513018/3f7ac15c5ea4/btae599f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a1f/11513018/d84793cbcf74/btae599f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a1f/11513018/cb5a263439b6/btae599f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a1f/11513018/8702d5b06f26/btae599f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a1f/11513018/58e90d5ac811/btae599f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a1f/11513018/5ba84fe7a753/btae599f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a1f/11513018/240acedd2166/btae599f7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a1f/11513018/87f7d9a2238b/btae599f8.jpg
