scBGEDA: deep single-cell clustering analysis via a dual denoising autoencoder with bipartite graph ensemble clustering.

Affiliations

School of Artificial Intelligence, Hebei University of Technology, Tianjin, China.

School of Artificial Intelligence, Jilin University, Jilin, China.

Publication information

Bioinformatics. 2023 Feb 14;39(2). doi: 10.1093/bioinformatics/btad075.

DOI: 10.1093/bioinformatics/btad075

PMID: 36734596

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9925104/

Abstract

MOTIVATION

Single-cell RNA sequencing (scRNA-seq) is an increasingly popular technique for transcriptomic analysis of gene expression at the single-cell level. Cell-type clustering is the first crucial task in scRNA-seq data analysis: it facilitates accurate identification of cell types and the study of the characteristics of their transcripts. Recently, several computational models based on deep autoencoders and ensemble clustering have been developed to analyze scRNA-seq data. However, current deep autoencoders do not adequately learn the latent representations of scRNA-seq data, and obtaining consensus partitions from these feature representations remains under-explored.

RESULTS

To address this challenge, we propose scBGEDA, a single-cell deep clustering model that combines a dual denoising autoencoder with bipartite graph ensemble clustering to identify specific cell populations in single-cell transcriptome profiles. First, a single-cell dual denoising autoencoder network projects the data into a compressed low-dimensional space and learns feature representations by jointly optimizing the zero-inflated negative binomial (ZINB) reconstruction loss and the denoising reconstruction loss. Then, a bipartite graph ensemble clustering algorithm exploits the relationships between cells and the learned latent embedded space by means of a graph-based consensus function. Multiple comparison experiments were conducted on 20 scRNA-seq datasets from different sequencing platforms using a variety of clustering metrics. The experimental results indicated that scBGEDA outperforms other state-of-the-art methods on these datasets and demonstrated its scalability to large-scale scRNA-seq datasets. Moreover, scBGEDA was able to identify cell-type-specific marker genes and provide functional genomic analysis by quantifying the influence of genes on cell clusters, bringing new insights into identifying cell types and characterizing scRNA-seq data from different perspectives.
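As a rough illustration of the zero-inflated negative binomial reconstruction loss mentioned above, the following sketch evaluates the standard ZINB negative log-likelihood with NumPy. It is not taken from the scBGEDA source code: the function names `zinb_log_likelihood` and `zinb_nll`, the parameter names `mu`, `theta`, and `pi`, and the stabilizing constant `eps` are assumptions made for this example, and the abstract does not say how this term is weighted against the denoising reconstruction loss.

```python
import math
import numpy as np

# Element-wise log-gamma built from the standard library (avoids a SciPy dependency).
_lgamma = np.vectorize(math.lgamma, otypes=[float])


def zinb_log_likelihood(x, mu, theta, pi, eps=1e-10):
    """Element-wise log-likelihood of counts x under ZINB(mu, theta, pi).

    x:     observed counts (non-negative)
    mu:    negative binomial mean (> 0)
    theta: negative binomial dispersion (> 0)
    pi:    probability of an extra "dropout" zero, in [0, 1)
    All names here are illustrative assumptions, not the authors' API.
    """
    x, mu, theta, pi = np.broadcast_arrays(
        *(np.asarray(a, dtype=float) for a in (x, mu, theta, pi))
    )
    log_theta_ratio = np.log(theta + eps) - np.log(theta + mu + eps)
    log_mu_ratio = np.log(mu + eps) - np.log(theta + mu + eps)

    # Log-probability of x under the plain negative binomial component.
    log_nb = (
        _lgamma(x + theta) - _lgamma(theta) - _lgamma(x + 1.0)
        + theta * log_theta_ratio
        + x * log_mu_ratio
    )
    # A zero can come from the dropout component or from the NB itself.
    log_zero = np.log(pi + (1.0 - pi) * np.exp(theta * log_theta_ratio) + eps)
    # A positive count can only come from the NB component.
    log_nonzero = np.log(1.0 - pi + eps) + log_nb
    return np.where(x < 0.5, log_zero, log_nonzero)


def zinb_nll(x, mu, theta, pi):
    """Mean negative log-likelihood, usable as a reconstruction loss value."""
    return float(-np.mean(zinb_log_likelihood(x, mu, theta, pi)))
```

In a ZINB-based autoencoder, `mu`, `theta`, and `pi` would typically be produced by the decoder for every cell and gene, and a value like `zinb_nll(counts, mu, theta, pi)` would be minimized during training; whether scBGEDA parameterizes its decoder exactly this way should be checked against the published code.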

AVAILABILITY AND IMPLEMENTATION

The source code of scBGEDA is available at https://github.com/wangyh082/scBGEDA. The software and the supporting data can be downloaded from https://figshare.com/articles/software/scBGEDA/19657911.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
