scMUSCL: multi-source transfer learning for clustering scRNA-seq data.

Authors

Khoeini Arash, Sar Funda, Lin Yen-Yi, Collins Colin, Ester Martin

Affiliations

School of Computing Science, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada.

Vancouver Prostate Centre, Vancouver, British Columbia V6H 3Z6, Canada.

Publication information

Bioinformatics. 2025 May 6;41(5). doi: 10.1093/bioinformatics/btaf137.

DOI: 10.1093/bioinformatics/btaf137
PMID: 40152244
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12065430/

Abstract

MOTIVATION

Single-cell RNA sequencing (scRNA-seq) analysis relies heavily on effective clustering to facilitate numerous downstream applications. Although several machine learning methods have been developed to enhance single-cell clustering, most are fully unsupervised and overlook the rich repository of annotated datasets available from previous single-cell experiments. Since cells are inherently high-dimensional entities, unsupervised clustering can often result in clusters that lack biological relevance. Leveraging annotated scRNA-seq datasets as a reference can significantly enhance clustering performance, enabling the identification of biologically meaningful clusters in target datasets.

RESULTS

In this article, we propose Single Cell MUlti-Source CLustering (scMUSCL), a novel transfer learning method designed to identify cell clusters in a target dataset by leveraging knowledge from multiple annotated reference datasets. scMUSCL employs a deep neural network to extract domain- and batch-invariant cell representations, effectively addressing discrepancies across various source datasets and between source and target datasets within the new representation space. Unlike existing methods, scMUSCL does not require prior knowledge of the number of clusters in the target dataset and eliminates the need for batch correction between source and target datasets. We conduct extensive experiments using 20 real-life datasets, demonstrating that scMUSCL consistently outperforms existing unsupervised and transfer learning-based methods. Furthermore, our experiments show that scMUSCL benefits from multiple source datasets as learning references and accurately estimates the number of clusters.
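
The general idea of using several annotated reference datasets to guide cluster assignment in an unlabeled target dataset can be illustrated with a toy sketch. This is not the scMUSCL algorithm: it omits the deep neural network, the domain- and batch-invariant representation learning, and the estimation of the number of clusters described above. It only pools labeled cells from multiple sources into per-cell-type centroids and assigns each target cell to the nearest one. All names (`class_centroids`, `assign_to_nearest`) and the two-dimensional example data are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict
from math import dist


def class_centroids(sources):
    """Pool labeled cells from several source datasets and average them per cell type.

    `sources` is a list of (cells, labels) pairs, where `cells` is a list of
    equal-length expression vectors and `labels` the matching annotations.
    """
    sums = {}
    counts = defaultdict(int)
    for cells, labels in sources:
        for vec, label in zip(cells, labels):
            if label not in sums:
                sums[label] = [0.0] * len(vec)
            sums[label] = [s + v for s, v in zip(sums[label], vec)]
            counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def assign_to_nearest(target_cells, centroids):
    """Assign each unlabeled target cell to the closest source-derived centroid."""
    return [
        min(centroids, key=lambda label: dist(cell, centroids[label]))
        for cell in target_cells
    ]


# Hypothetical toy data: two annotated sources and one unlabeled target.
source_a = ([[1.0, 0.0], [0.9, 0.1]], ["T cell", "T cell"])
source_b = ([[0.0, 1.0], [1.1, 0.0]], ["B cell", "T cell"])
target = [[0.95, 0.05], [0.1, 0.9]]

centroids = class_centroids([source_a, source_b])
assignments = assign_to_nearest(target, centroids)
print(assignments)
```

A nearest-centroid rule like this can only reproduce cell types already present in the sources and works directly on raw vectors, which is why methods such as the one described in the abstract learn a shared representation first and allow the number of target clusters to differ from the reference.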

AVAILABILITY AND IMPLEMENTATION

The Python implementation of scMUSCL is available at https://github.com/arashkhoeini/scMUSCL.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab90/12065430/8a8bf5b59a58/btaf137f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab90/12065430/b47be36d99a4/btaf137f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab90/12065430/76980f967a28/btaf137f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab90/12065430/9e7b831a9369/btaf137f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ab90/12065430/02ebcc930b0e/btaf137f5.jpg
