scEVOLVE: cell-type incremental annotation without forgetting for single-cell RNA-seq data.

Affiliations

School of Mathematical Sciences, Peking University, Beijing, China.

Huawei Technologies Co., Ltd., Beijing, China.

Publication information

Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae039.

DOI: 10.1093/bib/bbae039
PMID: 38366803
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10939389/
Abstract

Advances in single-cell RNA sequencing (scRNA-seq) technology have opened a new avenue for inspecting cellular heterogeneity at single-cell resolution. A crucial step is cell-type annotation, which is fundamental to any subsequent analysis in single-cell data mining. Recently, the community has seen a surge of automatic annotation methods for this task. However, these methods generally operate with a fixed total cell-type capacity, which significantly restricts an annotation system's capacity for continuous knowledge acquisition. Furthermore, building a unified scRNA-seq annotation system remains challenging because it must progressively expand its understanding of an ever-growing set of cell types derived from a continuous data stream. In response, this paper presents a novel and challenging annotation setting, cell-type incremental annotation, which aims to continually extend cell-type knowledge from incoming data. The difficulty is that samples in the data stream can be observed only once, which leads to catastrophic forgetting. To address this problem, we introduce scEVOLVE, an incremental annotation method built on contrastive sample replay combined with the principle of partition confidence maximization. Specifically, we first retain a portion of the old data and replay it in each subsequent training stage, and then establish a prototypical learning objective, as an alternative to cross-entropy, to mitigate the cell-type imbalance problem. To emulate a model trained jointly on the complete data, we introduce a cell-type decorrelation strategy that scatters the feature representations of each cell type uniformly. We designed the scEVOLVE framework to be simple and easy to integrate into most deep softmax-based single-cell annotation methods. Thorough experiments on a range of carefully constructed benchmarks consistently show that our method can incrementally learn many cell types over a long period, outperforming other strategies that fail quickly. To the best of our knowledge, this is the first attempt to propose and formulate an end-to-end algorithmic framework for this new, practical task. scEVOLVE is implemented in Python using the PyTorch machine-learning library and is freely available at https://github.com/aimeeyaoyao/scEVOLVE.
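
To make the replay idea in the abstract more concrete, the following is a minimal sketch in plain Python. It is not the scEVOLVE implementation and does not use the repository's code: it illustrates generic exemplar replay with a small per-cell-type memory, and it swaps in a simple nearest-prototype (class-mean) classifier in place of the paper's prototypical learning objective. The names `ReplayMemory`, `prototypes`, and `annotate`, the two-dimensional toy features, and the cell-type labels are all assumptions made for illustration.

```python
import random
from collections import defaultdict
from math import dist


class ReplayMemory:
    """Keeps at most `per_type` stored cells for every cell type seen so far."""

    def __init__(self, per_type, seed=0):
        self.per_type = per_type
        self.store = defaultdict(list)  # cell type -> list of feature vectors
        self.rng = random.Random(seed)

    def add_stage(self, cells):
        """cells: iterable of (feature_vector, cell_type) from one incoming batch."""
        by_type = defaultdict(list)
        for x, y in cells:
            by_type[y].append(x)
        for y, xs in by_type.items():
            pool = self.store[y] + xs
            self.rng.shuffle(pool)
            self.store[y] = pool[: self.per_type]  # same budget for every type

    def replay(self):
        """Old cells to mix into the next training stage."""
        return [(x, y) for y, xs in self.store.items() for x in xs]


def prototypes(cells):
    """Mean feature vector (prototype) of each cell type."""
    sums, counts = {}, defaultdict(int)
    for x, y in cells:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def annotate(x, protos):
    """Assign the cell type whose prototype is nearest in Euclidean distance."""
    return min(protos, key=lambda y: dist(x, protos[y]))


# Toy data (hypothetical): stage 1 has two cell types, stage 2 adds a third.
stage1 = [([0.0, 0.1], "T cell"), ([0.1, 0.0], "T cell"),
          ([5.0, 5.1], "B cell"), ([5.1, 5.0], "B cell")]
stage2 = [([0.0, 9.0], "NK cell"), ([0.2, 9.1], "NK cell")]

memory = ReplayMemory(per_type=1)
memory.add_stage(stage1)

# The stage-2 training set is the new cells plus the replayed old cells.
train2 = stage2 + memory.replay()
protos = prototypes(train2)
memory.add_stage(stage2)

print(annotate([0.05, 0.05], protos))  # a cell resembling the stage-1 T cells
print(annotate([0.1, 9.0], protos))    # a cell resembling the new NK cells
```

Because every cell type keeps the same memory budget, prototypes for old types survive into later stages even though the stage-1 data itself is no longer available. A real system would compute prototypes on learned embeddings rather than raw features.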

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/acdf/10939389/515c4c43a917/bbae039f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/acdf/10939389/9e16c014d4f0/bbae039f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/acdf/10939389/60bc08f9888b/bbae039f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/acdf/10939389/457d0b14cd02/bbae039f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/acdf/10939389/5166e94c33e9/bbae039f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/acdf/10939389/9b20f2eea861/bbae039f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/acdf/10939389/fe01eaeb2f02/bbae039f7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/acdf/10939389/f92c39638e4e/bbae039f8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/acdf/10939389/29d26d50ca08/bbae039f9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/acdf/10939389/e487cbbd9db6/bbae039f10.jpg
