Transformer for one stop interpretable cell type annotation.

Affiliations

Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China.

Publication information

Nat Commun. 2023 Jan 14;14(1):223. doi: 10.1038/s41467-023-35923-4.

DOI:10.1038/s41467-023-35923-4

PMID:36641532

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9840170/

Abstract

Consistent annotation transfer from reference dataset to query dataset is fundamental to the development and reproducibility of single-cell research. Compared with traditional annotation methods, deep learning based methods are faster and more automated. A series of useful single cell analysis tools based on autoencoder architecture have been developed but these struggle to strike a balance between depth and interpretability. Here, we present TOSICA, a multi-head self-attention deep learning model based on Transformer that enables interpretable cell type annotation using biologically understandable entities, such as pathways or regulons. We show that TOSICA achieves fast and accurate one-stop annotation and batch-insensitive integration while providing biologically interpretable insights for understanding cellular behavior during development and disease progressions. We demonstrate TOSICA's advantages by applying it to scRNA-seq data of tumor-infiltrating immune cells, and CD14+ monocytes in COVID-19 to reveal rare cell types, heterogeneity and dynamic trajectories associated with disease progression and severity.
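
The abstract describes a multi-head self-attention model whose tokens are biological entities such as pathways. The following is a minimal, illustrative sketch of that idea in plain Python, not TOSICA's actual implementation. The class-token readout, the use of its attention weights as pathway scores, the random (untrained) weights, and every function name here are assumptions made for illustration.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    """Random stand-in for weights that a real model would learn."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def pathway_tokens(expr, mask, gene_weights):
    """Masked embedding: each pathway token only sees its member genes."""
    dim = len(gene_weights[0])
    return [
        [sum(expr[g] * gene_weights[g][k] for g in members) for k in range(dim)]
        for members in mask
    ]

def multi_head_attention(tokens, n_heads, rng):
    """Scaled dot-product self-attention; returns per-head attention and output."""
    dim = len(tokens[0])
    head_dim = dim // n_heads
    attn_per_head, outputs = [], [[] for _ in tokens]
    for _ in range(n_heads):
        q = matmul(tokens, rand_matrix(dim, head_dim, rng))
        k = matmul(tokens, rand_matrix(dim, head_dim, rng))
        v = matmul(tokens, rand_matrix(dim, head_dim, rng))
        k_t = [list(col) for col in zip(*k)]
        scores = matmul(q, k_t)
        attn = [softmax([s / math.sqrt(head_dim) for s in row]) for row in scores]
        attn_per_head.append(attn)
        # Concatenate this head's output onto each token's output vector.
        for i, row in enumerate(matmul(attn, v)):
            outputs[i].extend(row)
    return attn_per_head, outputs

def cls_pathway_scores(expr, mask, dim=8, n_heads=2, seed=0):
    """Head-averaged attention from a class token to each pathway token."""
    rng = random.Random(seed)
    gene_weights = rand_matrix(len(expr), dim, rng)
    cls = [rng.gauss(0.0, 0.1) for _ in range(dim)]  # hypothetical class token
    tokens = [cls] + pathway_tokens(expr, mask, gene_weights)
    attn, _ = multi_head_attention(tokens, n_heads, rng)
    # Row 0 is the class token; columns 1.. are the pathway tokens.
    return [sum(h[0][j + 1] for h in attn) / n_heads for j in range(len(mask))]

# Toy cell with 5 genes and 3 hypothetical pathways (lists of gene indices).
expr = [2.0, 0.0, 1.5, 3.0, 0.5]
mask = [[0, 1], [2, 3], [1, 4]]
scores = cls_pathway_scores(expr, mask)
```

In a trained model the weights would be learned from a labeled reference dataset, and the per-pathway scores could then be inspected cell by cell. With the random weights used here the scores carry no biological meaning.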

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e620/9840638/33a58e9ea136/41467_2023_35923_Fig1_HTML.jpg
