scTransSort: Transformers for Intelligent Annotation of Cell Types by Gene Embeddings.

Affiliations

College of Computer Science and Technology, China University of Petroleum, Qingdao 266580, China.

Department of Artificial Intelligence, Faculty of Computer Science, Campus de Montegancedo, Polytechnical University of Madrid, Boadilla del Monte, 28660 Madrid, Spain.

Publication information

Biomolecules. 2023 Mar 28;13(4):611. doi: 10.3390/biom13040611.

DOI:10.3390/biom13040611
PMID:37189359
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10136153/
Abstract

Single-cell transcriptomics is rapidly advancing our understanding of the composition of complex tissues and biological cells, and single-cell RNA sequencing (scRNA-seq) holds great potential for identifying and characterizing the cell composition of complex tissues. Cell type identification from scRNA-seq data is limited mainly by manual annotation, which is time-consuming and irreproducible. As scRNA-seq technology scales to thousands of cells per experiment, the exponential increase in the number of cell samples makes manual annotation even more difficult. In addition, the sparsity of gene transcriptome data remains a major challenge. This paper applies the idea of the transformer to single-cell classification tasks based on scRNA-seq data. We propose scTransSort, a cell-type annotation method pretrained on single-cell transcriptomics data. scTransSort represents genes as gene expression embedding blocks, which reduces both the sparsity of the data used for cell type identification and the computational complexity. A distinguishing feature of scTransSort is its intelligent information extraction from unordered data: it automatically extracts valid cell-type features without requiring manually labeled features or additional references. In experiments on cells from 35 human and 26 mouse tissues, scTransSort achieved high accuracy and high performance in cell type identification, and demonstrated high robustness and generalization ability.
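The abstract does not say how the gene expression embedding blocks are built, so the following is only a minimal sketch of the general idea, assuming a patch-style scheme: a cell's sparse expression vector is cut into fixed-size blocks, and each block is linearly projected into a dense token that a transformer could consume. The function names, the block size, the embedding dimension, and the random projection weights are all hypothetical and are not taken from scTransSort.

```python
import random


def to_blocks(expression, block_size):
    """Split one cell's expression vector into fixed-size blocks, zero-padding the tail."""
    padded = list(expression)
    remainder = len(padded) % block_size
    if remainder:
        padded.extend([0.0] * (block_size - remainder))
    return [padded[i:i + block_size] for i in range(0, len(padded), block_size)]


def random_projection(block_size, embed_dim, seed=0):
    """Hypothetical stand-in for learned weights: embed_dim rows of length block_size."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(block_size)] for _ in range(embed_dim)]


def embed_blocks(blocks, weights):
    """Project each block to a dense token: one dot product per embedding dimension."""
    return [
        [sum(x * w for x, w in zip(block, row)) for row in weights]
        for block in blocks
    ]


# Toy sparse expression counts for 10 genes in a single cell (assumed data).
cell = [0, 0, 3, 0, 0, 0, 1, 0, 0, 2]
blocks = to_blocks(cell, block_size=4)
weights = random_projection(block_size=4, embed_dim=2)
tokens = embed_blocks(blocks, weights)
print(len(blocks), len(tokens[0]))  # prints "3 2"
```

Under these assumptions, grouping genes into blocks shortens the sequence from one entry per gene to one token per block, which is one way the reduction in computational complexity mentioned above could arise.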

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89d8/10136153/3519595df893/biomolecules-13-00611-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89d8/10136153/8853189c05f1/biomolecules-13-00611-g002a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89d8/10136153/9ec38d3137f9/biomolecules-13-00611-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89d8/10136153/f4958f17cb9a/biomolecules-13-00611-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89d8/10136153/fdf93eded97b/biomolecules-13-00611-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89d8/10136153/9efdf02a730e/biomolecules-13-00611-g006a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89d8/10136153/afae0433587c/biomolecules-13-00611-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/89d8/10136153/ece2c5da2dcc/biomolecules-13-00611-g008.jpg

Similar articles

1
scTransSort: Transformers for Intelligent Annotation of Cell Types by Gene Embeddings.
Biomolecules. 2023 Mar 28;13(4):611. doi: 10.3390/biom13040611.
2
scSwinFormer: A Transformer-Based Cell-Type Annotation Method for scRNA-Seq Data Using Smooth Gene Embedding and Global Features.
J Chem Inf Model. 2024 Aug 26;64(16):6316-6323. doi: 10.1021/acs.jcim.4c00616. Epub 2024 Aug 5.
3
TripletCell: a deep metric learning framework for accurate annotation of cell types at the single-cell level.
Brief Bioinform. 2023 May 19;24(3). doi: 10.1093/bib/bbad132.
4
scDeepSort: a pre-trained cell-type annotation method for single-cell transcriptomics using deep learning with a weighted graph neural network.
Nucleic Acids Res. 2021 Dec 2;49(21):e122. doi: 10.1093/nar/gkab775.
5
SPANN: annotating single-cell resolution spatial transcriptome data with scRNA-seq data.
Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbad533.
6
Automatic Cell Type Annotation Using Marker Genes for Single-Cell RNA Sequencing Data.
Biomolecules. 2022 Oct 21;12(10):1539. doi: 10.3390/biom12101539.
7
A comparison of automatic cell identification methods for single-cell RNA sequencing data.
Genome Biol. 2019 Sep 9;20(1):194. doi: 10.1186/s13059-019-1795-z.
8
A neural network-based method for exhaustive cell label assignment using single cell RNA-seq data.
Sci Rep. 2022 Jan 18;12(1):910. doi: 10.1038/s41598-021-04473-4.
9
CIForm as a Transformer-based model for cell-type annotation of large-scale single-cell RNA-seq data.
Brief Bioinform. 2023 Jul 20;24(4). doi: 10.1093/bib/bbad195.
10
scAnno: a deconvolution strategy-based automatic cell type annotation tool for single-cell RNA-sequencing data sets.
Brief Bioinform. 2023 May 19;24(3). doi: 10.1093/bib/bbad179.

Cited by

1
Exploring machine learning strategies for single-cell transcriptomic analysis in wound healing.
Burns Trauma. 2025 May 13;13:tkaf032. doi: 10.1093/burnst/tkaf032. eCollection 2025.
2
scGPT: end-to-end protocol for fine-tuned retinal cell type annotation.
Nat Protoc. 2025 Jul 15. doi: 10.1038/s41596-025-01220-1.
3
A review of transformer models in drug discovery and beyond.
J Pharm Anal. 2025 Jun;15(6):101081. doi: 10.1016/j.jpha.2024.101081. Epub 2024 Aug 30.
4
An overview of computational methods in single-cell transcriptomic cell type annotation.
Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf207.
5
Advances and applications in single-cell and spatial genomics.
Sci China Life Sci. 2024 Dec 20. doi: 10.1007/s11427-024-2770-x.
6
Analyzing scRNA-seq data by CCP-assisted UMAP and tSNE.
PLoS One. 2024 Dec 13;19(12):e0311791. doi: 10.1371/journal.pone.0311791. eCollection 2024.
7
scGAA: a general gated axial-attention model for accurate cell-type annotation of single-cell RNA-seq data.
Sci Rep. 2024 Sep 27;14(1):22308. doi: 10.1038/s41598-024-73356-1.
8
Tracing unknown tumor origins with a biological-pathway-based transformer model.
Cell Rep Methods. 2024 Jun 17;4(6):100797. doi: 10.1016/j.crmeth.2024.100797.
9
Advancing bioinformatics with large language models: components, applications and perspectives.
ArXiv. 2025 Jan 31:arXiv:2401.04155v2.
10
New perspectives on biology, disease progression, and therapy response of head and neck cancer gained from single cell RNA sequencing and spatial transcriptomics.
Oncol Res. 2023 Nov 15;32(1):1-17. doi: 10.32604/or.2023.044774. eCollection 2023.

References

1
Semi-Supervised Deep Learning for Cell Type Identification From Single-Cell Transcriptomic Data.
IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):1492-1505. doi: 10.1109/TCBB.2022.3173587. Epub 2023 Apr 3.
2
ACTIVA: realistic single-cell RNA-seq generation with automatic cell-type identification using introspective variational autoencoders.
Bioinformatics. 2022 Apr 12;38(8):2194-2201. doi: 10.1093/bioinformatics/btac095.
3
Single-cell Iso-Sequencing enables rapid genome annotation for scRNAseq analysis.
Genetics. 2022 Mar 3;220(3). doi: 10.1093/genetics/iyac017.
4
From bulk, single-cell to spatial RNA sequencing.
Int J Oral Sci. 2021 Nov 15;13(1):36. doi: 10.1038/s41368-021-00146-0.
5
Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration.
Commun Biol. 2021 Nov 12;4(1):1280. doi: 10.1038/s42003-021-02810-x.
6
scDeepSort: a pre-trained cell-type annotation method for single-cell transcriptomics using deep learning with a weighted graph neural network.
Nucleic Acids Res. 2021 Dec 2;49(21):e122. doi: 10.1093/nar/gkab775.
7
Automated methods for cell type annotation on scRNA-seq data.
Comput Struct Biotechnol J. 2021 Jan 19;19:961-969. doi: 10.1016/j.csbj.2021.01.015. eCollection 2021.
8
Evaluation of machine learning approaches for cell-type identification from single-cell transcriptomics data.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab035.
9
Phenotypically supervised single-cell sequencing parses within-cell-type heterogeneity.
iScience. 2020 Dec 26;24(1):101991. doi: 10.1016/j.isci.2020.101991. eCollection 2021 Jan 22.
10
FR-Match: robust matching of cell type clusters from single cell RNA sequencing data using the Friedman-Rafsky non-parametric test.
Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa339.