scTab: Scaling cross-tissue single-cell annotation models.

Affiliations

Department of Computational Health, Institute of Computational Biology, Helmholtz, Munich, Germany.

School of Computing, Information and Technology, Technical University of Munich, Munich, Germany.

Publication information

Nat Commun. 2024 Aug 4;15(1):6611. doi: 10.1038/s41467-024-51059-5.

DOI:10.1038/s41467-024-51059-5

PMID:39098889

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11298532/

Abstract

Identifying cellular identities is a key use case in single-cell transcriptomics. While machine learning has been leveraged to automate cell annotation predictions for some time, there has been little progress in scaling neural networks to large data sets and in constructing models that generalize well across diverse tissues. Here, we propose scTab, an automated cell type prediction model specific to tabular data, and train it using a novel data augmentation scheme across a large corpus of single-cell RNA-seq observations (22.2 million cells). In this context, we show that cross-tissue annotation requires nonlinear models and that the performance of scTab scales both in terms of training dataset size and model size. Additionally, we show that the proposed data augmentation schema improves model generalization. In summary, we introduce a de novo cell type prediction model for single-cell RNA-seq data that can be trained across a large-scale collection of curated datasets and demonstrate the benefits of using deep learning methods in this paradigm.

