Large-scale foundation model on single-cell transcriptomics.

Affiliations

MOE Key Laboratory of Bioinformatics and Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing, China.

BioMap, Beijing, China.

Publication information

Nat Methods. 2024 Aug;21(8):1481-1491. doi: 10.1038/s41592-024-02305-7. Epub 2024 Jun 6.

Abstract

Large pretrained models have become foundation models, leading to breakthroughs in natural language processing and related fields. Developing foundation models for deciphering the 'languages' of cells and facilitating biomedical research is promising yet challenging. Here we developed a large pretrained model, scFoundation, also named 'xTrimoscFoundation', with 100 million parameters covering about 20,000 genes, pretrained on over 50 million human single-cell transcriptomic profiles. scFoundation is a large-scale model in terms of the number of trainable parameters, the dimensionality of genes and the volume of training data. Its asymmetric transformer-like architecture and pretraining task design enable it to effectively capture complex contextual relations among genes in a variety of cell types and states. Experiments showed its merit as a foundation model: it achieved state-of-the-art performance in a diverse array of single-cell analysis tasks such as gene expression enhancement, tissue drug response prediction, single-cell drug response classification, single-cell perturbation prediction, cell type annotation and gene module inference.
