scGPT: toward building a foundation model for single-cell multi-omics using generative AI.

Authors

Cui Haotian, Wang Chloe, Maan Hassaan, Pang Kuan, Luo Fengning, Duan Nan, Wang Bo

Affiliations

Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada.

Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.

Publication information

Nat Methods. 2024 Aug;21(8):1470-1480. doi: 10.1038/s41592-024-02201-0. Epub 2024 Feb 26.

Abstract

Generative pretrained models have achieved remarkable success in various domains such as language and computer vision. Specifically, the combination of large-scale diverse datasets and pretrained transformers has emerged as a promising approach for developing foundation models. Drawing parallels between language and cellular biology (in which texts comprise words; similarly, cells are defined by genes), our study probes the applicability of foundation models to advance cellular biology and genetic research. Using burgeoning single-cell sequencing data, we have constructed a foundation model for single-cell biology, scGPT, based on a generative pretrained transformer across a repository of over 33 million cells. Our findings illustrate that scGPT effectively distills critical biological insights concerning genes and cells. Through further adaptation of transfer learning, scGPT can be optimized to achieve superior performance across diverse downstream applications. This includes tasks such as cell type annotation, multi-batch integration, multi-omic integration, perturbation response prediction and gene network inference.
