Zero-shot evaluation reveals limitations of single-cell foundation models.

Authors

Kedzierska Kasia Z, Crawford Lorin, Amini Ava P, Lu Alex X

Affiliations

University of Oxford, Oxford, UK.

Microsoft Research, Cambridge, MA, USA.

Publication information

Genome Biol. 2025 Apr 18;26(1):101. doi: 10.1186/s13059-025-03574-x.

DOI:10.1186/s13059-025-03574-x

PMID:40251685

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12007350/

Abstract

Foundation models such as scGPT and Geneformer have not been rigorously evaluated in a setting where they are used without any further training (i.e., zero-shot). Understanding the performance of models in zero-shot settings is critical to applications that exclude the ability to fine-tune, such as discovery settings where labels are unknown. Our evaluation of the zero-shot performance of Geneformer and scGPT suggests that, in some cases, these models may face reliability challenges and could be outperformed by simpler methods. Our findings underscore the importance of zero-shot evaluations in development and deployment of foundation models in single-cell research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ac2/12007350/17675c0732dd/13059_2025_3574_Fig1_HTML.jpg
