
BioLLM: A standardized framework for integrating and benchmarking single-cell foundation models.

Authors

Qiu Ping, Chen Qianqian, Qin Hua, Fang Shuangsang, Zhang Yilin, Zhang Yanlin, Xia Tianyi, Cao Lei, Zhang Yong, Fang Xiaodong, Li Yuxiang, Hu Luni

Affiliations

College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.

BGI Research, Beijing 102601, China.

Publication

Patterns (N Y). 2025 Jul 30;6(8):101326. doi: 10.1016/j.patter.2025.101326. eCollection 2025 Aug 8.

DOI: 10.1016/j.patter.2025.101326
PMID: 40843339
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12365531/
Abstract

The application and evaluation of single-cell foundation models (scFMs) present significant challenges due to heterogeneous architectures and coding standards. To address this, we introduce BioLLM (biological large language model), a unified framework for integrating and applying scFMs to single-cell RNA sequencing analysis. BioLLM provides a unified interface that integrates diverse scFMs, eliminating architectural and coding inconsistencies to enable streamlined model access. With standardized APIs and comprehensive documentation, BioLLM supports streamlined model switching and consistent benchmarking. Our comprehensive evaluation of scFMs revealed distinct strengths and limitations, highlighting scGPT's robust performance across all tasks, including zero-shot and fine-tuning. Geneformer and scFoundation demonstrated strong capabilities in gene-level tasks, benefiting from effective pretraining strategies. In contrast, scBERT lagged behind, likely due to its smaller model size and limited training data. Ultimately, BioLLM aims to empower the scientific community to leverage the full potential of foundational models, advancing our understanding of complex biological systems through enhanced single-cell analysis.
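The abstract's central idea is a unified interface plus standardized APIs so that benchmarking code can swap between heterogeneous scFMs (scGPT, Geneformer, scFoundation, scBERT) without rewrites. The sketch below illustrates one common way to realize such a design in Python, using an abstract base class as the shared interface and a registry for model switching. All names here (`ScFoundationModel`, `register_model`, `load_model`, the toy `MeanBaseline` wrapper) are hypothetical illustrations of the pattern, not BioLLM's actual API.

```python
from abc import ABC, abstractmethod

class ScFoundationModel(ABC):
    """Common interface each wrapped single-cell foundation model exposes.

    Hypothetical sketch in the spirit of the abstract; not BioLLM's real API.
    """

    @abstractmethod
    def get_cell_embeddings(self, expression_matrix):
        """Map a cells-by-genes expression matrix to per-cell embeddings."""

MODEL_REGISTRY = {}

def register_model(name):
    """Class decorator that records a wrapper class under a short model name."""
    def wrap(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return wrap

@register_model("mean_baseline")
class MeanBaseline(ScFoundationModel):
    """Toy stand-in model: the 'embedding' is the per-cell mean expression."""

    def get_cell_embeddings(self, expression_matrix):
        return [[sum(row) / len(row)] for row in expression_matrix]

def load_model(name):
    """Switching models becomes a registry lookup; benchmark code is unchanged."""
    return MODEL_REGISTRY[name]()

# Downstream benchmarking code only sees the shared interface.
model = load_model("mean_baseline")
emb = model.get_cell_embeddings([[1.0, 3.0], [2.0, 4.0]])
print(emb)  # [[2.0], [3.0]]
```

With this shape, adding a new scFM means writing one adapter class that handles that model's tokenization and checkpoint loading behind the shared method, which is how the inconsistencies the abstract describes can be hidden from evaluation code.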


Figures (PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22ce/12365531/8d142e55fc0a/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22ce/12365531/2a3228d5a5c2/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22ce/12365531/7f75890c5ce1/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22ce/12365531/8b576d98b89a/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22ce/12365531/f0bfacf9f250/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22ce/12365531/3e54d86316c7/gr6.jpg

Similar Articles

1. BioLLM: A standardized framework for integrating and benchmarking single-cell foundation models.
   Patterns (N Y). 2025 Jul 30;6(8):101326. doi: 10.1016/j.patter.2025.101326. eCollection 2025 Aug 8.
2. Enhancing Clinical Relevance of Pretrained Language Models Through Integration of External Knowledge: Case Study on Cardiovascular Diagnosis From Electronic Health Records.
   JMIR AI. 2024 Aug 6;3:e56932. doi: 10.2196/56932.
3. Evaluating the Reasoning Capabilities of Large Language Models for Medical Coding and Hospital Readmission Risk Stratification: Zero-Shot Prompting Approach.
   J Med Internet Res. 2025 Jul 30;27:e74142. doi: 10.2196/74142.
4. Prescription of Controlled Substances: Benefits and Risks.
5. Psychometric Evaluation of Large Language Model Embeddings for Personality Trait Prediction.
   J Med Internet Res. 2025 Jul 8;27:e75347. doi: 10.2196/75347.
6. BAHBench: A Unified Benchmark for Evaluating Bio-Acoustic Health With Acoustic Foundation Models.
   IEEE J Biomed Health Inform. 2025 Jul;29(7):4897-4909. doi: 10.1109/JBHI.2025.3543968.
7. A narrative review of foundation models for medical image segmentation: zero-shot performance evaluation on diverse modalities.
   Quant Imaging Med Surg. 2025 Jun 6;15(6):5825-5858. doi: 10.21037/qims-2024-2826. Epub 2025 Jun 3.
8. A dataset and benchmark for hospital course summarization with adapted large language models.
   J Am Med Inform Assoc. 2025 Mar 1;32(3):470-479. doi: 10.1093/jamia/ocae312.
9. Development and Validation of a Large Language Model-Based System for Medical History-Taking Training: Prospective Multicase Study on Evaluation Stability, Human-AI Consistency, and Transparency.
   JMIR Med Educ. 2025 Aug 29;11:e73419. doi: 10.2196/73419.
10. Using a Diverse Test Suite to Assess Large Language Models on Fast Health Care Interoperability Resources Knowledge: Comparative Analysis.
    J Med Internet Res. 2025 Aug 12;27:e73540. doi: 10.2196/73540.

References Cited in This Article

1. GeneCompass: deciphering universal gene regulatory mechanisms with a knowledge-informed cross-species foundation model.
   Cell Res. 2024 Dec;34(12):830-845. doi: 10.1038/s41422-024-01034-y. Epub 2024 Oct 8.
2. Transformers in single-cell omics: a review and new perspectives.
   Nat Methods. 2024 Aug;21(8):1430-1443. doi: 10.1038/s41592-024-02353-z. Epub 2024 Aug 9.
3. scTab: Scaling cross-tissue single-cell annotation models.
   Nat Commun. 2024 Aug 4;15(1):6611. doi: 10.1038/s41467-024-51059-5.
4. scCross: a deep generative model for unifying single-cell multi-omics with seamless integration, cross-modal generation, and in silico exploration.
   Genome Biol. 2024 Jul 29;25(1):198. doi: 10.1186/s13059-024-03338-z.
5. Harnessing the deep learning power of foundation models in single-cell omics.
   Nat Rev Mol Cell Biol. 2024 Aug;25(8):593-594. doi: 10.1038/s41580-024-00756-6.
6. Large-scale foundation model on single-cell transcriptomics.
   Nat Methods. 2024 Aug;21(8):1481-1491. doi: 10.1038/s41592-024-02305-7. Epub 2024 Jun 6.
7. scGPT: toward building a foundation model for single-cell multi-omics using generative AI.
   Nat Methods. 2024 Aug;21(8):1470-1480. doi: 10.1038/s41592-024-02201-0. Epub 2024 Feb 26.
8. Semi-supervised integration of single-cell transcriptomics data.
   Nat Commun. 2024 Jan 29;15(1):872. doi: 10.1038/s41467-024-45240-z.
9. Transfer learning enables predictions in network biology.
   Nature. 2023 Jun;618(7965):616-624. doi: 10.1038/s41586-023-06139-9. Epub 2023 May 31.
10. CIForm as a Transformer-based model for cell-type annotation of large-scale single-cell RNA-seq data.
    Brief Bioinform. 2023 Jul 20;24(4). doi: 10.1093/bib/bbad195.