

Harnessing the Power of Single-Cell Large Language Models with Parameter Efficient Fine-Tuning using scPEFT

Authors

He Fei, Fei Ruixin, Krull Jordan E, Zhang Xinyu, Gao Mingyue, Su Li, Chen Yibo, Yu Yang, Li Jinpu, Jin Baichuan, Chang Yuzhou, Ma Anjun, Ma Qin, Xu Dong

Affiliations

Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO, 65211, USA.

Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA.

Publication

Res Sq. 2025 Apr 25:rs.3.rs-5926885. doi: 10.21203/rs.3.rs-5926885/v1.

DOI: 10.21203/rs.3.rs-5926885/v1
PMID: 40313770
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12045372/
Abstract

Single-cell large language models (scLLMs) capture essential biological insights from vast single-cell atlases but struggle in out-of-context applications, where zero-shot predictions can be unreliable. To address this, we introduce a single-cell parameter-efficient fine-tuning (scPEFT) framework that integrates learnable, low-dimensional adapters into scLLMs. By freezing the backbone model and updating only the adapter parameters, scPEFT efficiently adapts to specific tasks using limited custom data. This approach mitigates catastrophic forgetting, reduces parameter tuning by over 96%, and decreases GPU memory usage by more than half, significantly enhancing scLLMs' accessibility for resource-constrained researchers. Validated across diverse datasets, scPEFT outperformed zero-shot models and traditional fine-tuning in disease-specific, cross-species, and under-characterized cell population tasks. Its attention-mechanism analysis identified COVID-related genes associated with specific cell states and uncovered unique blood cell subpopulations, demonstrating scPEFT's capacity for condition-specific interpretations. These findings position scPEFT as an efficient solution for improving scLLMs' utilities in general single-cell analyses.
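The core mechanism the abstract describes — freeze the backbone, insert small low-dimensional adapters, and train only the adapter weights — can be illustrated with a minimal NumPy sketch. The layer size `d`, bottleneck width `r`, the ReLU bottleneck form, and the zero-initialized up-projection are illustrative assumptions for a generic bottleneck adapter, not the paper's actual scPEFT architecture or dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)

class FrozenLinear:
    """Stand-in for one frozen backbone layer of an scLLM (weights never updated)."""
    def __init__(self, d):
        self.W = rng.standard_normal((d, d)) / np.sqrt(d)
    def __call__(self, x):
        return x @ self.W
    def n_params(self):
        return self.W.size

class BottleneckAdapter:
    """Learnable adapter: down-project to r dims, ReLU, up-project, residual add."""
    def __init__(self, d, r):
        self.W_down = rng.standard_normal((d, r)) * 0.01
        self.W_up = np.zeros((r, d))  # zero-init: the adapter starts as an identity map
    def __call__(self, x):
        return x + np.maximum(x @ self.W_down, 0.0) @ self.W_up
    def n_params(self):
        return self.W_down.size + self.W_up.size

d, r = 512, 8                     # hidden size and adapter bottleneck (illustrative)
backbone = FrozenLinear(d)
adapter = BottleneckAdapter(d, r)

x = rng.standard_normal((4, d))   # a mini-batch of cell embeddings
h = adapter(backbone(x))          # only adapter weights would receive gradients

trainable = adapter.n_params()
total = backbone.n_params() + trainable
print(f"trainable fraction: {trainable / total:.3%}")
```

Even in this toy single-layer setting, the trainable fraction is about 3% of all parameters, which mirrors the abstract's claim of cutting parameter tuning by over 96%: gradients and optimizer state are needed only for the adapter, which is also why GPU memory drops and the frozen backbone cannot catastrophically forget its pretraining.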


Figures (nihpp-rs5926885v1, f0001–f0015):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/f1c409f2fecb/nihpp-rs5926885v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/7f0816c370e6/nihpp-rs5926885v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/ee65abaaeeee/nihpp-rs5926885v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/3cc427016016/nihpp-rs5926885v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/c70efa055f63/nihpp-rs5926885v1-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/0bdcb50c1580/nihpp-rs5926885v1-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/d87f04a2a51c/nihpp-rs5926885v1-f0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/975c986f5e4e/nihpp-rs5926885v1-f0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/155dd3333789/nihpp-rs5926885v1-f0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/2a389ecc573d/nihpp-rs5926885v1-f0010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/854cf509c19b/nihpp-rs5926885v1-f0011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/351061416891/nihpp-rs5926885v1-f0012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/41f0c82c6091/nihpp-rs5926885v1-f0013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/eb21765efbe7/nihpp-rs5926885v1-f0014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad0/12045372/7493c96fc66f/nihpp-rs5926885v1-f0015.jpg

Similar Articles

1. Harnessing the Power of Single-Cell Large Language Models with Parameter Efficient Fine-Tuning using scPEFT. Res Sq. 2025 Apr 25:rs.3.rs-5926885. doi: 10.21203/rs.3.rs-5926885/v1.
2. Parameter-Efficient Fine-Tuning Enhances Adaptation of Single Cell Large Language Model for Cell Type Identification. bioRxiv. 2024 Jan 30:2024.01.27.577455. doi: 10.1101/2024.01.27.577455.
3. Parameter Efficient Fine-tuning of Transformer-based Masked Autoencoder Enhances Resource Constrained Neuroimage Analysis. bioRxiv. 2025 Feb 20:2025.02.15.638442. doi: 10.1101/2025.02.15.638442.
4. Towards Foundation Models and Few-Shot Parameter-Efficient Fine-Tuning for Volumetric Organ Segmentation. Med Image Anal. 2025 May 2;103:103596. doi: 10.1016/j.media.2025.103596.
5. Enhancing Few-Shot CLIP With Semantic-Aware Fine-Tuning. IEEE Trans Neural Netw Learn Syst. 2024 Aug 26;PP. doi: 10.1109/TNNLS.2024.3443394.
6. Democratizing Protein Language Models with Parameter-Efficient Fine-Tuning. bioRxiv. 2023 Nov 10:2023.11.09.566187. doi: 10.1101/2023.11.09.566187.
7. Positional embeddings and zero-shot learning using BERT for molecular-property prediction. J Cheminform. 2025 Feb 5;17(1):17. doi: 10.1186/s13321-025-00959-9.
8. Fine-tuning protein language models boosts predictions across diverse tasks. Nat Commun. 2024 Aug 28;15(1):7407. doi: 10.1038/s41467-024-51844-2.
9. DVPT: Dynamic Visual Prompt Tuning of large pre-trained models for medical image analysis. Neural Netw. 2025 May;185:107168. doi: 10.1016/j.neunet.2025.107168. Epub 2025 Jan 16.
10. Proto-Adapter: Efficient Training-Free CLIP-Adapter for Few-Shot Image Classification. Sensors (Basel). 2024 Jun 4;24(11):3624. doi: 10.3390/s24113624.
