


Efficiency at scale: Investigating the performance of diminutive language models in clinical tasks.

Author Affiliations

Department of Psychiatry, University of Oxford, Oxford, United Kingdom.

Department of Psychiatry, University of Oxford, Oxford, United Kingdom; Centre for Artificial Intelligence in Precision Medicines, University of Oxford, United Kingdom; King Abdulaziz University, Saudi Arabia.

Publication Info

Artif Intell Med. 2024 Nov;157:103002. doi: 10.1016/j.artmed.2024.103002. Epub 2024 Oct 23.

DOI: 10.1016/j.artmed.2024.103002
PMID: 39471774
Abstract

The entry of large language models (LLMs) into research and commercial spaces has led to a trend of ever-larger models, with initial promises of generalisability. This was followed by a widespread desire to downsize and create specialised models without the need for complete fine-tuning, using Parameter Efficient Fine-tuning (PEFT) methods. We present an investigation into the suitability of different PEFT methods to clinical decision-making tasks, across a range of model sizes, including extremely small models with as few as 25 million parameters. Our analysis shows that the performance of most PEFT approaches varies significantly from one task to another, with the exception of LoRA, which maintains relatively high performance across all model sizes and tasks, typically approaching or matching full fine-tuned performance. The effectiveness of PEFT methods in the clinical domain is evident, particularly for specialised models which can operate on low-cost, in-house computing infrastructure. The advantages of these models, in terms of speed and reduced training costs, dramatically outweighs any performance gain from large foundation LLMs. Furthermore, we highlight how domain-specific pre-training interacts with PEFT methods and model size, finding the domain pre-training to be particularly important in smaller models and discuss how these factors interplay to provide the best efficiency-performance trade-off. Full code available at: https://github.com/nlpie-research/efficient-ml.
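The abstract contrasts PEFT methods such as LoRA with full fine-tuning. As an illustration only — a hypothetical sketch of the general LoRA technique, not code from the paper's repository — the core idea of freezing a weight matrix and training a low-rank update can be written in plain Python:

```python
# Hypothetical sketch of the LoRA idea described in the abstract -- not the
# paper's implementation (that lives at github.com/nlpie-research/efficient-ml).
# LoRA freezes the pretrained weight W and trains only a low-rank pair (B, A),
# so the effective weight is W + (alpha / r) * B @ A.

def mat_mul(X, Y):
    """Plain-Python matrix product of X (m x k) and Y (k x n)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def mat_vec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lora_linear(x, W, A, B, alpha=1.0):
    """Forward pass y = (W + (alpha / r) * B @ A) @ x.

    W: frozen d_out x d_in weight; A: r x d_in; B: d_out x r.
    The rank r is inferred from A; only A and B would be trained.
    """
    r = len(A)
    scale = alpha / r
    delta = mat_mul(B, A)  # d_out x d_in low-rank update
    w_eff = [[w + scale * d for w, d in zip(w_row, d_row)]
             for w_row, d_row in zip(W, delta)]
    return mat_vec(w_eff, x)

def trainable_params(d_in, d_out, r):
    """LoRA trains r * (d_in + d_out) weights vs. d_in * d_out for full fine-tuning."""
    return r * (d_in + d_out)
```

For a 768-dimensional layer with rank r = 8, this would train 8 * (768 + 768) = 12,288 parameters instead of 589,824 — the kind of reduction that makes the specialised, in-house models discussed in the abstract practical on low-cost hardware.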


Similar Articles

1. Efficiency at scale: Investigating the performance of diminutive language models in clinical tasks.
   Artif Intell Med. 2024 Nov;157:103002. doi: 10.1016/j.artmed.2024.103002. Epub 2024 Oct 23.
2. Signs and symptoms to determine if a patient presenting in primary care or hospital outpatient settings has COVID-19.
   Cochrane Database Syst Rev. 2022 May 20;5(5):CD013665. doi: 10.1002/14651858.CD013665.pub3.
3. Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.
   Clin Orthop Relat Res. 2024 Dec 1;482(12):2193-2208. doi: 10.1097/CORR.0000000000003185. Epub 2024 Jul 23.
4. Systemic pharmacological treatments for chronic plaque psoriasis: a network meta-analysis.
   Cochrane Database Syst Rev. 2017 Dec 22;12(12):CD011535. doi: 10.1002/14651858.CD011535.pub2.
5. Sexual Harassment and Prevention Training.
6. Developing healthcare language model embedding spaces.
   Artif Intell Med. 2024 Dec;158:103009. doi: 10.1016/j.artmed.2024.103009. Epub 2024 Oct 31.
7. Systemic pharmacological treatments for chronic plaque psoriasis: a network meta-analysis.
   Cochrane Database Syst Rev. 2021 Apr 19;4(4):CD011535. doi: 10.1002/14651858.CD011535.pub4.
8. Systemic pharmacological treatments for chronic plaque psoriasis: a network meta-analysis.
   Cochrane Database Syst Rev. 2020 Jan 9;1(1):CD011535. doi: 10.1002/14651858.CD011535.pub3.
9. A rapid and systematic review of the clinical effectiveness and cost-effectiveness of paclitaxel, docetaxel, gemcitabine and vinorelbine in non-small-cell lung cancer.
   Health Technol Assess. 2001;5(32):1-195. doi: 10.3310/hta5320.
10. The Black Book of Psychotropic Dosing and Monitoring.
    Psychopharmacol Bull. 2024 Jul 8;54(3):8-59.

Cited By

1. Applying Large Language Models for Surgical Case Length Prediction.
   JAMA Surg. 2025 Jul 9. doi: 10.1001/jamasurg.2025.2154.
2. Adapting Generative Large Language Models for Information Extraction from Unstructured Electronic Health Records in Residential Aged Care: A Comparative Analysis of Training Approaches.
   J Healthc Inform Res. 2025 Feb 20;9(2):191-219. doi: 10.1007/s41666-025-00190-z. eCollection 2025 Jun.