Development and evaluation of large-language models (LLMs) for oncology: A scoping review.

Authors

Mehan Namya, Desinghe Teshan Dias, Saha Ashirbani

Affiliations

Integrated Biomedical Engineering and Health Sciences, McMaster University, Hamilton, Ontario, Canada.

Global Health Program, Faculty of Health Sciences, McMaster University, Hamilton, Ontario, Canada.

Publication information

PLOS Digit Health. 2025 Aug 7;4(8):e0000980. doi: 10.1371/journal.pdig.0000980. eCollection 2025 Aug.

DOI: 10.1371/journal.pdig.0000980
PMID: 40773525
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12331086/

Abstract

Large language models (LLMs), a significant development in artificial intelligence (AI), continue to demonstrate seminal improvements in performance on various text analysis and generation tasks. Few systematic studies have examined LLM applications developed or evaluated for oncology. Our scoping review explores applications of LLMs in oncology to determine (1) the nature of LLM applications relevant to a cancer/tumor type, (2) the phases of cancer care addressed by the LLMs, (3) which LLMs were used in these applications, (4) the sources and pre-processing of the datasets used, (5) the techniques used to optimize LLM performance, (6) the methods of evaluation, and (7) the common limitations noted by the authors of these LLM applications, and to study their implications for research and practice. A librarian-assisted search was performed through January 12, 2024 across the following databases: Association for Computing Machinery (ACM), Embase, Engineering Village, IEEE Xplore, Medline, Scopus, SPIE, and Web of Science. Preprints from this search were considered if they were published or accepted by February 29, 2024. Of the 14,863 articles from the initial search, 60 were finally included. Our results showed that LLMs were mostly evaluated across a diverse set of oncology-related applications. Generative pre-trained transformer (GPT)-based LLMs were the most commonly used. In the subset of studies where the phase(s) of cancer care were provided or implied, treatment and diagnosis were the most frequently included phases. Data for development and evaluation ranged from patient health records, synthetic patient records, and research and professional society publications to social media. Prompt design and engineering were performed as data pre-processing steps in several studies. Clinicians, trainees, researchers, and patients were among the variety of users targeted by the applications.
In the 17% of studies that developed LLMs for oncological aspects, domain adaptation through pre-training and fine-tuning was often performed and resulted in improved performance. Evaluation of LLM performance involved standard, validated, non-standardized, and/or customized performance measures covering a variety of constructs other than accuracy. Six primary themes emerged as limitations, including limited generalizability/applicability, sample size, bias and subjectivity, and evaluation metrics. This review highlights that LLMs specific to oncological aspects are less common than general-purpose LLMs. The application areas were heterogeneous, used diverse data sources, were directed towards a variety of users, and resulted in a variety of evaluation methods. Despite the diversity of LLM applications in oncology, future research needs to address the limited generalizability of these applications, the mitigation of bias and subjectivity, and the standardization of evaluation methodologies. Future applications of LLMs in oncology should include developing oncology-specific LLMs that can mitigate knowledge gaps and extend to diverse areas of oncology training and practice not considered so far.
