Leveraging Guideline-Based Clinical Decision Support Systems with Large Language Models: A Case Study with Breast Cancer.

Authors

Delourme Solène, Redjdal Akram, Bouaud Jacques, Seroussi Brigitte

Affiliations

Sorbonne Université, Université Sorbonne Paris Nord, INSERM, LIMICS, Paris, France.

EPITA, Paris, France.

Publication information

Methods Inf Med. 2024 Sep;63(3-04):85-96. doi: 10.1055/a-2528-4299. Epub 2025 Jan 29.

DOI:10.1055/a-2528-4299
PMID:39880005
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12133322/
Abstract

BACKGROUND

Multidisciplinary tumor boards (MTBs) have been established in most countries to allow experts to collaboratively determine the best treatment decisions for cancer patients. However, MTBs often face challenges such as case overload, which can compromise MTB decision quality. Clinical decision support systems (CDSSs) have been introduced to assist clinicians in this process. Despite their potential, CDSSs are still underutilized in routine practice. The emergence of large language models (LLMs), such as ChatGPT, offers new opportunities to improve the efficiency and usability of traditional CDSSs.

OBJECTIVES

OncoDoc2 is a guideline-based CDSS developed using a documentary approach and applied to breast cancer management. This study aims to evaluate the potential of LLMs, used as question-answering (QA) systems, to improve the usability of OncoDoc2 across different prompt engineering techniques (PETs).
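As a rough illustration of what "LLMs used as question-answering systems" can look like in practice, the sketch below assembles a zero-shot prompt from a patient summary, one question, and its allowed answers. The function name, the prompt wording, and the sample data are all hypothetical; the abstract does not describe the actual prompts or the "enhanced" zero-shot variant used in the study.

```python
def build_zero_shot_prompt(patient_summary: str, question: str, options: list[str]) -> str:
    """Assemble a zero-shot QA prompt (hypothetical wording, not the study's prompt)."""
    if not options:
        raise ValueError("at least one allowed answer is required")
    option_lines = "\n".join(f"- {option}" for option in options)
    return (
        "Answer the question using only the patient summary.\n\n"
        f"Patient summary:\n{patient_summary}\n\n"
        f"Question: {question}\n"
        f"Allowed answers:\n{option_lines}\n\n"
        "Reply with exactly one of the allowed answers."
    )


# Entirely synthetic example data, for illustration only.
prompt = build_zero_shot_prompt(
    patient_summary="62-year-old woman, unilateral lesion, no prior treatment.",
    question="Is the tumor unilateral?",
    options=["yes", "no"],
)
print(prompt)
```

Restricting the model to a closed list of answers keeps its output directly comparable to the choices a clinician would select, which is what makes the answers scorable with classification metrics.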

METHODS

Data extracted from breast cancer patient summaries (BCPSs), together with questions formulated by OncoDoc2, were used to create prompts for various LLMs, and several PETs were designed and tested. Using a sample of 200 randomized BCPSs, LLMs and PETs were initially compared with regard to their responses to OncoDoc2 questions using classic metrics (accuracy, precision, recall, and F1 score). Best performing LLMs and PETs were further assessed by comparing the therapeutic recommendations generated by OncoDoc2, based on LLM inputs, to those provided by MTB clinicians using OncoDoc2. Finally, the best performing method was validated using a new sample of 30 randomized BCPSs.
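The four metrics named above can be computed from paired gold and predicted answers as in the following minimal sketch. It assumes macro-averaging over answer labels, which is an assumption: the abstract does not say how precision, recall, and F1 were averaged. The function name and the sample answers are illustrative only.

```python
from collections import Counter


def classification_metrics(gold: list[str], predicted: list[str]) -> dict[str, float]:
    """Return accuracy and macro-averaged precision, recall, and F1."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and of equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # the predicted label gains a false positive
            fn[g] += 1  # the gold label gains a false negative

    labels = sorted(set(gold) | set(predicted))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(gold),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Synthetic answers, for illustration only.
gold = ["yes", "yes", "no", "no", "unknown"]
predicted = ["yes", "no", "no", "no", "yes"]
metrics = classification_metrics(gold, predicted)
print(metrics)
```

With macro-averaging, a rare answer that is never predicted correctly pulls precision, recall, and F1 down sharply while accuracy stays comparatively high, which is one way accuracy can sit well above the other three metrics.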

RESULTS

The combination of Mistral and OpenChat models under the enhanced Zero-Shot PET showed the best performance as a question-answering system. This approach achieved a precision of 60.16%, a recall of 54.18%, an F1 score of 56.59%, and an accuracy of 75.57% on the validation set of 30 BCPSs. However, this approach yielded poor results as a CDSS, with only 16.67% of the recommendations generated by OncoDoc2 based on LLM inputs matching the gold standard.
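A quick arithmetic check on the reported figures: the harmonic mean of the reported precision and recall is about 57.01%, slightly above the reported F1 of 56.59%. One plausible explanation, offered here as an assumption rather than something the abstract states, is that F1 was computed per class or per question and then averaged, in which case it need not equal the harmonic mean of the averaged precision and recall.

```python
def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of a single precision/recall pair."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract (percentages).
reported_precision = 60.16
reported_recall = 54.18
reported_f1 = 56.59

implied_f1 = harmonic_f1(reported_precision, reported_recall)
gap = implied_f1 - reported_f1
print(f"implied F1 = {implied_f1:.2f}, gap to reported F1 = {gap:.2f}")
```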

CONCLUSION

All the criteria in the OncoDoc2 decision tree are crucial for capturing the uniqueness of each patient. Any deviation from a criterion alters the recommendations generated. Despite achieving a good accuracy rate of 75.57%, LLMs still face challenges in reliably understanding complex medical contexts and in serving effectively as CDSSs.
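The gap between 75.57% per-answer accuracy and 16.67% matching recommendations is what compounding error along a decision path would produce. The sketch below works through the arithmetic under two simplifying assumptions that are not taken from the abstract: every answer is correct independently with the same probability, and a recommendation matches only if every criterion on the path is answered correctly. The function names are illustrative.

```python
import math


def path_success_probability(per_answer_accuracy: float, n_criteria: int) -> float:
    """Probability that all n_criteria answers are correct, assuming independence."""
    return per_answer_accuracy ** n_criteria


def criteria_needed(per_answer_accuracy: float, target_path_rate: float) -> float:
    """Path length at which the all-correct probability falls to target_path_rate."""
    return math.log(target_path_rate) / math.log(per_answer_accuracy)


accuracy = 0.7557     # per-answer accuracy reported in the abstract
match_rate = 0.1667   # share of recommendations matching the gold standard

implied_length = criteria_needed(accuracy, match_rate)
print(f"implied path length: {implied_length:.1f} criteria")
print(f"all-correct probability over 6 criteria: {path_success_probability(accuracy, 6):.3f}")
```

Under these assumptions, a path of only about six criteria is enough to bring a 75.57% per-answer accuracy down to roughly the observed match rate. The actual path lengths in OncoDoc2 are not given in the abstract, so this is an order-of-magnitude illustration, not a reconstruction of the study's result.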

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0b9/12133322/3c1847f383aa/10-1055-a-2528-4299-i24020011-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0b9/12133322/efb9daa137e2/10-1055-a-2528-4299-i24020011-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0b9/12133322/ba8811bba035/10-1055-a-2528-4299-i24020011-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0b9/12133322/5aa9ad008586/10-1055-a-2528-4299-i24020011-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0b9/12133322/7f911b7cb73c/10-1055-a-2528-4299-i24020011-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0b9/12133322/158285aefd5d/10-1055-a-2528-4299-i24020011-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0b9/12133322/3d879ba54cc3/10-1055-a-2528-4299-i24020011-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0b9/12133322/590c8fa05709/10-1055-a-2528-4299-i24020011-8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0b9/12133322/b686caec49b7/10-1055-a-2528-4299-i24020011-9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0b9/12133322/e5ee2dcc6812/10-1055-a-2528-4299-i24020011-10.jpg