Evaluation of a context-aware chatbot using retrieval-augmented generation for answering clinical questions on medication-related osteonecrosis of the jaw.

Authors

Steybe David, Poxleitner Philipp, Aljohani Suad, Herlofson Bente Brokstad, Nicolatou-Galitis Ourania, Patel Vinod, Fedele Stefano, Kwon Tae-Geon, Fusco Vittorio, Pichardo Sarina E C, Obermeier Katharina Theresa, Otto Sven, Rau Alexander, Russe Maximilian Frederik

Affiliation

Department of Oral and Maxillofacial Surgery and Facial Plastic Surgery, University Hospital, LMU Munich, Munich, Germany.

Publication details

J Craniomaxillofac Surg. 2025 Apr;53(4):355-360. doi: 10.1016/j.jcms.2024.12.009. Epub 2025 Jan 10.

DOI: 10.1016/j.jcms.2024.12.009
PMID: 39799075
Abstract

The potential of large language models (LLMs) in medical applications is significant, and retrieval-augmented generation (RAG) can address the weaknesses of these models in terms of data transparency and scientific accuracy by incorporating current scientific knowledge into responses. In this study, RAG and GPT-4 by OpenAI were applied to develop GuideGPT, a context-aware chatbot integrated with a knowledge database from 449 scientific publications designed to provide answers on the prevention, diagnosis, and treatment of medication-related osteonecrosis of the jaw (MRONJ). A comparison was made with a generic LLM ("PureGPT") across 30 MRONJ-related questions. Ten international experts in MRONJ evaluated the responses based on content, language, scientific explanation, and agreement using 5-point Likert scales. Statistical analysis using the Mann-Whitney U test showed significantly better ratings for GuideGPT than PureGPT regarding content (p = 0.006), scientific explanation (p = 0.032), and agreement (p = 0.008), though not for language (p = 0.407). Thus, this study demonstrates RAG to be a promising tool to improve response quality and reliability of LLMs by incorporating domain-specific knowledge. This approach addresses the limitations of generic chatbots and can provide traceable and up-to-date responses essential for clinical practice.
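The retrieval step that RAG adds in front of a language model can be illustrated with a small sketch. The abstract does not say how GuideGPT indexes or ranks its 449 publications, so everything below is an assumption made for illustration: the placeholder passages, the bag-of-words cosine similarity used for ranking, and the function names `retrieve` and `build_prompt` are all hypothetical, and the call to the language model itself is left out.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder passages. They stand in for a knowledge base and
# do NOT reproduce content from the publications used in the study.
PASSAGES = [
    "Placeholder passage A: notes on prevention before starting therapy.",
    "Placeholder passage B: notes on imaging and staging used for diagnosis.",
    "Placeholder passage C: notes on conservative and surgical treatment.",
]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the question (assumed ranking rule)."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, passages: list[str], k: int = 2) -> str:
    """Assemble a prompt that exposes numbered sources so answers stay traceable."""
    hits = retrieve(question, passages, k)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(hits))
    return (
        "Answer using only the numbered context and cite passage numbers.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("Which imaging supports diagnosis?", PASSAGES, k=1))
```

Numbering the retrieved passages inside the prompt is one simple way to obtain the traceable responses the abstract refers to. A production system would more likely rank passages with dense embeddings than with word counts.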


Similar articles

1. Evaluation of a context-aware chatbot using retrieval-augmented generation for answering clinical questions on medication-related osteonecrosis of the jaw.
   J Craniomaxillofac Surg. 2025 Apr;53(4):355-360. doi: 10.1016/j.jcms.2024.12.009. Epub 2025 Jan 10.
2. Improving Dietary Supplement Information Retrieval: Development of a Retrieval-Augmented Generation System With Large Language Models.
   J Med Internet Res. 2025 Mar 19;27:e67677. doi: 10.2196/67677.
3. Accuracy of Current Large Language Models and the Retrieval-Augmented Generation Model in Determining Dietary Principles in Chronic Kidney Disease.
   J Ren Nutr. 2025 May;35(3):401-409. doi: 10.1053/j.jrn.2025.01.004. Epub 2025 Jan 24.
4. Improving accuracy of GPT-3/4 results on biomedical data using a retrieval-augmented language model.
   PLOS Digit Health. 2024 Aug 21;3(8):e0000568. doi: 10.1371/journal.pdig.0000568. eCollection 2024 Aug.
5. Quality of Answers of Generative Large Language Models Versus Peer Users for Interpreting Laboratory Test Results for Lay Patients: Evaluation Study.
   J Med Internet Res. 2024 Apr 17;26:e56655. doi: 10.2196/56655.
6. Custom Large Language Models Improve Accuracy: Comparing Retrieval Augmented Generation and Artificial Intelligence Agents to Noncustom Models for Evidence-Based Medicine.
   Arthroscopy. 2025 Mar;41(3):565-573.e6. doi: 10.1016/j.arthro.2024.10.042. Epub 2024 Nov 7.
7. Assessing the Quality and Reliability of ChatGPT's Responses to Radiotherapy-Related Patient Queries: Comparative Study With GPT-3.5 and GPT-4.
   JMIR Cancer. 2025 Apr 16;11:e63677. doi: 10.2196/63677.
8. Current knowledge regarding medication-related osteonecrosis of the jaw among different health professionals.
   Support Care Cancer. 2020 Nov;28(11):5397-5404. doi: 10.1007/s00520-020-05374-4. Epub 2020 Mar 6.
9. Assessing the Potential Role of Artificial Intelligence in Medication-Related Osteonecrosis of the Jaw Information Sharing.
   J Oral Maxillofac Surg. 2024 Jun;82(6):699-705. doi: 10.1016/j.joms.2024.03.001. Epub 2024 Mar 9.
10. Using Generative Artificial Intelligence in Health Economics and Outcomes Research: A Primer on Techniques and Breakthroughs.
    Pharmacoecon Open. 2025 Apr 29. doi: 10.1007/s41669-025-00580-4.

Cited by

1. Retrieval augmented generation for large language models in healthcare: A systematic review.
   PLOS Digit Health. 2025 Jun 11;4(6):e0000877. doi: 10.1371/journal.pdig.0000877. eCollection 2025 Jun.