Enhancing Large Language Model Reliability: Minimizing Hallucinations with Dual Retrieval-Augmented Generation Based on the Latest Diabetes Guidelines.

Authors

Lee Jaedong, Cha Hyosoung, Hwangbo Yul, Cheon Wonjoong

Affiliations

Healthcare AI Team, National Cancer Center, Goyang-si 10408, Gyeonggi-do, Republic of Korea.

Department of Cancer AI & Digital Health, Graduate School of Cancer Science and Policy, National Cancer Center, Goyang-si 10408, Gyeonggi-do, Republic of Korea.

Publication information

J Pers Med. 2024 Nov 30;14(12):1131. doi: 10.3390/jpm14121131.

Abstract

Large language models (LLMs) show promise in healthcare but face challenges with hallucinations, particularly in rapidly evolving fields like diabetes management. Traditional LLM updating methods are resource-intensive, necessitating new approaches for delivering reliable, current medical information. This study aimed to develop and evaluate a novel retrieval system to enhance LLM reliability in diabetes management across different languages and guidelines. We developed a dual retrieval-augmented generation (RAG) system integrating both Korean Diabetes Association and American Diabetes Association 2023 guidelines. The system employed dense retrieval with 11 embedding models (including OpenAI, Upstage, and multilingual models) and sparse retrieval using the BM25 algorithm with language-specific tokenizers. Performance was evaluated across different top-k values, leading to optimized ensemble retrievers for each guideline. For dense retrievers, Upstage's Solar Embedding-1-large and OpenAI's text-embedding-3-large showed superior performance for Korean and English guidelines, respectively. Multilingual models outperformed language-specific models in both cases. For sparse retrievers, the ko_kiwi tokenizer demonstrated superior performance for Korean text, while both ko_kiwi and porter_stemmer showed comparable effectiveness for English text. The ensemble retrievers, combining optimal dense and sparse configurations, demonstrated enhanced coverage while maintaining precision. This study presents an effective dual RAG system that enhances LLM reliability in diabetes management across different languages. The successful implementation with both Korean and American guidelines demonstrates the system's cross-regional capability, laying a foundation for more trustworthy AI-assisted healthcare applications.
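To make the idea of an ensemble retriever concrete, the following is a minimal sketch of how a sparse BM25 ranking and a dense embedding ranking could be combined. It is illustrative only and not the authors' implementation: the abstract does not say how the two rankings were fused, so weighted reciprocal rank fusion is assumed here. The passages, the three-dimensional "embeddings", the whitespace tokenizer (standing in for ko_kiwi or porter_stemmer), and all function names are hypothetical placeholders.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k(scores, k):
    """Return the indices of the k highest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def rrf_fuse(rankings, weights, c=60):
    """Weighted reciprocal rank fusion (an assumed fusion rule, see lead-in)."""
    fused = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + weight / (c + rank)
    return sorted(fused, key=fused.get, reverse=True)


# Toy corpus: placeholder passages, not guideline text.
docs = [
    "passage about metformin dosing",
    "passage about insulin therapy",
    "passage about diet and exercise",
]
# Hypothetical 3-d vectors standing in for real embedding-model output.
doc_vecs = [[0.9, 0.1, 0.0], [0.2, 0.9, 0.1], [0.1, 0.1, 0.9]]
query, query_vec = "insulin therapy", [0.1, 0.9, 0.2]

# Whitespace splitting is a stand-in for a language-specific tokenizer.
docs_tokens = [d.lower().split() for d in docs]
sparse_ranking = top_k(bm25_scores(query.lower().split(), docs_tokens), k=2)
dense_ranking = top_k([cosine(query_vec, v) for v in doc_vecs], k=2)

# Equal weights are an arbitrary choice for the sketch.
fused = rrf_fuse([sparse_ranking, dense_ranking], weights=[0.5, 0.5])
print([docs[i] for i in fused])
```

Because each retriever contributes its own top-k list, the fused list can include passages that only one of them found, which is one way to read the abstract's claim of broader coverage. In a real system the top-k value and the weights would be tuned per guideline, as the abstract describes.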

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8494/11677479/8db360d06167/jpm-14-01131-g001.jpg
