Leveraging long context in retrieval augmented language models for medical question answering.

Author information

Zhang Gongbo, Xu Zihan, Jin Qiao, Chen Fangyi, Fang Yilu, Liu Yi, Rousseau Justin F, Xu Ziyang, Lu Zhiyong, Weng Chunhua, Peng Yifan

Affiliation information

Department of Biomedical Informatics, Columbia University, New York, NY, USA.

Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA.

Publication information

NPJ Digit Med. 2025 May 2;8(1):239. doi: 10.1038/s41746-025-01651-w.

Abstract

While large language models (LLMs) hold great promise for improving and facilitating healthcare through applications such as medical literature summarization, they struggle to produce up-to-date responses on evolving topics because of outdated knowledge or hallucination. Retrieval-augmented generation (RAG) is a pivotal innovation that improves the accuracy and relevance of LLM responses by integrating LLMs with a search engine and external sources of knowledge. However, the quality of RAG responses is strongly affected by the rank and density of key information in the retrieval results, as illustrated by the "lost-in-the-middle" problem. In this work, we aim to improve the robustness and reliability of the RAG workflow in the medical domain. Specifically, we propose a map-reduce strategy, BriefContext, to combat the "lost-in-the-middle" issue without modifying the model weights. We demonstrate the advantage of the workflow with various LLM backbones and on multiple QA datasets. This method promises to improve the safety and reliability of LLMs deployed in healthcare domains by reducing the risk of misinformation, ensuring that critical clinical content is retained in generated responses, and enabling more trustworthy use of LLMs in critical tasks such as medical question answering, clinical decision support, and patient-facing applications.
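
The abstract describes BriefContext only at a high level, as a map-reduce strategy applied to the RAG workflow. As a rough illustration of that general pattern, and not the authors' implementation, the Python sketch below splits retrieved passages into small chunks, condenses each chunk with respect to the question (map), and then answers from the condensed notes (reduce); the `retrieve` and `llm` callables, the chunk size, and the prompt wording are all hypothetical stand-ins.

```python
# Minimal sketch of a map-reduce style RAG step (not the paper's BriefContext code).
# Assumptions: `retrieve(query, k)` returns a ranked list of text passages and
# `llm(prompt)` returns a model completion; both are hypothetical stand-ins for
# a real search engine and LLM API.

def map_reduce_rag(query, retrieve, llm, k=20, chunk_size=5):
    """Answer `query` by condensing small groups of retrieved passages (map)
    and then answering from the condensed context (reduce)."""
    passages = retrieve(query, k)

    # Map: summarize each small group of passages so that key facts are not
    # buried deep inside one long prompt (the "lost-in-the-middle" failure mode).
    partial_notes = []
    for i in range(0, len(passages), chunk_size):
        chunk = "\n\n".join(passages[i:i + chunk_size])
        note = llm(
            f"Question: {query}\n\nPassages:\n{chunk}\n\n"
            "Summarize only the information relevant to the question."
        )
        partial_notes.append(note)

    # Reduce: answer from the short, information-dense summaries.
    context = "\n\n".join(partial_notes)
    return llm(
        f"Question: {query}\n\nRelevant notes:\n{context}\n\n"
        "Answer the question using only the notes above."
    )
```

Because each sub-prompt stays short, relevant evidence sits near the start or end of every call rather than deep inside one long context, which is the failure mode the "lost-in-the-middle" problem refers to.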

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/10b8/12048518/044533bfacc8/41746_2025_1651_Fig1_HTML.jpg
