MedKA: A knowledge graph-augmented approach to improve factuality in medical Large Language Models.

Authors

Deng Yiyan, Zhao Shen, Miao Yongming, Zhu Junjie, Li Jin

Affiliations

Department of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing, Jiangsu, China.

Department of Breast Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China.

Publication information

J Biomed Inform. 2025 Aug;168:104871. doi: 10.1016/j.jbi.2025.104871. Epub 2025 Jul 8.

Abstract

Large language models (LLMs) have demonstrated remarkable potential in medical applications. However, they still face critical challenges such as hallucinations, knowledge inconsistency, and insufficient integration of domain-specific medical expertise. To address these limitations, we introduce MedKA, a novel knowledge graph-augmented approach for fine-tuning and evaluating medical LLMs. Our approach systematically transforms structured knowledge from a medical knowledge graph into a high-quality QA corpus, cMKGQA, by clustering multiple fields around clinically meaningful scenarios (e.g., diagnosis, treatment planning). This grouping strategy enables comprehensive, use-case-specific data construction and supports one-stage training of the LLM, ensuring better alignment with structured medical knowledge. The transformation integrates domain-specific knowledge comprehensively while maintaining factual consistency. To evaluate the factuality of LLM-generated responses, we further propose the Knowledge Graph-based Auxiliary Evaluation Metrics (KG-AEMs), a novel benchmarking framework that compares LLM outputs with fine-grained, attribute-level ground truth from the knowledge graph. Experimental results demonstrate that MedKA achieves state-of-the-art performance, significantly outperforming existing models, including LLaMA-3.1-8B-Chinese-Chat, HuatuoGPT2-7B, and Apollo2-7B. On the cMKGQA dataset, MedKA achieves a BLEU-1 score of 44.63 and a BLEU-4 score of 17.62, with particularly strong performance in areas such as medication recommendations and diagnostic tests as measured by KG-AEMs. Our approach highlights the potential of integrating knowledge graphs into LLM fine-tuning to improve the accuracy and reliability of medical AI systems. It advances factual accuracy in medical dialogue systems and provides a comprehensive framework for evaluating the integration of medical knowledge into LLMs.
This work is publicly available on GitHub: https://github.com/Yai017/MedKA.
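
The two ideas in the abstract, grouping knowledge-graph fields by clinical scenario to build QA pairs and scoring a response against attribute-level ground truth, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the scenario-to-field mapping, the field names, the question template, and the verbatim-match scoring rule are hypothetical and are not taken from the MedKA code or the KG-AEMs definition.

```python
from typing import Dict, List

# Hypothetical grouping of knowledge-graph fields by clinical scenario.
# The field names are illustrative, not the schema used by MedKA.
SCENARIOS: Dict[str, List[str]] = {
    "diagnosis": ["symptoms", "diagnostic_tests"],
    "treatment planning": ["medications", "procedures"],
}


def build_qa_pairs(entity: dict) -> List[dict]:
    """Turn one knowledge-graph record into one QA pair per scenario."""
    pairs = []
    for scenario, fields in SCENARIOS.items():
        # Keep only the fields that actually hold values for this entity.
        facts = {f: entity[f] for f in fields if entity.get(f)}
        if not facts:
            continue
        answer = "; ".join(
            f"{name.replace('_', ' ')}: {', '.join(values)}"
            for name, values in facts.items()
        )
        pairs.append({
            "scenario": scenario,
            "question": f"What is relevant to {scenario} for {entity['name']}?",
            "answer": answer,
            "facts": facts,  # attribute-level ground truth kept for evaluation
        })
    return pairs


def attribute_recall(response: str, facts: Dict[str, List[str]]) -> Dict[str, float]:
    """Per attribute, the fraction of ground-truth values found verbatim in the response.

    Simple substring matching stands in for whatever matching the real metric uses.
    """
    text = response.lower()
    return {
        name: sum(value.lower() in text for value in values) / len(values)
        for name, values in facts.items()
        if values
    }


# Toy record, invented for this example.
record = {
    "name": "influenza",
    "symptoms": ["fever", "cough"],
    "diagnostic_tests": ["rapid antigen test"],
    "medications": ["oseltamivir"],
    "procedures": [],
}

pairs = build_qa_pairs(record)
ground_truth = {**pairs[0]["facts"], **pairs[1]["facts"]}
scores = attribute_recall(
    "Patients often present with fever; oseltamivir may be prescribed.",
    ground_truth,
)
print(scores)
```

Reporting one score per attribute, rather than a single overlap number such as BLEU, is what makes it possible to say that a model is strong on medications but weak on diagnostic tests.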
