Song Jie, Xu Zhichuan, He Mengqiao, Feng Jinhua, Shen Bairong
Department of Ophthalmology and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China.
School of Computer Science and Software Engineering, Southwest Petroleum University, Chengdu, China.
NPJ Digit Med. 2025 Aug 24;8(1):543. doi: 10.1038/s41746-025-01955-x.
Many rare genetic diseases have recognizable facial phenotypes that serve as diagnostic clues. Although Large Language Models (LLMs) have shown potential in healthcare, their application to rare genetic diseases is still hampered by hallucination and limited domain knowledge. Retrieval-Augmented Generation (RAG) is an effective way to address these challenges, and Knowledge Graphs (KGs) can supply it with more accurate and reliable information. In this paper, we constructed a Facial Phenotype Knowledge Graph (FPKG) comprising 6,143 nodes and 19,282 relations and combined it with RAG to alleviate LLM hallucination and improve the models' ability to answer questions about rare genetic diseases. We evaluated eight LLMs across four tasks: domain-specific QA, diagnostic tests, consistency evaluation, and temperature analysis. The results showed that our approach improves both diagnostic accuracy and response consistency. Notably, RAG reduces temperature-induced variability by 53.94%. This study demonstrates that LLMs can effectively incorporate domain-specific KGs to enhance accuracy and consistency, thereby improving diagnostic decision-making.
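To make the idea of KG-backed retrieval concrete, the following is a minimal sketch of how facts from a knowledge graph might be retrieved and placed in an LLM prompt. It is not the authors' implementation: the triples are toy examples that are not taken from FPKG, the relation name `has_facial_phenotype` and the functions `build_index`, `retrieve`, and `build_prompt` are hypothetical, and simple substring matching stands in for whatever retrieval method the paper actually uses.

```python
from collections import defaultdict

# Toy (head, relation, tail) triples. Illustrative only; NOT taken from FPKG.
TRIPLES = [
    ("Down syndrome", "has_facial_phenotype", "upslanted palpebral fissures"),
    ("Down syndrome", "has_facial_phenotype", "epicanthal folds"),
    ("Williams syndrome", "has_facial_phenotype", "full cheeks"),
    ("Williams syndrome", "has_facial_phenotype", "wide mouth"),
]


def build_index(triples):
    """Map each lowercased entity name (head or tail) to the triples it appears in."""
    index = defaultdict(list)
    for head, rel, tail in triples:
        index[head.lower()].append((head, rel, tail))
        index[tail.lower()].append((head, rel, tail))
    return index


def retrieve(question, index):
    """Return triples whose head or tail is mentioned verbatim in the question.

    Assumption: naive substring matching, used here only to keep the sketch short.
    """
    q = question.lower()
    seen, facts = set(), []
    for entity, triples in index.items():
        if entity in q:
            for triple in triples:
                if triple not in seen:
                    seen.add(triple)
                    facts.append(triple)
    return facts


def build_prompt(question, facts):
    """Prepend the retrieved facts to the question as grounding context."""
    lines = [f"- {h} {r.replace('_', ' ')} {t}" for h, r, t in facts]
    context = "\n".join(lines) if lines else "(no matching facts)"
    return (
        "Answer using only these knowledge graph facts:\n"
        f"{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    index = build_index(TRIPLES)
    question = "Which syndrome presents with epicanthal folds?"
    print(build_prompt(question, retrieve(question, index)))
```

The resulting prompt would then be sent to the LLM in place of the bare question, so that the answer can be grounded in retrieved graph facts rather than in the model's parametric memory alone.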