Hou Yu, Yeung Jeremy, Xu Hua, Su Chang, Wang Fei, Zhang Rui
Department of Surgery, University of Minnesota, Minneapolis, MN, USA.
Section of Biomedical Informatics and Data Science, Yale University, New Haven, Connecticut, USA.
medRxiv. 2023 Jun 12:2023.06.09.23291208. doi: 10.1101/2023.06.09.23291208.
Large Language Models (LLMs) have demonstrated exceptional performance in various natural language processing tasks, drawing on their language generation capabilities and their ability to acquire knowledge from unstructured text. However, when applied to the biomedical domain, LLMs encounter limitations and can produce erroneous and inconsistent answers. Knowledge Graphs (KGs) have emerged as valuable resources for representing and organizing structured information. In particular, Biomedical Knowledge Graphs (BKGs) have attracted significant interest for managing large-scale, heterogeneous biomedical knowledge. This study evaluates the capabilities of ChatGPT and existing BKGs in question answering, knowledge discovery, and reasoning. Results indicate that while ChatGPT with GPT-4.0 surpasses both GPT-3.5 and BKGs in providing existing information, BKGs demonstrate superior information reliability. Additionally, ChatGPT exhibits limitations in making novel discoveries and in reasoning, particularly in establishing structured links between entities, compared with BKGs. To overcome these limitations, future research should focus on integrating LLMs and BKGs to leverage their respective strengths. Such an integrated approach would optimize task performance and mitigate potential risks, thereby advancing knowledge in the biomedical field and contributing to overall well-being.