Lu Qiuhao, Li Rui, Wen Andrew, Wang Jinlian, Wang Liwei, Liu Hongfang
McWilliams School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, USA.
AMIA Annu Symp Proc. 2025 May 22;2024:748-757. eCollection 2024.
Large Language Models (LLMs) have revolutionized various sectors, including healthcare, where they are employed in diverse applications. Their utility is particularly significant in the context of rare diseases, where data scarcity, complexity, and specificity pose considerable challenges. In the clinical domain, Named Entity Recognition (NER) is an essential task that plays a crucial role in extracting relevant information from clinical texts. Despite the promise of LLMs, current research mostly concentrates on document-level NER, identifying entities in a more general context across entire documents without extracting their precise locations. Additionally, efforts have been directed towards adapting ChatGPT for token-level NER. However, a significant research gap remains in employing token-level NER for clinical texts, especially with local open-source LLMs. This study aims to bridge this gap by investigating the effectiveness of both proprietary and local LLMs in token-level clinical NER. Specifically, we examine the capabilities of these models through a series of experiments involving zero-shot prompting, few-shot prompting, retrieval-augmented generation (RAG), and instruction fine-tuning. Our experiments reveal the inherent challenges LLMs face in token-level NER, particularly in the context of rare diseases, and suggest possible improvements for their application in healthcare. This research helps narrow a significant gap in healthcare informatics and offers insights that could lead to a more refined application of LLMs in the healthcare sector.
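To illustrate the distinction the abstract draws between document-level and token-level NER, the sketch below shows one common zero-shot setup: a prompt asks the model to list entity mentions, and the mentions are then mapped back to character offsets in the source text, which is what token-level NER requires. This is a minimal illustration, not the paper's actual pipeline; the prompt wording, the JSON response format, and the `RareDisease` entity type are assumptions, and the model call is replaced by a mocked response.

```python
import json
import re

def build_zero_shot_prompt(text, entity_types):
    """Construct a zero-shot NER prompt asking the model to return
    entity mentions as a JSON list of {mention, type} objects.
    (Hypothetical prompt wording, for illustration only.)"""
    types = ", ".join(entity_types)
    return (
        f"Extract all entity mentions of the following types from the "
        f"clinical text: {types}.\n"
        "Return a JSON list of objects with keys 'mention' and 'type'.\n"
        f"Text: {text}\nJSON:"
    )

def parse_response_to_spans(text, response):
    """Map the model's JSON mentions back to character offsets in the
    source text -- the step that document-level NER omits and
    token-level NER requires."""
    spans = []
    for item in json.loads(response):
        for match in re.finditer(re.escape(item["mention"]), text):
            spans.append((match.start(), match.end(), item["type"]))
    return spans

# Mocked model response in place of a real LLM call:
text = "The patient was diagnosed with cystic fibrosis."
prompt = build_zero_shot_prompt(text, ["RareDisease"])
mock_response = '[{"mention": "cystic fibrosis", "type": "RareDisease"}]'
spans = parse_response_to_spans(text, mock_response)
# spans == [(31, 46, "RareDisease")]
```

Note that this naive string-matching step is itself a known failure mode: if the model paraphrases a mention or it appears multiple times with different labels, the offsets are wrong or ambiguous, which is one reason token-level NER is harder for LLMs than document-level extraction.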