Comparing a Large Language Model with Previous Deep Learning Models on Named Entity Recognition of Adverse Drug Events.

Author information

Public health and medical information unit, Saint Etienne University Hospital, France.

Laboratoire Inserm, SAINBIOSE, U1059, dysfonction vasculaire et hémostase, université Jean-Monnet, Saint-Étienne, France.

Publication information

Stud Health Technol Inform. 2024 Aug 22;316:781-785. doi: 10.3233/SHTI240528.

DOI:10.3233/SHTI240528

PMID:39176909

Abstract

Fine-tuning pre-trained deep learning models on a large training set for a downstream task significantly improves named entity recognition performance. Large language models are recent models based on the Transformer architecture that can be conditioned on a new task through in-context learning, by providing a series of instructions, or prompt. These models require only a few examples, an approach known as few-shot learning. Our objective was to compare the performance of named entity recognition of adverse drug events between state-of-the-art deep learning models fine-tuned on PubMed abstracts and a large language model using few-shot learning. Hussain et al.'s state-of-the-art model (PMID: 34422092) significantly outperformed the ChatGPT-3.5 model (F1-score: 97.6% vs 86.0%). Few-shot learning is a convenient way to perform named entity recognition when training examples are rare, but its performance remains inferior to that of a deep learning model fine-tuned with several training examples. Future work will evaluate few-shot prompting with GPT-4 and fine-tuning of GPT-3.5.
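The abstract does not reproduce the prompt or the scoring procedure, so the following is only a minimal Python sketch of how few-shot prompting for adverse drug event extraction and an F1-score could be set up. The demonstration sentences, the prompt wording, the helper names (`build_prompt`, `parse_answer`, `f1_score`), and the choice of exact-match, micro-averaged F1 are all assumptions made for illustration, not the authors' method; the call to the language model itself is deliberately left out.

```python
from typing import List, Set, Tuple

# Hypothetical demonstration pairs (sentence, adverse drug event mentions).
# These are illustrative only and are not taken from the study.
FEW_SHOT_EXAMPLES: List[Tuple[str, List[str]]] = [
    ("The patient developed a rash after starting amoxicillin.", ["rash"]),
    ("Severe nausea and headache were reported with tramadol.", ["nausea", "headache"]),
]


def build_prompt(sentence: str) -> str:
    """Assemble an instruction, the demonstrations, and the query sentence."""
    lines = [
        "Extract every adverse drug event mentioned in the sentence.",
        "Answer with a semicolon-separated list, or 'none'.",
        "",
    ]
    for text, entities in FEW_SHOT_EXAMPLES:
        lines.append(f"Sentence: {text}")
        lines.append(f"Adverse events: {'; '.join(entities)}")
        lines.append("")
    lines.append(f"Sentence: {sentence}")
    lines.append("Adverse events:")
    return "\n".join(lines)


def parse_answer(answer: str) -> Set[str]:
    """Turn a model reply such as 'Rash; Nausea' into a lowercase entity set."""
    cleaned = answer.strip().lower()
    if cleaned in ("", "none"):
        return set()
    return {part.strip() for part in cleaned.split(";") if part.strip()}


def f1_score(gold: List[Set[str]], predicted: List[Set[str]]) -> float:
    """Micro-averaged, exact-match F1 over per-sentence entity sets.

    Assumes gold entities are lowercased the same way as parsed predictions.
    """
    tp = sum(len(g & p) for g, p in zip(gold, predicted))
    fp = sum(len(p - g) for g, p in zip(gold, predicted))
    fn = sum(len(g - p) for g, p in zip(gold, predicted))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In use, the string returned by `build_prompt` would be sent to the chosen model, each reply passed through `parse_answer`, and the resulting sets compared with the annotated entities via `f1_score`. A token-level or partial-match metric would give different figures, so scores from this sketch are not directly comparable to those reported above.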
