Do LLMs Surpass Encoders for Biomedical NER?

Author information

Obeidat Motasem S, Al Nahian Md Sultan, Kavuluru Ramakanth

Affiliations

Department of Computer Science, University of Kentucky, Lexington, KY, USA.

Division of Biomedical Informatics, University of Kentucky, Lexington, KY, USA.

Publication information

Proc (IEEE Int Conf Healthc Inform). 2025 Jun;2025:352-358. doi: 10.1109/ICHI64645.2025.00048. Epub 2025 Jul 22.

Abstract

Recognizing spans of biomedical concepts and their types (e.g., drug or gene) in free text, often called biomedical named entity recognition (NER), is a basic component of information extraction (IE) pipelines. Without a strong NER component, downstream applications such as knowledge discovery and information retrieval are not practical. The state of the art in NER has shifted from traditional ML models to deep neural networks, with transformer-based encoder models (e.g., BERT) emerging as the current standard. However, decoder models (also called large language models or LLMs) are gaining traction in IE, although LLM-driven NER often discards positional information due to the generative nature of decoder models. Furthermore, LLMs are computationally very expensive, both in inference time and in hardware needs. Hence, it is worth exploring whether they actually excel at biomedical NER and assessing the associated trade-offs (performance vs. efficiency). This is exactly what we do in this effort, employing the same BIO entity tagging scheme (which retains positional information) on five different datasets with varying proportions of longer entities. Our results show that the chosen LLMs (Mistral and Llama, in the 8B-parameter range) often outperform the best encoder models (BERT-(un)cased, BiomedBERT, and DeBERTav3, in the 300M-parameter range) by 2-8% in F-score, except on one dataset where they equal encoder performance. This gain is more prominent among longer entities of length ≥ 3 tokens. However, LLMs are one to two orders of magnitude more expensive at inference time and may require cost-prohibitive hardware. Thus, when performance differences are small or real-time user feedback is needed, encoder models may still be more suitable than LLMs.
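To make the evaluation setup the abstract describes concrete, here is a minimal Python sketch of the BIO tagging scheme and of strict entity-level F1 computed over spans decoded from BIO tags. The example sentence, tag names, and helper functions are illustrative assumptions, not the paper's actual code, datasets, or exact matching criterion; strict boundary-plus-type matching is simply the common convention for entity-level scoring.

```python
# A minimal sketch of BIO tagging and strict entity-level F1.
# All names and the example sentence are illustrative assumptions;
# this is not the paper's actual evaluation code.

from typing import List, Set, Tuple

# In BIO tagging, every token gets B-<type> (begin), I-<type> (inside),
# or O (outside). The tag sequence retains positional information: the
# token indices of every entity span are recoverable from it.
tokens = ["Patients", "received", "tumor", "necrosis", "factor", "inhibitors", "."]
gold   = ["O", "O", "B-Drug", "I-Drug", "I-Drug", "I-Drug", "O"]
pred   = ["O", "O", "B-Drug", "I-Drug", "I-Drug", "O", "O"]  # truncated span

def decode_spans(tags: List[str]) -> Set[Tuple[int, int, str]]:
    """Decode (start, end_exclusive, type) entity spans from BIO tags."""
    spans, start, etype = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # "O" sentinel flushes the last span
        if tag == "O" or tag.startswith("B-") or (tag.startswith("I-") and etype != tag[2:]):
            if start is not None:
                spans.add((start, i, etype))
                start, etype = None, None
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
    return spans

def entity_f1(gold_tags: List[str], pred_tags: List[str]) -> float:
    """Strict entity-level F1: a span counts only on an exact boundary+type match."""
    g, p = decode_spans(gold_tags), decode_spans(pred_tags)
    tp = len(g & p)
    prec = tp / len(p) if p else 0.0
    rec = tp / len(g) if g else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

# The gold entity spans 4 tokens ("tumor necrosis factor inhibitors"); the
# prediction truncates it, so under strict matching F1 is 0.0 for this sentence.
print(decode_spans(gold))    # {(2, 6, 'Drug')}
print(entity_f1(gold, pred)) # 0.0

# A length-based breakdown like the abstract's ">= 3 tokens" bucket would
# just filter decoded spans by (end - start) before scoring:
long_gold = {s for s in decode_spans(gold) if s[1] - s[0] >= 3}
```

Because strict matching gives no credit for partially overlapping spans, boundary errors on multi-token mentions are fully penalized; this is one reason entities of length ≥ 3 tokens are harder to score on, which is where the abstract reports the LLMs gaining the most over encoders.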
