Zero-Shot LLMs for Named Entity Recognition: Targeting Cardiac Function Indicators in German Clinical Texts.

Affiliations

Institute of Medical Informatics, University of Münster, Münster, Germany.

Interdisciplinary Center for Clinical Research (IZKF), University of Münster, Münster, Germany.

Publication information

Stud Health Technol Inform. 2024 Aug 30;317:228-234. doi: 10.3233/SHTI240861.

Abstract

INTRODUCTION

Large Language Models (LLMs) like ChatGPT have become increasingly prevalent, and in medicine there are many areas where they may offer added value. Our research focuses on using open-source LLM alternatives such as Llama 3, Gemma, Mistral, and Mixtral to extract medical parameters from German clinical texts. We concentrate on German because of an observed research gap for non-English tasks.

OBJECTIVE

To evaluate the effectiveness of open-source LLMs in extracting medical parameters from German clinical texts, focusing specifically on cardiovascular function indicators from cardiac MRI reports.

METHODS

We extracted 14 cardiovascular function indicators, including left and right ventricular ejection fraction (LV-EF and RV-EF), from 497 differently worded cardiac magnetic resonance imaging (MRI) reports. Our systematic analysis assessed the performance of the Llama 3, Gemma, Mistral, and Mixtral models in terms of right annotation and named entity recognition (NER) accuracy.
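
The abstract does not give the prompts or the parsing logic used, so the following is only a minimal sketch of what a zero-shot extraction step of this kind might look like. The prompt wording, the JSON answer format, the helper names `build_prompt` and `parse_response`, and the sample report sentence are all assumptions made for illustration; only the indicator names LV-EF and RV-EF come from the text above. No model is called: the sketch covers building the instruction and parsing a reply.

```python
import json
import re

# Only LV-EF and RV-EF are named in the abstract; the other 12 indicators
# are not listed there, so this list is deliberately left incomplete.
INDICATORS = ["LV-EF", "RV-EF"]


def build_prompt(report: str, indicators=INDICATORS) -> str:
    """Build a zero-shot instruction (illustrative wording, not the authors')."""
    keys = ", ".join(f'"{name}"' for name in indicators)
    return (
        "Extract the following cardiac function indicators from the German "
        "cardiac MRI report below. Answer with a single JSON object with the "
        f"keys {keys}. Use null for any value that is not reported.\n\n"
        f"Report:\n{report}"
    )


def parse_response(text: str, indicators=INDICATORS) -> dict:
    """Take the JSON object out of a model reply and keep only the known keys."""
    empty = {name: None for name in indicators}
    # Models often wrap the JSON in extra prose, so search for the braces.
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        return empty
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return empty
    return {name: data.get(name) for name in indicators}


if __name__ == "__main__":
    # Made-up report sentence, used purely as an example input.
    report = "Die linksventrikuläre Ejektionsfraktion (LV-EF) beträgt 55 %."
    print(build_prompt(report))
    # Hypothetical reply from a model, hard-coded here instead of an API call.
    reply = 'Ergebnis:\n{"LV-EF": 55, "RV-EF": null}'
    print(parse_response(reply))
```

Asking for a fixed JSON shape with explicit nulls keeps missing values distinguishable from parsing failures at the prompt level, although in this sketch both end up as `None`; a real evaluation would probably want to track them separately.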

RESULTS

The analysis confirms strong performance across the different architectures, with up to 95.4% right annotation and 99.8% NER accuracy, even though these models were not explicitly fine-tuned for data extraction or for the German language.

CONCLUSION

The results strongly support the use of open-source LLMs for extracting medical parameters from clinical texts, including German ones, given their high accuracy and effectiveness even without task-specific fine-tuning.
