A scoping review of large language model based approaches for information extraction from radiology reports.

Authors

Reichenpfader Daniel, Müller Henning, Denecke Kerstin

Affiliations

Institute for Patient-Centered Digital Health, Bern University of Applied Sciences, Biel/Bienne, Switzerland.

Faculty of Medicine, University of Geneva, Geneva, Switzerland.

Publication information

NPJ Digit Med. 2024 Aug 24;7(1):222. doi: 10.1038/s41746-024-01219-0.

Abstract

Radiological imaging is a globally prevalent diagnostic method, yet the free text contained in radiology reports is not frequently used for secondary purposes. Natural Language Processing can provide structured data retrieved from these reports. This paper provides a summary of the current state of research on Large Language Model (LLM) based approaches for information extraction (IE) from radiology reports. We conduct a scoping review that follows the PRISMA-ScR guideline. Queries of five databases were conducted on August 1st 2023. Among the 34 studies that met inclusion criteria, only pre-transformer and encoder-based models are described. External validation shows a general performance decrease, although LLMs might improve generalizability of IE approaches. Reports related to CT and MRI examinations, as well as thoracic reports, prevail. Most common challenges reported are missing validation on external data and augmentation of the described methods. Different reporting granularities affect the comparability and transparency of approaches.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/55ae/11344824/37342734b75b/41746_2024_1219_Fig1_HTML.jpg
