

Extracting Clinical Information From Japanese Radiology Reports Using a 2-Stage Deep Learning Approach: Algorithm Development and Validation.

Authors

Sugimoto Kento, Wada Shoya, Konishi Shozo, Okada Katsuki, Manabe Shirou, Matsumura Yasushi, Takeda Toshihiro

Affiliations

Department of Medical Informatics, Graduate School of Medicine, Osaka University, Suita, Osaka, Japan.

Department of Transformative System for Medical Information, Graduate School of Medicine, Osaka University, Suita, Osaka, Japan.

Publication Information

JMIR Med Inform. 2023 Nov 14;11:e49041. doi: 10.2196/49041.

Abstract

BACKGROUND

Radiology reports are usually written in a free-text format, which makes them challenging to reuse.

OBJECTIVE

To enable secondary use, we developed a 2-stage deep learning system that extracts clinical information and converts it into a structured format.

METHODS

Our system mainly consists of 2 deep learning modules: entity extraction and relation extraction. State-of-the-art deep learning models were applied for each module. We trained and evaluated the models using 1040 in-house Japanese computed tomography (CT) reports annotated by medical experts. We also evaluated the performance of the entire pipeline of our system. In addition, we measured the ratio of annotated entities in the reports to validate how well our information model covers the clinical information they contain.
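The abstract gives no implementation details, but the 2-stage design can be illustrated with a minimal sketch: a first module tags entity spans in the report text, and a second module classifies relations between candidate entity pairs before the results are assembled into structured records. All names below (extract_entities, classify_relation, the entity labels, and the rule-based stand-ins for the trained models) are hypothetical placeholders, not the authors' code; in the paper, both stages use trained deep learning models, so only the pipeline shape is shown here.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple

@dataclass
class Entity:
    text: str
    label: str            # e.g. "Anatomy", "Finding", "Size" (illustrative labels)
    span: Tuple[int, int]  # character offsets in the report

# --- Stage 1: entity extraction (toy stand-in for a trained sequence labeler) ---
def extract_entities(report: str) -> List[Entity]:
    # A keyword lexicon is used here only to illustrate the interface;
    # the paper uses a deep learning NER model instead.
    lexicon = {"liver": "Anatomy", "nodule": "Finding", "12 mm": "Size"}
    entities = []
    for phrase, label in lexicon.items():
        start = report.find(phrase)
        if start != -1:
            entities.append(Entity(phrase, label, (start, start + len(phrase))))
    return entities

# --- Stage 2: relation extraction over candidate entity pairs ---
def classify_relation(head: Entity, tail: Entity) -> str:
    # Toy rules standing in for a trained relation classifier.
    if {head.label, tail.label} == {"Anatomy", "Finding"}:
        return "located_in"
    if {head.label, tail.label} == {"Finding", "Size"}:
        return "has_measurement"
    return "no_relation"

def structure_report(report: str) -> List[dict]:
    # Run both stages and assemble the structured output.
    entities = extract_entities(report)
    records = []
    for head, tail in combinations(entities, 2):
        relation = classify_relation(head, tail)
        if relation != "no_relation":
            records.append({"head": head.text, "relation": relation, "tail": tail.text})
    return records

if __name__ == "__main__":
    print(structure_report("A 12 mm nodule is seen in the liver."))
```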

RESULTS

The microaveraged F1-scores of our best-performing model for entity extraction and relation extraction were 96.1% and 97.4%, respectively. The microaveraged F1-score of the 2-stage system, which is a measure of the performance of the entire pipeline of our system, was 91.9%. Our system showed encouraging results for the conversion of free-text radiology reports into a structured format. The coverage of clinical information in the reports was 96.2% (6595/6853).
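For reference, a micro-averaged F1-score pools true positives, false positives, and false negatives across all entity (or relation) types before computing precision and recall, so frequent types weigh more heavily than rare ones. The sketch below shows the computation on made-up per-class counts; the numbers are illustrative and not taken from the paper.

```python
def micro_f1(counts):
    """counts: iterable of (true_positives, false_positives, false_negatives) per class."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative per-class counts (not from the paper).
print(micro_f1([(90, 5, 3), (40, 2, 4)]))
```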

CONCLUSIONS

Our 2-stage deep learning system can extract clinical information from chest and abdomen CT reports accurately and comprehensively.


Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b02/10686535/743abe0220dd/medinform-v11-e49041-g001.jpg
