Use of BERT (Bidirectional Encoder Representations from Transformers)-Based Deep Learning Method for Extracting Evidences in Chinese Radiology Reports: Development of a Computer-Aided Liver Cancer Diagnosis Framework.

Affiliations

School of Biomedical Engineering, Capital Medical University, Beijing, China.

Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, Beijing, China.

Publication information

J Med Internet Res. 2021 Jan 12;23(1):e19689. doi: 10.2196/19689.

DOI:10.2196/19689

PMID:33433395

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7837998/

Abstract

BACKGROUND

Liver cancer imposes a substantial disease burden in China. Dynamic contrast-enhanced computed tomography, one of the primary tools for detecting liver cancer, provides detailed diagnostic evidence that is recorded in free-text radiology reports.

OBJECTIVE

The aim of our study was to apply a deep learning model and a rule-based natural language processing (NLP) method to automatically identify evidence for liver cancer diagnosis.

METHODS

We proposed a pretrained, fine-tuned BERT (Bidirectional Encoder Representations from Transformers)-based BiLSTM-CRF (Bidirectional Long Short-Term Memory-Conditional Random Field) model to recognize phrases describing APHE (hyperintense enhancement in the arterial phase) and PDPH (hypointensity in the portal and delayed phases). To identify additional essential diagnostic evidence, we used traditional rule-based NLP methods to extract radiological features. APHE, PDPH, and the other extracted radiological features were then used to build a computer-aided liver cancer diagnosis framework with a random forest.
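To make the tagging step concrete, the following is a minimal sketch of how the output of a sequence tagger such as BERT-BiLSTM-CRF could be turned into phrase spans and then into binary features for a downstream classifier. It assumes character-level BIO tags named `B-APHE`, `I-APHE`, `B-PDPH`, and `I-PDPH`; the abstract does not state the tagging scheme, so the tag names, the helper functions `decode_bio` and `to_features`, and the sample sentence are all hypothetical.

```python
from typing import Dict, List, Tuple


def decode_bio(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (label, phrase) spans from character-level BIO tags.

    Assumed tag scheme (not confirmed by the abstract): B-X opens a span
    of type X, I-X continues it, anything else closes the open span.
    """
    spans: List[Tuple[str, str]] = []
    label, buf = None, []
    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            # A new span starts; flush any span that was still open.
            if label is not None:
                spans.append((label, "".join(buf)))
            label, buf = tag[2:], [ch]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the currently open span.
            buf.append(ch)
        else:
            # "O" or an I- tag that does not match the open span.
            if label is not None:
                spans.append((label, "".join(buf)))
            label, buf = None, []
    if label is not None:
        spans.append((label, "".join(buf)))
    return spans


def to_features(spans: List[Tuple[str, str]],
                labels: Tuple[str, ...] = ("APHE", "PDPH")) -> Dict[str, int]:
    """Map recognized spans to 0/1 presence features, one per label."""
    found = {label for label, _ in spans}
    return {label: int(label in found) for label in labels}


if __name__ == "__main__":
    # Made-up report fragment with hand-written tags, for illustration only.
    chars = list("肝右叶病灶动脉期明显强化")
    tags = ["O"] * 5 + ["B-APHE"] + ["I-APHE"] * 6
    spans = decode_bio(chars, tags)
    print(spans)
    print(to_features(spans))
```

In a pipeline like the one described, presence features of this kind would be combined with the rule-based radiological features and passed to a classifier such as a random forest; that step is not shown here.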

RESULTS

The BERT-BiLSTM-CRF model predicted APHE and PDPH phrases with F1 scores of 98.40% and 90.67%, respectively. The prediction model using the combined features performed better (F1 score, 88.55%) than the models using only APHE and PDPH (84.88%) or only the other extracted radiological features (83.52%). APHE and PDPH were the two most important features for liver cancer diagnosis.
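For readers unfamiliar with the metric, here is a minimal sketch of how an F1 score can be computed for phrase recognition. It assumes exact-match scoring over sets of labeled spans, which the abstract does not confirm; the function `span_f1` and the span tuples below are hypothetical illustrations, not the study's data.

```python
from typing import Set, Tuple

# A span is (report_id, start, end, label); this layout is an assumption.
Span = Tuple[int, int, int, str]


def span_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """F1 = 2PR / (P + R), counting a prediction as correct only if it
    matches a gold span exactly (same report, boundaries, and label)."""
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = {(0, 5, 12, "APHE"), (1, 3, 9, "PDPH"), (2, 0, 7, "APHE")}
    # The second prediction has a wrong end boundary, so it does not count.
    pred = {(0, 5, 12, "APHE"), (1, 3, 8, "PDPH"), (2, 0, 7, "APHE")}
    print(round(span_f1(gold, pred), 4))
```

Under exact matching, a span with a boundary error counts as both a false positive and a false negative, which is why the example yields a precision and recall of 2/3 each.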

CONCLUSIONS

This work was a comprehensive NLP study in which we identified evidence for the diagnosis of liver cancer from Chinese radiology reports, considering both clinical knowledge and radiology findings. The BERT-based deep learning method for extracting diagnostic evidence achieved state-of-the-art performance. This high performance demonstrates the feasibility of the BERT-BiLSTM-CRF model for information extraction from Chinese radiology reports. The findings of our study suggest that the deep learning-based method for automatically identifying diagnostic evidence can be extended to other types of Chinese clinical texts.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6797/7837998/2ce78d1f445b/jmir_v23i1e19689_fig1.jpg
