School of Biomedical Engineering, Capital Medical University, Beijing, China.
Beijing Key Laboratory of Fundamental Research on Biomechanics in Clinical Application, Capital Medical University, Beijing, China.
J Med Internet Res. 2021 Jan 12;23(1):e19689. doi: 10.2196/19689.
Liver cancer imposes a substantial disease burden in China. Dynamic contrast-enhanced computed tomography, one of the primary tools for detecting liver cancer, provides detailed diagnostic evidence that is recorded in free-text radiology reports.
The aim of our study was to apply a deep learning model and a rule-based natural language processing (NLP) method to automatically identify evidence for liver cancer diagnosis.
We proposed a pretrained, fine-tuned BERT (Bidirectional Encoder Representations from Transformers)-based BiLSTM-CRF (Bidirectional Long Short-Term Memory-Conditional Random Field) model to recognize phrases describing APHE (hyperintense enhancement in the arterial phase) and PDPH (hypointensity in the portal and delayed phases). To identify additional essential diagnostic evidence, we used traditional rule-based NLP methods to extract radiological features. APHE, PDPH, and the other extracted radiological features were then used to build a computer-aided liver cancer diagnosis framework with a random forest.
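As an illustration of how sequence-labeling output of this kind is typically turned into diagnostic features, the following is a minimal sketch, not the authors' implementation. It assumes the tagger emits BIO labels (`B-APHE`, `I-APHE`, `B-PDPH`, `I-PDPH`, `O`); the function names `decode_bio` and `to_features`, the English example tokens, and the binary feature encoding are all hypothetical.

```python
from typing import Dict, List, Tuple


def decode_bio(tokens: List[str], tags: List[str], sep: str = " ") -> List[Tuple[str, str]]:
    """Collect (label, phrase) pairs from BIO tags (assumed tagging scheme)."""
    spans: List[Tuple[str, str]] = []
    label, buf = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new phrase starts; close any phrase that is still open.
            if label is not None:
                spans.append((label, sep.join(buf)))
            label, buf = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the currently open phrase.
            buf.append(token)
        else:
            # "O" or an inconsistent I- tag ends the open phrase.
            if label is not None:
                spans.append((label, sep.join(buf)))
            label, buf = None, []
    if label is not None:
        spans.append((label, sep.join(buf)))
    return spans


def to_features(spans: List[Tuple[str, str]], rule_features: Dict[str, int]) -> Dict[str, int]:
    """Merge presence flags for APHE/PDPH with rule-based features (hypothetical encoding)."""
    labels = {label for label, _ in spans}
    features = {"APHE": int("APHE" in labels), "PDPH": int("PDPH" in labels)}
    features.update(rule_features)
    return features


tokens = ["lesion", "shows", "arterial", "phase", "enhancement", ",",
          "washout", "in", "portal", "phase"]
tags = ["O", "O", "B-APHE", "I-APHE", "I-APHE", "O",
        "B-PDPH", "I-PDPH", "I-PDPH", "I-PDPH"]
spans = decode_bio(tokens, tags)
features = to_features(spans, {"cirrhosis": 1})  # "cirrhosis" is an invented rule-based feature
```

A feature dictionary like this could then be vectorized and passed to a random forest classifier; for character-level Chinese text, `sep=""` would join the characters without spaces.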
The BERT-BiLSTM-CRF model predicted the phrases of APHE and PDPH with F1 scores of 98.40% and 90.67%, respectively. The prediction model using the combined features performed better (F1 score, 88.55%) than the models using only APHE and PDPH (84.88%) or only the other extracted radiological features (83.52%). APHE and PDPH were the two most important features for liver cancer diagnosis.
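For reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for phrase recognition under an exact-match criterion; whether the study scored phrases by exact match is an assumption, and the function name `span_f1` and the example spans are hypothetical.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start, end, label), an assumed span representation


def span_f1(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), counting only exact span matches as correct."""
    true_pos = len(gold & pred)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example: one of two predicted spans matches the gold annotation exactly.
gold = {(2, 5, "APHE"), (6, 10, "PDPH")}
pred = {(2, 5, "APHE"), (6, 9, "PDPH")}
precision, recall, f1 = span_f1(gold, pred)
```

Here the PDPH prediction has the wrong end boundary, so it counts as both a false positive and a false negative under exact matching.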
This work was a comprehensive NLP study in which we identified evidence for the diagnosis of liver cancer from Chinese radiology reports, considering both clinical knowledge and radiology findings. The BERT-based deep learning method for extracting diagnostic evidence achieved state-of-the-art performance. This high performance demonstrates the feasibility of the BERT-BiLSTM-CRF model for information extraction from Chinese radiology reports. Our findings suggest that the deep learning-based method for automatically identifying diagnostic evidence can be extended to other types of Chinese clinical texts.