Li Zhaoning, Ren Jiangtao
School of Data and Computer Science, Guangdong Province Key Lab of Computational Science, Sun Yat-Sen University, Guangzhou, Guangdong 510006, PR China.
J Biomed Inform. 2020 Aug;108:103492. doi: 10.1016/j.jbi.2020.103492. Epub 2020 Jul 6.
Chest imaging reports describe the results of chest radiography procedures. Automatic extraction of abnormal imaging signs from these reports plays a pivotal role in clinical research and in a wide range of downstream medical tasks. However, few studies address information extraction from Chinese chest imaging reports. In this paper, we formulate chest abnormal imaging sign extraction as a sequence tagging and matching problem. On this basis, we propose a transferred abnormal imaging signs extractor with pretrained ERNIE as the backbone, named EASON (fine-tuning ERNIE with CRF for Abnormal Signs ExtractiON), which addresses the problem of insufficient data. In addition, to assign attributes (body part and degree) to the corresponding abnormal imaging signs in the output of the sequence tagging model, we design a simple but effective tag2relation algorithm based on the nature of chest imaging report text. We evaluate our method on a corpus provided by a medical big data company, and the experimental results show that it achieves significant and consistent improvements over the baselines.
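The abstract does not spell out the rules of the tag2relation step, so the following is only a minimal sketch of how attributes might be matched to signs once a sequence tagger has produced BIO tags. Everything here is an assumption made for illustration: the label names (BODY, DEGREE, SIGN), the helper names decode_bio and tag2relation, the English toy tokens, and the heuristic itself (a sign inherits the most recent body part in its sentence, and a degree applies only to the next sign). It should not be read as the authors' algorithm.

```python
from typing import Dict, List, Optional, Tuple

Span = Tuple[str, str]  # (label, text)

# Assumed sentence-ending tokens; a comma does not reset the context.
SENTENCE_END = {".", ";", "。", "；"}


def decode_bio(tokens: List[str], tags: List[str], joiner: str = " ") -> List[Span]:
    """Collapse BIO tags into (label, text) spans in reading order.

    Sentence-ending tokens tagged "O" are kept as ("SEP", token) markers.
    Use joiner="" for character-level Chinese tokens.
    """
    spans: List[Span] = []
    label: Optional[str] = None
    buf: List[str] = []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-") or (tag.startswith("I-") and tag[2:] != label):
            # A new span starts here (a stray I- tag is treated like B-).
            if label is not None:
                spans.append((label, joiner.join(buf)))
            label, buf = tag[2:], [tok]
        elif tag.startswith("I-"):
            buf.append(tok)  # continue the current span
        else:
            # "O" tag: close any open span, then record a sentence break.
            if label is not None:
                spans.append((label, joiner.join(buf)))
            label, buf = None, []
            if tok in SENTENCE_END:
                spans.append(("SEP", tok))
    if label is not None:
        spans.append((label, joiner.join(buf)))
    return spans


def tag2relation(spans: List[Span]) -> List[Dict[str, Optional[str]]]:
    """Hypothetical matching heuristic, not the published algorithm."""
    records: List[Dict[str, Optional[str]]] = []
    body: Optional[str] = None
    degree: Optional[str] = None
    for label, text in spans:
        if label == "SEP":
            body, degree = None, None  # context does not cross sentences
        elif label == "BODY":
            body = text
        elif label == "DEGREE":
            degree = text
        elif label == "SIGN":
            records.append({"sign": text, "body_part": body, "degree": degree})
            degree = None  # a degree modifies only the next sign
    return records


# Toy input standing in for the output of a tagger such as ERNIE + CRF.
tokens = ["right", "lung", "slightly", "increased", "markings", ",",
          "patchy", "shadow", ".", "heart", "enlarged", "."]
tags = ["B-BODY", "I-BODY", "B-DEGREE", "B-SIGN", "I-SIGN", "O",
        "B-SIGN", "I-SIGN", "O", "B-BODY", "B-SIGN", "O"]
records = tag2relation(decode_bio(tokens, tags))
for record in records:
    print(record)
```

In this toy input, "patchy shadow" inherits the body part "right lung" from earlier in the same sentence but not the degree "slightly", while "enlarged" is attached to "heart" after the sentence break. A real report corpus would likely need rules tuned to its own phrasing conventions.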