Comprehensive Word-Level Classification of Screening Mammography Reports Using a Neural Network Sequence Labeling Approach.

Affiliations

Department of Radiology, Duke University Medical Center, 2301 Erwin Road, Box 3808, Durham, NC, 27710, USA.

Scanslated, Inc., Durham, NC, USA.

Publication information

J Digit Imaging. 2019 Oct;32(5):685-692. doi: 10.1007/s10278-018-0141-4.

Abstract

Radiology reports contain a large amount of potentially valuable unstructured data. Recently, neural networks have been employed to classify radiology reports over a few classes at the document level. The success of neural networks in sequence-labeling problems such as named entity recognition and part-of-speech tagging suggests that they could be used to classify radiology report text with greater granularity. We employed a neural network architecture to comprehensively classify mammography report text at the word level using a sequence-labeling approach. Two radiologists devised a comprehensive classification system for screening mammography reports. Each word in each report was manually categorized by a radiologist into one of 33 categories according to the classification system. Tagged words referencing the same finding were grouped into unique sets. We pre-labeled reports with a rule-based algorithm and then manually edited these annotations for 6705 screening mammography reports (25.1%, 66.8%, and 8.1% BI-RADS 0, 1, and 2, respectively). A combined convolutional and recurrent neural network model was used to label words in each sentence of the individual reports. A siamese recurrent neural network was then used to group findings into sets. Performance of the neural network-based method was compared to a rule-based algorithm and a conditional random field (CRF) model. Global accuracy (percentage of documents where all word tags were predicted correctly) and keyword accuracy (percentage of all words that were labeled correctly, excluding words tagged as unimportant) were calculated on an unseen test set of 519 reports. Two-tailed t tests were used to assess differences between algorithm performance, and p < 0.05 was used to determine statistical significance. The neural network-based approach showed significantly higher global accuracy compared to both the rule-based algorithm (88.3% vs. 57.0%, p < 0.001) and the CRF model (88.3% vs. 75.8%, p < 0.001). The neural network also showed significantly higher keyword accuracy compared to the rule-based algorithm (95.5% vs. 80.9%, p < 0.001) and the CRF model (95.5% vs. 76.9%, p < 0.001). We demonstrate the potential of neural networks to accurately perform word-level multilabel classification of free-text radiology reports across 33 classes, thus showing the utility of a sequence-labeling approach to NLP of radiology reports. We found that a neural network classifier outperforms a rule-based algorithm and a CRF classifier for comprehensive multilabel classification of free-text screening mammography reports at the word level. By approaching radiology report classification as a sequence-labeling problem, we demonstrate the ability of neural networks to extract data from free-text radiology reports at a level of granularity not previously reported.
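
The two evaluation metrics defined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The tag names (`"O"`, `"FINDING"`, `"LOCATION"`, `"ASSESSMENT"`) and the function names are hypothetical, and the sketch assumes that "words tagged as unimportant" refers to the reference (gold) annotation rather than the prediction, which the abstract does not specify.

```python
from typing import Sequence

UNIMPORTANT = "O"  # hypothetical name for the "unimportant" tag


def global_accuracy(gold: Sequence[Sequence[str]],
                    pred: Sequence[Sequence[str]]) -> float:
    """Fraction of documents in which every word tag was predicted correctly."""
    exact = sum(1 for g, p in zip(gold, pred) if list(g) == list(p))
    return exact / len(gold)


def keyword_accuracy(gold: Sequence[Sequence[str]],
                     pred: Sequence[Sequence[str]],
                     ignore: str = UNIMPORTANT) -> float:
    """Fraction of words tagged correctly, skipping words whose gold tag is `ignore`."""
    correct = 0
    total = 0
    for g_doc, p_doc in zip(gold, pred):
        for g, p in zip(g_doc, p_doc):
            if g == ignore:
                continue  # assumption: exclusion is based on the gold tag
            total += 1
            correct += int(g == p)
    return correct / total if total else 0.0


# Toy example: two "reports" of three words each (made-up tags, not study data).
gold = [["O", "FINDING", "LOCATION"], ["O", "O", "ASSESSMENT"]]
pred = [["O", "FINDING", "LOCATION"], ["O", "O", "O"]]

print(global_accuracy(gold, pred))   # one of two documents is fully correct
print(keyword_accuracy(gold, pred))  # two of three keyword tags are correct
```

Global accuracy is the stricter measure because a single wrong tag fails the whole document, which is consistent with the reported global accuracies being lower than the keyword accuracies for the neural network and the rule-based algorithm.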
