COVID-19 detection in radiological text reports integrating entity recognition.

Affiliations

SINAI Group, CEATIC, Universidad de Jaén, Campus Las Lagunillas S/N, E-23071, Jaén, Spain.

MRI Unit, Radiology Department, HT Médica, Carmelo Torres 2, 23007, Jaén, Spain.

Publication information

Comput Biol Med. 2020 Dec;127:104066. doi: 10.1016/j.compbiomed.2020.104066. Epub 2020 Oct 22.

Abstract

COVID-19 diagnosis is usually based on a PCR test, complemented by radiological images, mainly chest computed tomography (CT), to assess lung involvement by COVID-19. However, textual radiological reports also contain relevant information for determining the likelihood that a patient presents radiological signs of COVID-19 involving the lungs. Automatic COVID-19 detection systems based on Natural Language Processing (NLP) techniques could help clinicians by detecting COVID-19 related disorders within radiological reports. In this paper we propose a text classification system based on the integration of different information sources. The system can be used to automatically predict whether or not a patient has radiological findings consistent with COVID-19 on the basis of radiological reports of chest CT. To carry out our experiments we use 295 radiological reports from chest CT studies provided by the "HT Médica" clinic. All of them are radiological requests with suspected chest involvement by COVID-19. To train our text classification system we apply Machine Learning approaches and Named Entity Recognition. The system takes two sources of information as input: the text of the radiological report and COVID-19 related disorders extracted from SNOMED-CT. The best system is trained using an SVM, and the baseline achieves 85% accuracy in predicting lung involvement by COVID-19, already a competitive result that is difficult to improve on. Moreover, we apply mutual information in order to integrate only the best quality information extracted from SNOMED-CT. In this way, we achieve around 90% accuracy, improving the baseline results by 5 percentage points.
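The abstract does not say how mutual information is computed or applied, so the following is only a minimal sketch of one common reading: score each extracted entity by the mutual information between its presence in a report and the report's label, then keep the top-scoring entities as extra features. The entity names, the toy reports, the labels, and the helper names `mutual_information` and `rank_entities` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    mi = 0.0
    for (f, y), count in joint.items():
        p_joint = count / n
        # p(f, y) / (p(f) * p(y)) simplifies to count * n / (count_f * count_y)
        ratio = p_joint * n * n / (feature_counts[f] * label_counts[y])
        mi += p_joint * math.log2(ratio)
    return mi


def rank_entities(reports, labels):
    """Rank entities by the MI between their presence in a report and the label."""
    vocabulary = sorted(set().union(*reports))
    scores = {
        entity: mutual_information([entity in report for report in reports], labels)
        for entity in vocabulary
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: each report is reduced to the set of entities found in it.
reports = [
    {"ground-glass opacity", "pneumonia"},
    {"ground-glass opacity", "pleural effusion"},
    {"nodule", "pleural effusion"},
    {"nodule"},
]
labels = [1, 1, 0, 0]  # 1 = findings consistent with COVID-19 (assumed labelling)

ranking = rank_entities(reports, labels)
top_entities = [entity for entity, _ in ranking[:2]]
```

In this toy example the entities that perfectly separate the two classes receive the highest score, while an entity that appears equally often in both classes scores zero and would be dropped. The selected entities could then be appended to the report text before training a classifier such as an SVM.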
