Automatic Diagnosis of Spinal Disorders on Radiographic Images: Leveraging Existing Unstructured Datasets With Natural Language Processing

Authors

Galbusera Fabio, Cina Andrea, Bassani Tito, Panico Matteo, Sconfienza Luca Maria

Affiliations

IRCCS Istituto Ortopedico Galeazzi, Milan, Italy.

Department of Chemistry, Materials and Chemical Engineering "Giulio Natta," Politecnico di Milano, Milan, Italy.

Publication information

Global Spine J. 2023 Jun;13(5):1257-1266. doi: 10.1177/21925682211026910. Epub 2021 Jul 5.

DOI:10.1177/21925682211026910

PMID:34219477

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10416592/

Abstract

STUDY DESIGN

Retrospective study.

OBJECTIVES

Radiology departments generate huge amounts of images and medical reports. While these datasets could be used to train artificial intelligence tools to detect findings on radiological images, the unstructured nature of the reports limits the accessibility of the information they contain. In this study, we tested whether natural language processing (NLP) can be used to generate training data for deep learning models that analyze planar radiographs of the lumbar spine.

METHODS

NLP classifiers based on the Bidirectional Encoder Representations from Transformers (BERT) model, able to extract structured information from radiological reports, were developed and used to generate annotations for a large set of radiographic images of the lumbar spine (N = 10 287). Deep learning (ResNet-18) models aimed at detecting radiological findings directly from the images were then trained on these NLP-generated annotations and tested on a set of 204 human-annotated images.
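The data flow described above (report text in, per-finding labels out, labels attached to the matching image) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Study` record, the `weak_label` helper, the finding names, and the keyword rule standing in for the BERT classifiers are all hypothetical and chosen only to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical finding names; the abstract does not list the 12 findings.
FINDINGS = ["finding_a", "finding_b"]


@dataclass
class Study:
    """One radiographic study: an image and its free-text report (assumed layout)."""
    image_path: str
    report_text: str


def keyword_classifier(keyword: str) -> Callable[[str], bool]:
    """Stand-in for a trained report classifier.

    The study used BERT-based classifiers; a keyword match is used here
    only so that the sketch runs without any model weights.
    """
    def classify(text: str) -> bool:
        return keyword in text.lower()
    return classify


def weak_label(
    studies: List[Study],
    classifiers: Dict[str, Callable[[str], bool]],
) -> List[Tuple[str, Dict[str, int]]]:
    """Attach one binary label per finding to each image, using only the report text."""
    labelled = []
    for study in studies:
        labels = {name: int(clf(study.report_text)) for name, clf in classifiers.items()}
        labelled.append((study.image_path, labels))
    return labelled


# Two made-up studies; the resulting pairs would feed an image model's training loop.
studies = [
    Study("img_001.png", "Finding_A is present."),
    Study("img_002.png", "No abnormality."),
]
classifiers = {name: keyword_classifier(name) for name in FINDINGS}
training_set = weak_label(studies, classifiers)
```

In a setup like this, the image model never sees the report; it only receives the image path and the labels derived from the report, while evaluation uses a separate, human-annotated set.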

RESULTS

The NLP models had accuracies between 0.88 and 0.98 and specificities between 0.84 and 0.99; 7 out of 12 radiological findings had sensitivity >0.90. The ResNet-18 models showed performances dependent on the specific radiological findings with sensitivities and specificities between 0.53 and 0.93.
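For readers less familiar with the metrics reported above, the following sketch shows how sensitivity, specificity, and accuracy are computed for a single binary finding from reference labels and model predictions. The function name `binary_metrics` and the example labels are illustrative assumptions, not data from the study.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute sensitivity, specificity, and accuracy for 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # finding present, detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # finding present, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # finding absent, not reported
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # finding absent, falsely reported

    def ratio(num: int, den: int) -> float:
        # Return NaN when a class is absent, rather than raising ZeroDivisionError.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "accuracy": ratio(tp + tn, len(pairs)),
    }


# Made-up labels for one finding: 4 positives and 6 negatives in the reference.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

With these example labels, 3 of 4 positives are detected and 5 of 6 negatives are correctly rejected, so sensitivity is 0.75, specificity is about 0.83, and accuracy is 0.8.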

CONCLUSIONS

NLP generates valuable data to train deep learning models able to detect radiological findings in spine images. Despite the noisy nature of reports and NLP predictions, this approach effectively mitigates the difficulties associated with the manual annotation of large quantities of data and opens the way to the era of artificial intelligence in musculoskeletal radiology.

