Department of Diagnostic Imaging, National University Hospital, Singapore.
Saw Swee Hock School of Public Health, Institute of Data Science, Yong Loo Lin School of Medicine, National University Health System, National University of Singapore, Singapore.
Acad Radiol. 2022 Sep;29(9):1350-1358. doi: 10.1016/j.acra.2021.09.013. Epub 2021 Oct 12.
RATIONALE AND OBJECTIVES: To compare the performance of pneumothorax deep learning detection models trained with radiologist versus natural language processing (NLP) labels on the NIH ChestX-ray14 dataset.

MATERIALS AND METHODS: The ChestX-ray14 dataset consisted of 112,120 frontal chest radiographs with 5,302 positive and 106,818 negative labels for pneumothorax derived by NLP (dataset A). All 112,120 radiographs were also inspected by 4 radiologists, yielding a visually confirmed set of 5,138 positive and 104,751 negative labels for pneumothorax (dataset B). Datasets A and B were used independently to train 3 convolutional neural network (CNN) architectures (ResNet-50, DenseNet-121, and EfficientNetB3). The area under the receiver operating characteristic curve (AUC) of each model was evaluated on the official NIH test set and on an external test set of 525 chest radiographs from our emergency department.

RESULTS: AUCs on the NIH internal test set were significantly higher for CNN models trained with radiologist versus NLP labels across all architectures. AUCs for the NLP/radiologist-label models were 0.838 (95% CI: 0.830, 0.846)/0.881 (95% CI: 0.873, 0.887) for ResNet-50 (p = 0.034), 0.839 (95% CI: 0.831, 0.847)/0.880 (95% CI: 0.873, 0.887) for DenseNet-121, and 0.869 (95% CI: 0.863, 0.876)/0.943 (95% CI: 0.939, 0.946) for EfficientNetB3 (p ≤ 0.001). Evaluation with the external test set also showed higher AUCs (p < 0.001) for the CNN models trained with radiologist versus NLP labels across all architectures. AUCs for the NLP/radiologist-label models were 0.686 (95% CI: 0.632, 0.740)/0.806 (95% CI: 0.758, 0.854) for ResNet-50, 0.736 (95% CI: 0.686, 0.787)/0.871 (95% CI: 0.830, 0.912) for DenseNet-121, and 0.822 (95% CI: 0.775, 0.868)/0.915 (95% CI: 0.882, 0.948) for EfficientNetB3.

CONCLUSION: We demonstrated improved performance and generalizability of pneumothorax detection deep learning models trained with radiologist labels compared to models trained with NLP labels.
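The evaluation above reports each model's AUC with a 95% confidence interval. The abstract does not state which CI method the authors used; a minimal sketch of one common approach, a percentile bootstrap over test-set resamples with scikit-learn's `roc_auc_score`, is shown below. The function name `auc_with_ci` and the synthetic labels/scores are illustrative assumptions, not the paper's code.

```python
# Hedged sketch: point AUC plus a percentile-bootstrap 95% CI.
# auc_with_ci and the synthetic data below are assumptions for illustration.
import numpy as np
from sklearn.metrics import roc_auc_score

def auc_with_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Return (AUC, CI lower bound, CI upper bound) via percentile bootstrap."""
    rng = np.random.default_rng(seed)
    auc = roc_auc_score(y_true, y_score)
    n = len(y_true)
    boot = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # resample cases with replacement
        if len(np.unique(y_true[idx])) < 2:  # AUC needs both classes present
            continue
        boot.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return auc, lo, hi

# Toy usage with synthetic labels and informative scores
rng = np.random.default_rng(42)
y = rng.integers(0, 2, 500)
s = y * 0.5 + rng.normal(0, 0.5, 500)
auc, lo, hi = auc_with_ci(y, s)
```

Comparing two correlated AUCs on the same test set (as done here between NLP- and radiologist-label models) is typically handled with a paired test such as DeLong's; the bootstrap sketch covers only the single-model interval.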