Exploring Large-scale Public Medical Image Datasets.

Affiliations

Australian Institute for Machine Learning, North Terrace, Adelaide, Australia; School of Public Health, University of Adelaide, North Terrace, Adelaide 5000, Australia; Royal Adelaide Hospital, North Terrace, Adelaide, Australia.

Publication information

Acad Radiol. 2020 Jan;27(1):106-112. doi: 10.1016/j.acra.2019.10.006. Epub 2019 Nov 6.

DOI:10.1016/j.acra.2019.10.006

PMID:31706792

Abstract

RATIONALE AND OBJECTIVES

Medical artificial intelligence systems depend on well-characterized large-scale datasets. Recently released public datasets have been of great interest to the field, but they pose specific challenges because they disconnect data generation from data usage, which may limit their utility.

MATERIALS AND METHODS

We visually explore two large public datasets to determine how accurate the provided labels are and whether other subtle problems exist. The ChestXray14 dataset contains 112,120 frontal chest films, and the Musculoskeletal Radiology (MURA) dataset contains 40,561 upper limb radiographs. A subset of around 700 images from both datasets was reviewed by a board-certified radiologist, and the quality of the original labels was determined.

RESULTS

The ChestXray14 labels did not accurately reflect the visual content of the images, with positive predictive values mostly between 10% and 30% lower than the values presented in the original documentation. There were other significant problems, with examples of hidden stratification and label disambiguation failure. The MURA labels were more accurate, but the original normal/abnormal labels were inaccurate for the subset of cases with degenerative joint disease, with a sensitivity of 60% and a specificity of 82%.
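For readers less familiar with the metrics reported above, the following is a minimal sketch of how positive predictive value, sensitivity, and specificity can be computed when the original dataset labels are compared against a radiologist's review. The `LabelAudit` class and all counts are hypothetical and chosen only to keep the arithmetic easy to follow; they are not the counts from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelAudit:
    """Hypothetical tally of original labels versus expert visual review."""

    tp: int  # label positive, reviewer confirms the finding
    fp: int  # label positive, reviewer does not see the finding
    fn: int  # label negative, reviewer sees the finding
    tn: int  # label negative, reviewer confirms it is absent

    def ppv(self) -> float:
        # Of the images labeled positive, the fraction the reviewer confirms.
        return self.tp / (self.tp + self.fp)

    def sensitivity(self) -> float:
        # Of the images the reviewer calls positive, the fraction labeled positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Of the images the reviewer calls negative, the fraction labeled negative.
        return self.tn / (self.tn + self.fp)


# Invented counts for illustration only (not taken from the paper).
audit = LabelAudit(tp=45, fp=15, fn=5, tn=35)

print(f"PPV:         {audit.ppv():.2f}")
print(f"Sensitivity: {audit.sensitivity():.2f}")
print(f"Specificity: {audit.specificity():.2f}")
```

In this framing the radiologist's reading is treated as the reference standard and the dataset label as the test under evaluation, which is one reasonable reading of the abstract but an assumption nonetheless.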

CONCLUSION

Visual inspection of images is a necessary component of understanding large image datasets. We recommend that teams producing public datasets perform this important quality control procedure and include a thorough description of their findings, along with an explanation of the data generating procedures and labeling rules, in the documentation for their datasets.
