Independent evaluation of the accuracy of 5 artificial intelligence software for detecting lung nodules on chest X-rays.

Authors

Arzamasov Kirill, Vasilev Yuriy, Zelenova Maria, Pestrenin Lev, Busygina Yulia, Bobrovskaya Tatiana, Chetverikov Sergey, Shikhmuradov David, Pankratov Andrey, Kirpichev Yury, Sinitsyn Valentin, Son Irina, Omelyanskaya Olga

Affiliations

State Budget-Funded Health Care Institution of the City of Moscow "Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department", Moscow, Russian Federation.

MIREA - Russian Technological University, Moscow, Russian Federation.

Publication information

Quant Imaging Med Surg. 2024 Aug 1;14(8):5288-5303. doi: 10.21037/qims-24-160. Epub 2024 Jul 25.

DOI:10.21037/qims-24-160
PMID:39144030
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11320553/
Abstract

BACKGROUND

The integration of artificial intelligence (AI) into medicine is growing, with some experts predicting its standalone use soon. However, skepticism remains due to limited positive outcomes from independent validations. This research evaluates AI software's effectiveness in analyzing chest X-rays (CXR) to identify lung nodules, a possible lung cancer indicator.

METHODS

This retrospective study analyzed 7,670,212 record pairs from radiological exams conducted between 2020 and 2022 during the Moscow Computer Vision Experiment, focusing on CXR and computed tomography (CT) scans. All images were acquired during clinical routine. The final dataset comprised 100 CXR images (50 with lung nodules, 50 without), selected consecutively according to inclusion and exclusion criteria. It was used to evaluate the performance of all five AI-based solutions that participated in the Moscow Computer Vision Experiment and analyzed CXR. The evaluation was performed in three stages. In the first stage, the nodule probability returned by each AI service was compared with the Ground Truth (1 = nodule present, 0 = no nodule). In the second stage, three radiologists evaluated the nodule segmentation performed by the AI services (1 = nodule correctly segmented, 0 = nodule incorrectly segmented or not segmented at all). In the third stage, the same radiologists additionally evaluated the classification of the nodules (1 = nodule correctly segmented and classified, 0 = all other cases). The results obtained in stages 2 and 3 were compared with the Ground Truth, which was common to all three stages. For each stage, diagnostic accuracy metrics were calculated for each AI service.
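The stage 1 comparison of AI-reported nodule probabilities against a binary Ground Truth can be illustrated with a minimal Python sketch. The labels, scores, and the 0.5 threshold below are hypothetical and are not taken from the study, and the function names are illustrative only. AUC is computed here through the rank-based (Mann-Whitney) formulation, which is one standard way to obtain it; the abstract does not state which implementation the authors used.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= threshold is called 'nodule'."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1 and predicted == 0:
            fn += 1
        elif y == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case scores above a random negative
    case, with ties counted as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = nodule present, 0 = no nodule.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.1, 0.6, 0.3, 0.2]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
area = auc(labels, scores)
print(f"sensitivity={sens}, specificity={spec}, AUC={area}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric are usually reported together.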

RESULTS

Three software solutions (Celsus, Lunit INSIGHT CXR, and qXR) demonstrated diagnostic metrics that matched or surpassed the vendor specifications, and achieved the highest area under the receiver operating characteristic curve (AUC) of 0.956 [95% confidence interval (CI): 0.918 to 0.994]. However, when evaluated by three radiologists for accurate nodule segmentation and classification, all solutions performed below the vendor-declared metrics, with the highest AUC reaching 0.812 (95% CI: 0.744 to 0.879). Meanwhile, all AI services demonstrated 100% specificity at stages 2 and 3 of the study.
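The AUC values above are reported with 95% confidence intervals, but the abstract does not say how those intervals were computed. One common approach is a percentile bootstrap, sketched below as an assumption rather than as the authors' method. The data, the number of resamples, the seed, and the function names are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC; ties between a positive and a negative count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for AUC (assumed method, for illustration)."""
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    estimates: List[float] = []
    for _ in range(n_resamples):
        # Resample cases with replacement within each class, so every
        # resample keeps the original number of positives and negatives.
        sample = [rng.choice(pos_idx) for _ in pos_idx]
        sample += [rng.choice(neg_idx) for _ in neg_idx]
        estimates.append(
            auc([labels[i] for i in sample], [scores[i] for i in sample])
        )
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical toy data: 1 = nodule present, 0 = no nodule.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.1, 0.6, 0.3, 0.2]

point = auc(labels, scores)
lo, hi = bootstrap_auc_ci(labels, scores)
print(f"AUC={point} (95% CI: {lo} to {hi})")
```

With a dataset as small as this toy example the interval is wide; the width shrinks as the number of cases grows.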

CONCLUSIONS

To ensure the reliability and applicability of AI-based software, it is crucial to validate performance metrics using high-quality datasets and engage radiologists in the evaluation process. Developers are recommended to improve the accuracy of the underlying models before allowing the standalone use of the software for lung nodule detection. The dataset created during the study may be accessed at https://mosmed.ai/datasets/mosmeddatargogksnalichiemiotsutstviemlegochnihuzlovtipvii/.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/95ce/11320553/d8ed8781d98b/qims-14-08-5288-f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/95ce/11320553/96b08406fbad/qims-14-08-5288-f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/95ce/11320553/2d9dee33aa8a/qims-14-08-5288-f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/95ce/11320553/18f299846c30/qims-14-08-5288-f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/95ce/11320553/e15d905871d2/qims-14-08-5288-f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/95ce/11320553/9fa3abcc0c4f/qims-14-08-5288-f5.jpg
