Effects of Expert-Determined Reference Standards in Evaluating the Diagnostic Performance of a Deep Learning Model: A Malignant Lung Nodule Detection Task on Chest Radiographs.

Affiliations

Institute of Medical and Biological Engineering, Medical Research Center, Seoul National University, Seoul, Korea.

Mathematical Institute, University of Oxford, United Kingdom.

Publication information

Korean J Radiol. 2023 Feb;24(2):155-165. doi: 10.3348/kjr.2022.0548.

DOI:10.3348/kjr.2022.0548
PMID:36725356
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9892220/
Abstract

OBJECTIVE

Little is known about the effects of using different expert-determined reference standards when evaluating the performance of deep learning-based automatic detection (DLAD) models and their added value to radiologists. We assessed the concordance of expert-determined standards with a clinical gold standard (herein, pathological confirmation) and the effects of different expert-determined reference standards on the estimates of radiologists' diagnostic performance to detect malignant pulmonary nodules on chest radiographs with and without the assistance of a DLAD model.

MATERIALS AND METHODS

This study included chest radiographs from 50 patients with pathologically proven lung cancer and 50 controls. Five expert-determined standards were constructed using the interpretations of 10 experts: individual judgment by the most experienced expert, majority vote, consensus judgments of two and three experts, and a latent class analysis (LCA) model. In separate reader tests, an additional 10 radiologists interpreted the radiographs independently and then with the assistance of the DLAD model. Their diagnostic performance was estimated using the clinical gold standard and each of the expert-determined standards as the reference standard, and the results were compared using the t-test with Bonferroni correction.
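The idea of scoring one reader against two different reference standards can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names `majority_vote` and `sensitivity_specificity` and all labels below are hypothetical toy values chosen only to show the mechanics.

```python
from typing import List, Sequence, Tuple


def majority_vote(expert_labels: Sequence[Sequence[int]]) -> List[int]:
    """Label a case positive (1) when strictly more than half of the experts do."""
    return [1 if 2 * sum(votes) > len(votes) else 0 for votes in expert_labels]


def sensitivity_specificity(
    predictions: Sequence[int], reference: Sequence[int]
) -> Tuple[float, float]:
    """Sensitivity and specificity of binary predictions against a reference."""
    tp = sum(1 for p, r in zip(predictions, reference) if p == 1 and r == 1)
    fn = sum(1 for p, r in zip(predictions, reference) if p == 0 and r == 1)
    tn = sum(1 for p, r in zip(predictions, reference) if p == 0 and r == 0)
    fp = sum(1 for p, r in zip(predictions, reference) if p == 1 and r == 0)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical toy data: 8 cases, 4 pathologically confirmed cancers.
gold = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical labels from 3 experts per case; they miss cancers 3 and 4.
experts = [
    [1, 1, 1], [1, 1, 0], [0, 0, 1], [0, 0, 0],
    [0, 0, 0], [0, 1, 0], [0, 0, 0], [0, 0, 0],
]
expert_ref = majority_vote(experts)

# Hypothetical reader who finds 3 of the 4 cancers and has no false alarms.
reader = [1, 1, 1, 0, 0, 0, 0, 0]

sens_gold, spec_gold = sensitivity_specificity(reader, gold)
sens_exp, spec_exp = sensitivity_specificity(reader, expert_ref)

print(f"vs gold standard:   sens={sens_gold:.3f}, spec={spec_gold:.3f}")
print(f"vs expert standard: sens={sens_exp:.3f}, spec={spec_exp:.3f}")
```

In this toy case the cancer that the experts miss but the reader detects is counted as a false positive under the majority-vote standard, so the same reader scores a higher sensitivity and a lower specificity than under the gold standard. The numbers are illustrative only and are not taken from the study.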

RESULTS

The LCA model (sensitivity, 72.6%; specificity, 100%) was most similar to the clinical gold standard. When expert-determined standards were used, the sensitivities of the radiologists and of the DLAD model alone were overestimated, and their specificities were underestimated (all P-values < 0.05). DLAD assistance diminished the overestimation of sensitivity but exaggerated the underestimation of specificity (all P-values < 0.001). The DLAD model improved sensitivity and specificity to a greater extent when using the clinical gold standard than when using the expert-determined standards (all P-values < 0.001), except for sensitivity with the LCA model (P = 0.094).
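Because each performance estimate is compared against several expert-determined standards, the methods apply a Bonferroni correction. A minimal sketch of that adjustment follows; the helper name `bonferroni_adjust` and the raw P-values are hypothetical and do not come from the study.

```python
from typing import List, Sequence


def bonferroni_adjust(p_values: Sequence[float]) -> List[float]:
    """Multiply each P-value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw P-values, one per expert-determined standard (5 comparisons).
raw = [0.004, 0.02, 0.03, 0.0001, 0.2]
adjusted = bonferroni_adjust(raw)

# A comparison counts as significant when its adjusted P-value is below 0.05.
significant = [p < 0.05 for p in adjusted]

for r, a, s in zip(raw, adjusted, significant):
    print(f"raw={r:.4f}  adjusted={a:.4f}  significant={s}")
```

Comparing the adjusted values to 0.05 is equivalent to comparing each raw P-value to 0.05 divided by the number of comparisons.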

CONCLUSION

The LCA model was most similar to the clinical gold standard for malignant pulmonary nodule detection on chest radiographs. Expert-determined standards caused bias in measuring the diagnostic performance of the artificial intelligence model.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/06fb/9892220/5d697061c60c/kjr-24-155-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/06fb/9892220/8904f20b1c61/kjr-24-155-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/06fb/9892220/001331c27f57/kjr-24-155-g002.jpg
