Validation pipeline for machine learning algorithm assessment for multiple vendors.

Affiliations

MGH & BWH Center for Clinical Data Science, Mass General Brigham, Boston, Massachusetts, United States of America.

Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America.

Publication information

PLoS One. 2022 Apr 29;17(4):e0267213. doi: 10.1371/journal.pone.0267213. eCollection 2022.

DOI:10.1371/journal.pone.0267213
PMID:35486572
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9053776/
Abstract

A standardized objective evaluation method is needed to compare machine learning (ML) algorithms as these tools become available for clinical use. Therefore, we designed, built, and tested an evaluation pipeline with the goal of normalizing performance measurement of independently developed algorithms, using a common test dataset of our clinical imaging. Three vendor applications for detecting solid, part-solid, and ground-glass lung nodules in chest CT examinations were assessed in this retrospective study using our data-preprocessing and algorithm assessment chain. The pipeline included tools for image cohort creation and de-identification; report and image annotation for ground-truth labeling; server partitioning to receive vendor "black box" algorithms and to enable model testing on our internal clinical data (100 chest CTs with 243 nodules) from within our security firewall; model validation and result visualization; and performance assessment calculating algorithm recall, precision, and receiver operating characteristic (ROC) curves. Algorithm true positives, false positives, false negatives, recall, and precision for detecting lung nodules were as follows: Vendor-1 (194, 23, 49, 0.80, 0.89); Vendor-2 (182, 270, 61, 0.75, 0.40); Vendor-3 (75, 120, 168, 0.32, 0.39). The AUCs for detection of solid (0.61-0.74), ground-glass (0.66-0.86), and part-solid (0.52-0.86) nodules varied between the three vendors. Our ML model validation pipeline enabled testing of multi-vendor algorithms within the institutional firewall. Wide variations in algorithm performance for detection as well as classification of lung nodules justify the premise for a standardized objective ML algorithm evaluation process.
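
The recall and precision figures in the abstract can be related to the reported counts with a short calculation. The sketch below is illustrative only and is not the authors' pipeline code: the class and function names (`DetectionCounts`, `summarize`) are hypothetical, and it assumes the standard definitions recall = TP / (TP + FN) and precision = TP / (TP + FP). Under those assumptions the Vendor-1 and Vendor-2 values match the abstract after rounding, while Vendor-3 comes out at roughly 0.31 and 0.38 rather than the reported 0.32 and 0.39; the abstract does not say what accounts for that small difference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionCounts:
    """Nodule-level detection counts for one vendor (hypothetical container)."""
    true_positives: int
    false_positives: int
    false_negatives: int

    @property
    def recall(self) -> float:
        # Fraction of ground-truth nodules that the algorithm found.
        actual = self.true_positives + self.false_negatives
        return self.true_positives / actual if actual else 0.0

    @property
    def precision(self) -> float:
        # Fraction of algorithm detections that were real nodules.
        flagged = self.true_positives + self.false_positives
        return self.true_positives / flagged if flagged else 0.0


# Counts as reported in the abstract (TP, FP, FN).
REPORTED_COUNTS = {
    "Vendor-1": DetectionCounts(194, 23, 49),
    "Vendor-2": DetectionCounts(182, 270, 61),
    "Vendor-3": DetectionCounts(75, 120, 168),
}


def summarize(counts: dict[str, DetectionCounts]) -> dict[str, tuple[float, float]]:
    """Return (recall, precision) per vendor, rounded to two decimals."""
    return {
        name: (round(c.recall, 2), round(c.precision, 2))
        for name, c in counts.items()
    }


if __name__ == "__main__":
    for name, (recall, precision) in summarize(REPORTED_COUNTS).items():
        print(f"{name}: recall={recall:.2f} precision={precision:.2f}")
```

In every row, true positives plus false negatives sum to 243, the number of ground-truth nodules in the test set, which is a quick consistency check on the reported counts.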

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a47/9053776/5491aeffcbdc/pone.0267213.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a47/9053776/3e6729de6a10/pone.0267213.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a47/9053776/9fbfc3ef403c/pone.0267213.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a47/9053776/386af0b9a8cf/pone.0267213.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1a47/9053776/deb321bb5c8f/pone.0267213.g005.jpg

Similar articles

1
Validation pipeline for machine learning algorithm assessment for multiple vendors.
PLoS One. 2022 Apr 29;17(4):e0267213. doi: 10.1371/journal.pone.0267213. eCollection 2022.
2
Computer-aided diagnosis of lung cancer: the effect of training data sets on classification accuracy of lung nodules.
Phys Med Biol. 2018 Feb 5;63(3):035036. doi: 10.1088/1361-6560/aaa610.
3
Reproducible Machine Learning Methods for Lung Cancer Detection Using Computed Tomography Images: Algorithm Development and Validation.
J Med Internet Res. 2020 Aug 5;22(8):e16709. doi: 10.2196/16709.
4
High precision localization of pulmonary nodules on chest CT utilizing axial slice number labels.
BMC Med Imaging. 2021 Apr 9;21(1):66. doi: 10.1186/s12880-021-00594-4.
5
Value of a deep learning-based algorithm for detecting Lung-RADS category 4 nodules on chest radiographs in a health checkup population: estimation of the sample size for a randomized controlled trial.
Eur Radiol. 2022 Jan;32(1):213-222. doi: 10.1007/s00330-021-08162-8. Epub 2021 Jul 15.
6
Prediction of pathologic stage in non-small cell lung cancer using machine learning algorithm based on CT image feature analysis.
BMC Cancer. 2019 May 17;19(1):464. doi: 10.1186/s12885-019-5646-9.
7
Predicting benign, preinvasive, and invasive lung nodules on computed tomography scans using machine learning.
J Thorac Cardiovasc Surg. 2022 Apr;163(4):1496-1505.e10. doi: 10.1016/j.jtcvs.2021.02.010. Epub 2021 Feb 16.
8
Shape-based computer-aided detection of lung nodules in thoracic CT images.
IEEE Trans Biomed Eng. 2009 Jul;56(7):1810-20. doi: 10.1109/TBME.2009.2017027.
9
Localized thin-section CT with radiomics feature extraction and machine learning to classify early-detected pulmonary nodules from lung cancer screening.
Phys Med Biol. 2018 Mar 14;63(6):065005. doi: 10.1088/1361-6560/aaafab.
10
Computer-aided Detection of Subsolid Nodules at Chest CT: Improved Performance with Deep Learning-based CT Section Thickness Reduction.
Radiology. 2021 Apr;299(1):211-219. doi: 10.1148/radiol.2021203387. Epub 2021 Feb 9.
