Impact of retraining and data partitions on the generalizability of a deep learning model in the task of COVID-19 classification on chest radiographs.

Authors

Shenouda Mena, Whitney Heather M, Giger Maryellen L, Armato Samuel G

Affiliation

The University of Chicago, Committee on Medical Physics, Department of Radiology, Chicago, Illinois, United States.

Publication

J Med Imaging (Bellingham). 2024 Nov;11(6):064503. doi: 10.1117/1.JMI.11.6.064503. Epub 2024 Dec 26.

Abstract

PURPOSE

This study aimed to investigate the impact of different model retraining schemes and data partitioning on model performance in the task of COVID-19 classification on standard chest radiographs (CXRs), in the context of model generalizability.

APPROACH

Two datasets from the same institution were used: Set A (9860 patients, collected from 02/20/2020 to 02/03/2021) and Set B (5893 patients, collected from 03/15/2020 to 01/01/2022). An original deep learning (DL) model, trained and tested for COVID-19 classification on the initial partition of Set A, achieved an area under the curve (AUC) value of 0.76 on Set A but a significantly lower value of 0.67 on Set B. To explore this discrepancy, four strategies were applied to the original model: (1) retraining on Set B, (2) fine-tuning on Set B, (3) regularization, and (4) repartitioning the training set from Set A 200 times and reporting the resulting AUC values.
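Strategy (4) can be illustrated with a minimal sketch of repeated patient-level repartitioning. The abstract does not state the split proportions or whether the partitions were stratified, so the function name `repartition`, the 20% test fraction, and the seeding scheme below are assumptions made for illustration, not the authors' procedure.

```python
import random

def repartition(patient_ids, test_fraction=0.2, n_repeats=200, seed=0):
    """Yield (train_ids, test_ids) pairs from repeated random patient-level splits.

    Hypothetical helper: the split ratio and seed handling are assumptions.
    """
    rng = random.Random(seed)
    ids = sorted(patient_ids)  # fixed starting order so the seed determines the splits
    n_test = max(1, round(len(ids) * test_fraction))
    for _ in range(n_repeats):
        shuffled = ids[:]
        rng.shuffle(shuffled)
        # Splitting by patient ID keeps all images from one patient on the same side.
        yield shuffled[n_test:], shuffled[:n_test]
```

In use, each pair would drive one train-and-evaluate cycle, and the spread of the resulting 200 AUC values is what gets compared with the AUC on Set B.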

RESULTS

The model achieved the following AUC values (95% confidence interval) for the four methods: (1) 0.61 [0.56, 0.66]; (2) 0.70 [0.66, 0.73], both on Set B; (3) 0.76 [0.72, 0.79] on the initial test partition of Set A and 0.68 [0.66, 0.70] on Set B; and (4) on repartitions of Set A. The lowest AUC value (0.66 [0.62, 0.69]) of the Set A repartitions was no longer significantly different from the initial 0.67 achieved on Set B.
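The AUC values and intervals above can be made concrete with a small sketch. The abstract does not say how the 95% confidence intervals were obtained; the percentile bootstrap used here is one common choice and is an assumption, as are the function names `auc` and `bootstrap_ci`.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC (assumed method, not from the abstract)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Comparing intervals computed this way is only a rough check on whether two AUC values differ; a formal test of the difference may give a different answer.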

CONCLUSIONS

Different data repartitions of the same dataset used to train a DL model demonstrated significantly different performance values that helped explain the discrepancy between Set A and Set B and further demonstrated the limitations of model generalizability.
