Generalizability assessment of COVID-19 3D CT data for deep learning-based disease detection.

Author information

Center for Advanced Modelling and Geospatial Information Systems, Faculty of Engineering and IT, University of Technology Sydney, Sydney, NSW, 2007, Australia; Department of Nuclear Medicine, Vali-Asr Hospital, Tehran University of Medical Sciences, Tehran, Iran.

Center for Advanced Modelling and Geospatial Information Systems, Faculty of Engineering and IT, University of Technology Sydney, Sydney, NSW, 2007, Australia; School of Science and Technology, Faculty of Science, Agriculture, Business and Law, University of New England, Armidale, NSW, 2351, Australia.

Publication information

Comput Biol Med. 2022 Jun;145:105464. doi: 10.1016/j.compbiomed.2022.105464. Epub 2022 Apr 1.

Abstract

BACKGROUND

Artificial intelligence technologies for the classification/detection of COVID-19 positive cases suffer from poor generalizability. Moreover, accessing and preparing another large dataset is time-consuming and not always feasible. Several studies have combined smaller COVID-19 CT datasets into "supersets" to maximize the number of training samples. This study aims to assess generalizability by splitting datasets of 3D CT images into different portions and training deep learning models on them.

METHOD

Two large datasets, including 1110 3D CT images, were split into five segments of 20% each. Each dataset's first 20% segment was separated as a holdout test set. 3D-CNN training was performed with the remaining 80% from each dataset. Two small external datasets were also used to independently evaluate the trained models.
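The split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the scan identifiers, the dataset sizes, and the shuffling seed are all assumptions made for the example, and the abstract does not state whether the scans were shuffled before segmentation.

```python
import random


def split_into_segments(scan_ids, n_segments=5, seed=0):
    """Shuffle the scan IDs and cut them into n_segments near-equal parts.

    Shuffling with a fixed seed is an assumption for reproducibility;
    the abstract only says each dataset was split into five 20% segments.
    """
    ids = list(scan_ids)
    random.Random(seed).shuffle(ids)
    size, remainder = divmod(len(ids), n_segments)
    segments, start = [], 0
    for i in range(n_segments):
        # Spread any remainder over the first few segments.
        end = start + size + (1 if i < remainder else 0)
        segments.append(ids[start:end])
        start = end
    return segments


def holdout_and_training(segments):
    """Use the first segment as the holdout test set, the rest for training."""
    holdout = segments[0]
    training = [scan for segment in segments[1:] for scan in segment]
    return holdout, training


# Hypothetical scan IDs and dataset sizes, chosen only for illustration.
dataset_a = [f"a_{i:04d}" for i in range(100)]
dataset_b = [f"b_{i:04d}" for i in range(55)]

holdout_a, train_a = holdout_and_training(split_into_segments(dataset_a))
holdout_b, train_b = holdout_and_training(split_into_segments(dataset_b))

# The combined training pool takes the remaining 80% of each dataset.
combined_training = train_a + train_b
print(len(holdout_a), len(train_a), len(holdout_b), len(train_b))
```

Keeping the segments as separate lists makes it straightforward to train on smaller portions (for example, one or two segments of a secondary dataset) while each holdout segment stays untouched.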

RESULTS

Training on the combined 80% of both datasets yielded an accuracy of 91% on the Iranmehr and 83% on the Moscow holdout test sets. The results indicated that 80% of the primary datasets is adequate for fully training a model. Additional fine-tuning with 40% of a secondary dataset helps the model generalize to a third, unseen dataset. The highest accuracy achieved through transfer learning was 85% on the LDCT dataset and 83% on the Iranmehr holdout test set when the model was retrained on 80% of the Iranmehr dataset.

CONCLUSION

While the full combination of both datasets produced the best results, other combinations and transfer learning still produced generalizable results. Adopting the proposed methodology may help to obtain satisfactory results when external datasets are limited.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f64d/8971071/4296bd2aebd2/ga1_lrg.jpg
