Pham Nhat Truong, Ko Jinsol, Shah Masaud, Rakkiyappan Rajan, Woo Hyun Goo, Manavalan Balachandran
Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon, 16419, Gyeonggi-do, Republic of Korea.
Department of Physiology, Ajou University School of Medicine, Suwon, 16499, Republic of Korea; Department of Biomedical Science, Graduate School, Ajou University, Suwon, Republic of Korea.
Comput Biol Med. 2025 Feb;185:109461. doi: 10.1016/j.compbiomed.2024.109461. Epub 2024 Dec 3.
The COVID-19 pandemic has emerged as a global health crisis, impacting millions worldwide. Although chest computed tomography (CT) scan images are pivotal in diagnosing COVID-19, their manual interpretation by radiologists is time-consuming and potentially subjective. Automated computer-aided diagnostic (CAD) frameworks offer efficient and objective solutions. However, machine and deep learning methods often face reproducibility challenges due to underlying biases and methodological flaws. To address these issues, we propose XCT-COVID, an explainable, transferable, and reproducible CAD framework based on deep transfer learning that accurately predicts COVID-19 infection from CT scan images. This is the first study to develop three distinct models within a unified framework by leveraging a previously unexplored large dataset and two widely used smaller datasets. We employed five well-known convolutional neural network architectures, both with and without pretrained weights, on the larger dataset. We optimized hyperparameters through extensive grid search and 5-fold cross-validation (CV), significantly enhancing model performance. Experimental results on the larger dataset showed that the VGG16 architecture with pretrained weights (XCT-COVID-L) consistently outperformed the other architectures, achieving the best performance on both the 5-fold CV and the independent test. When evaluated on external datasets, XCT-COVID-L performed well on data with similar distributions, demonstrating its transferability. However, its performance decreased significantly on smaller datasets with lower-quality images. To address this, we developed two additional models, XCT-COVID-S1 and XCT-COVID-S2, specifically for the smaller datasets, both outperforming existing methods. Moreover, eXplainable Artificial Intelligence (XAI) analyses were employed to interpret the models' functionalities.
For prediction and reproducibility purposes, the implementation of XCT-COVID is publicly accessible at https://github.com/cbbl-skku-org/XCT-COVID/.
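The hyperparameter tuning procedure the abstract describes, a grid search scored by 5-fold cross-validation, can be sketched as follows. This is a minimal illustration only: the classifier, parameter grid, and synthetic stand-in features are hypothetical and do not reproduce the paper's CNN pipeline, which tunes deep-network hyperparameters on CT-scan images.

```python
# Minimal sketch of grid search with 5-fold CV (scikit-learn).
# Assumptions: synthetic tabular features stand in for image-derived
# inputs; LogisticRegression and the C-grid are illustrative choices,
# not the models or hyperparameters used in XCT-COVID.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                       # 200 samples, 16 features
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)  # binary labels

param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}           # hypothetical grid
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=5,                                            # 5-fold CV, as in the paper
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Each grid point is scored as the mean accuracy over the five held-out folds, and the best-scoring configuration is refit on the full training data; the paper applies the same selection logic to its CNN hyperparameters.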