Department of Bioengineering, University of Illinois Urbana-Champaign, Illinois, USA.
Laboratoire LITIS (EA 4108), Equipe Quantif, University of Rouen, Rouen, France.
Med Phys. 2024 Nov;51(11):8232-8246. doi: 10.1002/mp.17320. Epub 2024 Jul 30.
Deep learning (DL) techniques have been extensively applied in medical image classification. The unique characteristics of medical imaging data present challenges, including small labeled datasets, severely imbalanced class distribution, and significant variations in imaging quality. Recently, generative adversarial network (GAN)-based classification methods have gained attention for their ability to enhance classification accuracy by incorporating realistic GAN-generated images as data augmentation. However, the performance of these GAN-based methods often relies on high-quality generated images, while large amounts of training data are required to train GAN models to achieve optimal performance.
In this study, we propose an adversarial learning-based classification framework to improve classification performance. Its novelty is that GAN models serve as supplementary regularization terms that support classification, rather than as generators of augmentation images, with the aim of addressing the challenges described above.
The proposed classification framework, GAN-DL, consists of a feature extraction network (F-Net), a classifier, and two adversarial networks: a reconstruction network (R-Net) and a discriminator network (D-Net). The F-Net extracts features from input images, and the classifier uses these features for classification. R-Net and D-Net follow the GAN architecture: R-Net reconstructs the original images from the extracted features, while D-Net discriminates between the reconstructed images and the original images. An iterative adversarial learning strategy guides model training by incorporating multiple network-specific loss functions. These loss functions, serving as supplementary regularization, are derived automatically during the reconstruction process and require no additional data annotation.
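The abstract does not state the loss functions themselves. As a rough sketch of how such a scheme is commonly formulated, and not necessarily the formulation used in GAN-DL, the objectives could be written as below. Here $F$, $C$, $R$, and $D$ denote F-Net, the classifier, R-Net, and D-Net; the choice of cross-entropy, an $\ell_1$ reconstruction term, the standard GAN losses, and the weights $\lambda_{\mathrm{rec}}$ and $\lambda_{\mathrm{adv}}$ are all assumptions made for illustration.

```latex
\begin{aligned}
\hat{x} &= R\bigl(F(x)\bigr)
  && \text{(reconstruction from extracted features)} \\
\mathcal{L}_{D} &= -\,\mathbb{E}_{x}\bigl[\log D(x)\bigr]
  - \mathbb{E}_{x}\bigl[\log\bigl(1 - D(\hat{x})\bigr)\bigr]
  && \text{(assumed discriminator loss)} \\
\mathcal{L}_{\mathrm{cls}} &= -\,\mathbb{E}_{(x,y)}\bigl[\log C\bigl(F(x)\bigr)_{y}\bigr]
  && \text{(assumed cross-entropy)} \\
\mathcal{L}_{\mathrm{rec}} &= \mathbb{E}_{x}\,\lVert x - \hat{x} \rVert_{1}
  && \text{(assumed reconstruction loss)} \\
\mathcal{L}_{\mathrm{adv}} &= -\,\mathbb{E}_{x}\bigl[\log D(\hat{x})\bigr]
  && \text{(assumed adversarial loss)} \\
\mathcal{L}_{F,C,R} &= \mathcal{L}_{\mathrm{cls}}
  + \lambda_{\mathrm{rec}}\,\mathcal{L}_{\mathrm{rec}}
  + \lambda_{\mathrm{adv}}\,\mathcal{L}_{\mathrm{adv}}
\end{aligned}
```

Under these assumptions, one training iteration would first minimize $\mathcal{L}_{D}$ with respect to $D$ while $F$, $C$, and $R$ are held fixed, and then minimize $\mathcal{L}_{F,C,R}$ with respect to $F$, $C$, and $R$ while $D$ is held fixed. Only $\mathcal{L}_{\mathrm{cls}}$ uses the class label $y$; the other terms depend on the image $x$ alone, which is consistent with the statement that the regularization requires no additional annotation.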
To verify the model's effectiveness, we performed experiments on two datasets: a COVID-19 dataset with 13 958 chest x-ray images and an oropharyngeal squamous cell carcinoma (OPSCC) dataset with 3255 positron emission tomography images. Thirteen classic DL-based classification methods were implemented on the same datasets for comparison. Performance metrics included precision, sensitivity, specificity, and F1-score. In addition, we conducted ablation studies to assess the effects of various factors on model performance, including the network depth of F-Net, training image size, training dataset size, and loss function design.
Our method outperformed all comparative methods. On the COVID-19 dataset, our method achieved , , , and in terms of precision, sensitivity, specificity, and F1-score, respectively. It achieved across all these metrics on the OPSCC dataset. The study of the effects of the two adversarial networks highlights the crucial role of D-Net in improving model performance. The ablation studies provide further insight into the behavior of our method.
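For reference, the four reported metrics can all be derived from confusion-matrix counts. The following is a minimal sketch in plain Python, assuming a one-vs-rest treatment of a single positive class; the function name `one_vs_rest_metrics`, the toy labels, and the convention of returning 0.0 for an empty denominator are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def one_vs_rest_metrics(y_true, y_pred, positive):
    """Precision, sensitivity, specificity and F1 with `positive` as the positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))

    def ratio(num, den):
        # Assumed convention: report 0.0 when the denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)      # TP / (TP + FP)
    sensitivity = ratio(tp, tp + fn)    # TP / (TP + FN), also called recall
    specificity = ratio(tn, tn + fp)    # TN / (TN + FP)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Toy labels for illustration only (not data from the study).
y_true = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg"]
y_pred = ["pos", "pos", "neg", "pos", "neg", "neg", "neg", "neg"]
metrics = one_vs_rest_metrics(y_true, y_pred, positive="pos")
```

For a multi-class task, the same function could be called once per class and the results averaged; whether the study used such averaging is not stated in the abstract.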
Our adversarial-based classification framework leverages GAN-based adversarial networks and an iterative adversarial learning strategy to harness supplementary regularization during training. This design significantly enhances classification accuracy and mitigates overfitting issues in medical image datasets. Moreover, its modular design not only demonstrates flexibility but also indicates its potential applicability to various clinical contexts and medical imaging applications.