Hazra Debapriya, Byun Yung-Cheol, Kim Woo Jin, Kang Chul-Ung
Department of Computer Engineering, Jeju National University, Jeju 63243, Korea.
Department of Computer Engineering, Jeju National University, Institute of Information Science & Technology, Jeju 63243, Korea.
Biology (Basel). 2022 Feb 10;11(2):276. doi: 10.3390/biology11020276.
Every year, approximately 1.24 million people are diagnosed with blood cancer. Although the incidence rises each year, data for each kind of blood cancer remain scarce. Diagnosing rare types of cancer requires enough data for each blood cell type obtained from bone marrow aspirate smears. Generating such data would support easy and fast diagnosis, which are the most critical factors in cancer care. Generative adversarial networks (GANs) are a recent framework for generating synthetic images and time-series data. This paper takes microscopic cell images, preprocesses them, and uses a hybrid GAN architecture to generate synthetic images of the cell types that have few samples. We prepared a single dataset, with expert intervention, by combining images from three different sources. The final dataset consists of 33,177 microscopic cell images covering 12 cell types. We take the discriminator architecture of the auxiliary classifier GAN (AC-GAN) and combine it with the Wasserstein GAN with gradient penalty (WGAN-GP). We name our model WGAN-GP-AC. The discriminator in our proposed model both distinguishes real from generated images and assigns every image a cell type. We provide experimental results demonstrating that our proposed model performs better than existing individual and hybrid GAN models in generating microscopic cell images. We use the generated synthetic data with classification models, and the results show that classification performance improves significantly. Classification models achieved a precision of 0.95 and a recall of 0.96 on synthetic data, higher than on the original, augmented, or combined datasets.
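To make the combination of WGAN-GP and AC-GAN concrete, the following is a minimal sketch of how a discriminator (critic) loss of this kind could be assembled. It is not the authors' implementation. The function names, the weights `gp_weight` and `cls_weight`, and the choice to apply the classification term to both real and generated images are assumptions made for illustration. The critic scores, the gradient norms at interpolated samples, and the class logits are assumed to have been computed already by a deep learning framework.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample from raw class logits (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def mean(values):
    return sum(values) / len(values)


def critic_loss(real_scores, fake_scores, grad_norms,
                real_logits, real_labels, fake_logits, fake_labels,
                gp_weight=10.0, cls_weight=1.0):
    """Hypothetical WGAN-GP + auxiliary-classifier critic loss (to be minimized).

    real_scores / fake_scores: critic outputs for real and generated images.
    grad_norms: L2 norms of the critic's gradient at interpolated samples.
    *_logits / *_labels: per-image class logits and integer cell-type labels.
    gp_weight and cls_weight are assumed values, not taken from the paper.
    """
    # Wasserstein term: push real scores up and generated scores down.
    wasserstein = mean(fake_scores) - mean(real_scores)
    # Gradient penalty: keep the gradient norm close to 1.
    penalty = mean([(g - 1.0) ** 2 for g in grad_norms])
    # Auxiliary classifier term: cell-type cross-entropy on all images.
    classification = mean([
        softmax_cross_entropy(z, y)
        for z, y in zip(real_logits + fake_logits, real_labels + fake_labels)
    ])
    return wasserstein + gp_weight * penalty + cls_weight * classification


loss = critic_loss(
    real_scores=[1.0, 3.0], fake_scores=[0.0, 1.0], grad_norms=[1.0, 2.0],
    real_logits=[[0.0, 0.0]], real_labels=[0],
    fake_logits=[[0.0, 0.0]], fake_labels=[1],
)
```

In a real training loop the gradient norms would come from automatic differentiation of the critic with respect to samples interpolated between real and generated images, and the generator would be trained with its own loss built from the same critic and classifier outputs.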