Ahmed Nizar, Yigit Altug, Isik Zerrin, Alpkocak Adil
Department of Computer Engineering, Dokuz Eylul University, 35160 Izmir, Turkey.
Diagnostics (Basel). 2019 Aug 25;9(3):104. doi: 10.3390/diagnostics9030104.
Leukemia is a fatal cancer with two main types: acute and chronic. Each type has two subtypes, lymphoid and myeloid, giving four subtypes of leukemia in total. This study proposes a new approach for diagnosing all subtypes of leukemia from microscopic blood cell images using convolutional neural networks (CNNs), which require a large training data set. We therefore also investigated the effect of data augmentation as a way to synthetically increase the number of training samples. We used two publicly available leukemia data sources, ALL-IDB and the ASH Image Bank, and applied seven different image transformation techniques as data augmentation. We designed a CNN architecture capable of recognizing all subtypes of leukemia. We also explored other well-known machine learning algorithms: naive Bayes, support vector machine, k-nearest neighbor, and decision tree. To evaluate our approach, we ran a set of experiments using 5-fold cross-validation. The CNN model achieved 88.25% accuracy in leukemia versus healthy classification and 81.74% accuracy in multiclass classification of all subtypes. The CNN model also performed better than the other well-known machine learning algorithms.
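The 5-fold cross-validation used for evaluation can be illustrated with a minimal Python sketch. This is not the authors' code: the helper names `k_fold_indices` and `cross_validate`, the `fit` and `predict` callables, and the shuffling seed are all hypothetical, and the abstract does not state how the folds were drawn or how augmented images were assigned to folds.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation.

    Hypothetical helper: shuffles once, then cuts k contiguous test folds.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_validate(samples, labels, fit, predict, k=5, seed=0):
    """Return the mean test accuracy over k folds.

    `fit(train_samples, train_labels)` returns a model and
    `predict(model, sample)` returns a label; both are assumed interfaces.
    """
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(samples), k, seed):
        model = fit([samples[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(predict(model, samples[i]) == labels[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k
```

Each sample appears in exactly one test fold, so the reported accuracy is an average over five disjoint held-out sets. Any classifier (a CNN or one of the baseline algorithms) could be plugged in through `fit` and `predict`.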