Singh Rishav, Ahmed Tanveer, Kumar Abhinav, Singh Amit Kumar, Pandey Anil Kumar, Singh Sanjay Kumar
IEEE/ACM Trans Comput Biol Bioinform. 2021 Jan-Feb;18(1):83-93. doi: 10.1109/TCBB.2020.2980831. Epub 2021 Feb 3.
Accurate breast cancer detection using automated algorithms remains an open problem in the literature. Although a large body of work has tried to address it, no fully satisfactory solution has been found. The problem is exacerbated by the fact that most existing datasets are imbalanced, i.e., the number of instances of one class far exceeds that of the others. In this paper, we propose a framework based on transfer learning to address this issue, focusing on the classification of imbalanced histopathological images. We use the popular VGG-19 as the base model and complement it with several state-of-the-art techniques to improve the overall performance of the system. With the ImageNet dataset taken as the source domain, we apply the learned knowledge to a target domain consisting of histopathological images. Through experiments on a large-scale dataset of 277,524 images, we show that the proposed framework outperforms the approaches available in the existing literature. Through numerical simulations conducted on a supercomputer, we also present guidelines for work in transfer learning and imbalanced image classification.
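The abstract does not say which techniques are used to counter the class imbalance. As an illustration of the general idea only, the sketch below computes inverse-frequency class weights, a common heuristic in which each class receives weight N / (K * n_c), where N is the number of samples, K the number of classes, and n_c the number of samples in class c. The function name `balanced_class_weights` and the toy labels are assumptions made for this example; they are not taken from the paper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} using the inverse-frequency heuristic.

    weight_c = N / (K * n_c), so rare classes get larger weights and the
    weights summed over all samples equal N.
    NOTE: illustrative helper, not the method described in the paper.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Toy, made-up label set: 8 samples of class 0 and 2 samples of class 1.
toy_labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(toy_labels)

# Total weight over the dataset stays equal to the number of samples.
total_weight = sum(weights[label] for label in toy_labels)
```

Weights of this kind are typically passed to a weighted loss function when fine-tuning a pretrained network, so that errors on the minority class contribute more to the gradient than errors on the majority class.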