Sampath Vignesh, Maurtua Iñaki, Aguilar Martín Juan José, Gutierrez Aitor
Autonomous and Intelligent Systems Unit, Tekniker, Member of Basque Research and Technology Alliance, Eibar, Spain.
Design and Manufacturing Engineering Department, Universidad de Zaragoza, 3 María de Luna Street, Torres Quevedo Bld, 50018 Zaragoza, Spain.
J Big Data. 2021;8(1):27. doi: 10.1186/s40537-021-00414-0. Epub 2021 Jan 29.
Developing any computer vision application starts with acquiring images and data, followed by preprocessing and pattern recognition steps that perform a task. When the acquired images are highly imbalanced and inadequate, the desired task may not be achievable. Unfortunately, imbalance in acquired image datasets is inevitable in certain complex real-world problems such as anomaly detection, emotion recognition, medical image analysis, fraud detection, metallic surface defect detection, and disaster prediction. The performance of computer vision algorithms can deteriorate significantly when the training dataset is imbalanced. In recent years, Generative Adversarial Neural Networks (GANs) have attracted considerable attention from researchers across a variety of application domains because of their capability to model complex real-world image data. Importantly, GANs can be used not only to generate synthetic images; their adversarial learning idea has also shown good potential for restoring balance in imbalanced datasets. In this paper, we examine the most recent developments in GAN-based techniques for addressing imbalance problems in image data. The survey extensively covers the real-world challenges and implementations of GAN-based synthetic image generation. It first introduces various imbalance problems in computer vision tasks and their existing solutions, and then examines key concepts such as deep generative image models and GANs. We then propose a taxonomy that groups GAN-based techniques for addressing imbalance problems in computer vision tasks into three major categories: (1) image-level imbalances in classification, (2) object-level imbalances in object detection, and (3) pixel-level imbalances in segmentation tasks. We elaborate on the imbalance problems of each group and present GAN-based solutions for each.
Readers will understand how GAN-based techniques can handle imbalance problems and boost the performance of computer vision algorithms.