School of Electronics and Information Engineering, Taiyuan University of Science and Technology, Taiyuan, Shanxi, China.
PLoS One. 2024 May 2;19(5):e0302124. doi: 10.1371/journal.pone.0302124. eCollection 2024.
Image data augmentation (DA) plays a crucial role in model training by increasing the quantity and diversity of labeled training data. However, existing methods have limitations. Techniques such as image manipulation, erasing, and mixing can distort images and compromise data quality. Methods such as auto augment and feature augmentation struggle to represent objects accurately and unambiguously. Other techniques, notably deep generative models, have difficulty preserving fine details and spatial relationships. To address these limitations, we propose OFIDA, an object-focused image data augmentation algorithm. OFIDA performs one-to-many augmentation that preserves essential target regions while more faithfully simulating real-world settings and data distributions. Specifically, OFIDA uses a graph-based structure and object detection to streamline augmentation. By leveraging graph properties such as connectivity and hierarchy, it captures the essence and context of each object, improving comprehension in real-world scenarios. We then introduce DynamicFocusNet, a novel object detection algorithm built on this graph framework. DynamicFocusNet combines dynamic graph convolutions with attention mechanisms to flexibly adjust receptive fields. Finally, the detected target images are extracted to facilitate one-to-many data augmentation. Experimental results on six benchmark datasets show that OFIDA outperforms state-of-the-art methods.
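The last step of the pipeline (extracting a detected target and reusing it to produce many training images) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The bounding box is assumed to come from some detector (DynamicFocusNet is not reproduced here), and images are simplified to grayscale nested lists. Pasting the target onto new backgrounds is an assumed way of generating variants, since the abstract does not say how they are produced. The names `crop`, `paste`, and `one_to_many` are hypothetical.

```python
import random
from typing import List, Tuple

Image = List[List[int]]          # grayscale image stored as rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def crop(image: Image, box: Box) -> Image:
    """Return a copy of the region of `image` covered by `box`."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def paste(background: Image, patch: Image, top: int, left: int) -> Image:
    """Return a copy of `background` with `patch` written at (top, left)."""
    out = [row[:] for row in background]  # copy so the input stays unchanged
    for dy, patch_row in enumerate(patch):
        out[top + dy][left:left + len(patch_row)] = patch_row
    return out


def one_to_many(
    image: Image,
    box: Box,
    backgrounds: List[Image],
    copies_per_background: int = 2,
    seed: int = 0,
) -> List[Tuple[Image, Box]]:
    """Turn one detected object into several augmented images.

    Assumption: `box` was produced by an upstream detector. Each result
    pairs an augmented image with the object's new bounding box, so the
    label stays aligned with the pixels.
    """
    patch = crop(image, box)
    if not patch or not patch[0]:
        raise ValueError("bounding box selects an empty region")
    patch_h, patch_w = len(patch), len(patch[0])

    rng = random.Random(seed)  # seeded for reproducible placements
    results = []
    for background in backgrounds:
        bg_h, bg_w = len(background), len(background[0])
        if patch_h > bg_h or patch_w > bg_w:
            continue  # skip backgrounds too small to hold the object
        for _ in range(copies_per_background):
            top = rng.randint(0, bg_h - patch_h)
            left = rng.randint(0, bg_w - patch_w)
            new_box = (top, left, top + patch_h, left + patch_w)
            results.append((paste(background, patch, top, left), new_box))
    return results
```

Because the object pixels are copied unchanged, the target region is kept intact in every output, which mirrors the stated goal of preserving essential target regions. A fuller pipeline would also need to handle color channels, masks, and blending at the patch boundary.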