Brain-inspired semantic data augmentation for multi-style images.

Author information

Wang Wei, Shang Zhaowei, Li Chengxing

Affiliation

College of Computer Science, Chongqing University, Chongqing, China.

Publication information

Front Neurorobot. 2024 Mar 26;18:1382406. doi: 10.3389/fnbot.2024.1382406. eCollection 2024.

DOI:10.3389/fnbot.2024.1382406

PMID:38596181

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11002076/

Abstract

Data augmentation is an effective technique for automatically expanding training data in deep learning. Brain-inspired methods draw inspiration from the functionality and structure of the human brain and apply these mechanisms and principles to artificial intelligence and computer science. When there is a large style difference between training data and testing data, common data augmentation methods cannot effectively enhance the generalization performance of a deep model. To solve this problem, we improve the Domain Shifts with Uncertainty (DSU) method and propose a new brain-inspired image data augmentation method for computer vision that consists of two key components: RCDSU (robust statistics and a controlled coefficient of variance for DSU) and FeatureDA (feature data augmentation). RCDSU calculates feature statistics (mean and standard deviation) with robust statistics to weaken the influence of outliers, which brings the statistics closer to their real values and improves the robustness of deep learning models. By controlling the coefficient of variance, RCDSU shifts the feature statistics while preserving semantics and increases the shift range. FeatureDA controls the coefficient of variance in a similar way to generate augmented features whose semantics are unchanged and to increase the coverage of the augmented features. RCDSU and FeatureDA perform style transfer and content transfer in the feature space, and improve the generalization ability of the model at the style level and the content level, respectively. On the Photo, Art Painting, Cartoon, and Sketch (PACS) multi-style classification task, RCDSU plus FeatureDA achieves competitive accuracy. After Gaussian noise is added to the PACS dataset, RCDSU plus FeatureDA shows strong robustness against outliers. FeatureDA achieves excellent results on the CIFAR-100 image classification task.
RCDSU plus FeatureDA can be applied as a novel brain-inspired semantic data augmentation method with implicit robot automation which is suitable for datasets with large style differences between training and testing data.
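
The two components described above can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not say which robust statistics are used. The sketch assumes the median and a scaled median absolute deviation (MAD) as the robust mean and standard deviation, and it reads "controlling the coefficient of variance" as fixing the ratio between the sampling spread and the magnitude of each statistic. The names `robust_stats`, `shift_channel`, `feature_da`, and the parameter `cv` are hypothetical.

```python
import random
import statistics

# Scales the MAD to a standard-deviation estimate under a normality assumption.
MAD_TO_STD = 1.4826


def robust_stats(values):
    """Return (median, MAD-based std estimate) for one feature channel.

    Assumption: median/MAD stand in for the paper's unspecified robust statistics.
    """
    center = statistics.median(values)
    mad = statistics.median(abs(v - center) for v in values)
    return center, MAD_TO_STD * mad


def shift_channel(values, cv, rng, eps=1e-6):
    """Style-level sketch: normalize a channel with robust statistics, then
    re-scale it with statistics sampled around the original ones.

    `cv` (hypothetical) sets the sampling spread relative to each statistic.
    """
    center, scale = robust_stats(values)
    scale = max(scale, eps)
    new_center = rng.gauss(center, cv * abs(center))
    # Keep the sampled scale positive so the ordering of activations is preserved.
    new_scale = max(rng.gauss(scale, cv * scale), eps)
    return [new_scale * (v - center) / scale + new_center for v in values]


def feature_da(features, cv, rng):
    """Content-level sketch: jitter each feature with noise proportional to
    its magnitude (an assumed reading of FeatureDA, not the published rule)."""
    return [f + rng.gauss(0.0, cv * abs(f)) for f in features]


if __name__ == "__main__":
    rng = random.Random(0)
    channel = [1.0, 1.1, 0.9, 1.05, 0.95, 50.0]  # last value is an outlier
    print("plain mean/std :", statistics.mean(channel), statistics.pstdev(channel))
    print("robust mean/std:", robust_stats(channel))
    print("shifted channel:", shift_channel(channel, cv=0.3, rng=rng))
    print("augmented feats:", feature_da([0.5, -1.2, 2.0], cv=0.1, rng=rng))
```

With a single outlier in the channel, the plain mean and standard deviation are pulled far from the bulk of the values, while the median and MAD stay close to it. This is the effect the abstract attributes to robust statistics. Setting `cv` to 0 leaves the features unchanged in both functions, and larger values widen the range of sampled styles or features.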
