
SETA: Semantic-Aware Edge-Guided Token Augmentation for Domain Generalization.

Authors

Guo Jintao, Qi Lei, Shi Yinghuan, Gao Yang

Publication

IEEE Trans Image Process. 2024;33:5622-5636. doi: 10.1109/TIP.2024.3470517. Epub 2024 Oct 9.

Abstract

Domain generalization (DG) aims to enhance model robustness against domain shifts without accessing target domains. A prevalent category of DG methods is data augmentation, which focuses on generating virtual samples to simulate domain shifts. However, existing augmentation techniques in DG are mainly tailored for convolutional neural networks (CNNs), with limited exploration in token-based architectures, i.e., vision transformer (ViT) and multi-layer perceptron (MLP) models. In this paper, we study the impact of prior CNN-based augmentation methods on token-based models, revealing that their performance is suboptimal because they do not incentivize the model to learn holistic shape information. To tackle this issue, we propose the Semantic-aware Edge-guided Token Augmentation (SETA) method. SETA transforms token features by perturbing local edge cues while preserving global shape features, thereby enhancing the model's learning of shape information. To further improve the generalization ability of the model, we introduce two stylized variants that combine our method with two state-of-the-art (SOTA) style augmentation methods in DG. We provide a theoretical insight into our method, demonstrating its effectiveness in reducing the generalization risk bound. Comprehensive experiments on five benchmarks show that our method achieves SOTA performance across various ViT and MLP architectures. Our code is available at https://github.com/lingeringlight/SETA.
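The core idea described above, perturbing token features at local edge positions while leaving shape-carrying tokens intact, can be sketched in a few lines of NumPy. This is a hypothetical illustration, not the authors' actual SETA implementation: the function name `edge_guided_token_augment`, the token-norm gradient used as an edge proxy, and the Gaussian perturbation are all illustrative assumptions.

```python
import numpy as np

def edge_guided_token_augment(tokens, grid_hw, perturb_scale=0.1, rng=None):
    """Hypothetical sketch of edge-guided token perturbation.

    tokens:  (N, D) patch-token features laid out on an H x W grid.
    grid_hw: (H, W) with H * W == N.
    Returns tokens with noise injected mainly at high-gradient (edge)
    positions; low-gradient (shape-carrying) tokens are left nearly intact.
    """
    rng = np.random.default_rng() if rng is None else rng
    H, W = grid_hw
    # Per-token "energy": L2 norm of each token, reshaped onto the patch grid.
    energy = np.linalg.norm(tokens, axis=1).reshape(H, W)
    # Approximate local edge strength with finite differences along both axes.
    gy = np.abs(np.diff(energy, axis=0, prepend=energy[:1]))
    gx = np.abs(np.diff(energy, axis=1, prepend=energy[:, :1]))
    edge = (gx + gy).reshape(-1)
    # Normalize edge strength to [0, 1] as a soft per-token mask.
    mask = (edge - edge.min()) / (edge.max() - edge.min() + 1e-8)
    # Perturb edge tokens only; the mask gates Gaussian noise per token.
    noise = rng.standard_normal(tokens.shape) * perturb_scale
    return tokens + mask[:, None] * noise
```

In this sketch, tokens whose neighborhood is locally flat receive a near-zero mask and pass through unchanged, which is one plausible way to realize "perturbing local edge cues while preserving global shape features" in token space.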

