Multi-Granularity Part Sampling Attention for Fine-Grained Visual Classification.

Authors

Wang Jiahui, Xu Qin, Jiang Bo, Luo Bin, Tang Jinhui

Publication information

IEEE Trans Image Process. 2024;33:4529-4542. doi: 10.1109/TIP.2024.3441813. Epub 2024 Aug 23.

Abstract

Fine-grained visual classification aims to distinguish similar sub-categories, and is challenging because of large variations within a sub-category and high visual similarity between different sub-categories. Recently, methods that extract semantic parts from discriminative regions have attracted increasing attention. However, most existing methods extract part features via rectangular bounding boxes produced by an object detection module or an attention mechanism, which makes it difficult to capture the rich shape information of objects. In this paper, we propose a novel Multi-Granularity Part Sampling Attention (MPSA) network for fine-grained visual classification. First, a novel multi-granularity part retrospect block is designed to extract part information at different scales and to enhance the high-level feature representation with discriminative part features of different granularities. Then, to extract part features of various shapes at each granularity, we propose part sampling attention, which comprehensively samples the implicit semantic parts on the feature maps. The proposed part sampling attention not only considers the importance of the sampled parts but also adopts part dropout to reduce overfitting. In addition, we propose a novel multi-granularity fusion method that highlights foreground features and suppresses background noise with the assistance of the gradient class activation map. Experimental results demonstrate that the proposed MPSA achieves state-of-the-art performance on four commonly used fine-grained visual classification benchmarks. The source code is publicly available at https://github.com/mobulan/MPSA.
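
The general idea of sampling non-rectangular parts, weighting them by importance, and applying part dropout can be illustrated with a toy NumPy sketch. This is not the MPSA implementation (see the linked repository for that). The function name `part_sampling_attention`, the use of learned part queries, top-k position sampling, and the softmax importance weights are all assumptions made here for illustration only.

```python
import numpy as np

def part_sampling_attention(features, part_queries, points_per_part=4,
                            drop_prob=0.0, rng=None):
    """Toy sketch (hypothetical, not the paper's method).

    features:     (N, C) feature map flattened over N spatial positions
    part_queries: (K, C) one hypothetical query vector per part
    Returns a pooled (C,) descriptor and the (K,) part importance weights.
    """
    n, c = features.shape
    k = part_queries.shape[0]

    # Similarity between each part query and each spatial position.
    scores = part_queries @ features.T / np.sqrt(c)            # (K, N)

    # Sample the highest-scoring positions for each part. No bounding box
    # is imposed, so the sampled positions can form an arbitrary shape.
    idx = np.argsort(-scores, axis=1)[:, :points_per_part]     # (K, P)
    sampled = features[idx]                                    # (K, P, C)
    sampled_scores = np.take_along_axis(scores, idx, axis=1)   # (K, P)

    # Aggregate the sampled positions of each part with softmax weights.
    w = np.exp(sampled_scores - sampled_scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    parts = (w[..., None] * sampled).sum(axis=1)               # (K, C)

    # Part dropout (assumed form): randomly mask whole parts during
    # training, always keeping at least one part.
    imp_logits = sampled_scores.mean(axis=1)                   # (K,)
    keep = np.ones(k, dtype=bool)
    if drop_prob > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        keep = rng.random(k) >= drop_prob
        if not keep.any():
            keep[np.argmax(imp_logits)] = True

    # Part importance: softmax over the kept parts only.
    masked = np.where(keep, imp_logits, -np.inf)
    imp = np.exp(masked - masked[keep].max())
    imp /= imp.sum()

    pooled = imp @ parts                                       # (C,)
    return pooled, imp
```

In this sketch, dropped parts receive zero importance, so the pooled descriptor has to rely on the remaining parts, which is one plausible way part dropout could discourage over-reliance on a single region.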
