Robust Fine-Grained Visual Recognition With Neighbor-Attention Label Correction.

Authors

Mao Shunan, Zhang Shiliang

Publication information

IEEE Trans Image Process. 2024;33:2614-2626. doi: 10.1109/TIP.2024.3378461. Epub 2024 Apr 3.

DOI:10.1109/TIP.2024.3378461

Abstract

Existing deep learning methods for fine-grained visual recognition often rely on large-scale, well-annotated training data. Obtaining fine-grained annotations in the wild typically requires concentration and expertise, for example fine category annotation for species recognition, instance annotation for person re-identification (re-id), and dense annotation for segmentation, so label noise is inevitable. This paper tackles label noise in deep model training for fine-grained visual recognition. We propose a Neighbor-Attention Label Correction (NALC) model that corrects labels during the training stage. NALC samples a training batch and a validation batch from the training set, then uses a meta-learning framework to correct labels in the training batch based on the validation batch. To improve optimization efficiency, we introduce a novel nested optimization algorithm for the meta-learning framework. The proposed training procedure consistently improves label accuracy in the training batch, which in turn improves the learned image representation. Experimental results show that our method raises label accuracy from 70% to over 98% and outperforms recent approaches by up to 13.4% in mean Average Precision (mAP) on various fine-grained image retrieval (FGIR) tasks, including instance retrieval on CUB200 and person re-id on Market1501. We also demonstrate the efficacy of NALC on noisy semantic segmentation datasets generated from Cityscapes, where it achieves a 7.8% improvement in mean Intersection over Union (mIoU). NALC is also robust to different types of noise, including simulated noise such as Asymmetric, Pair-Flip, and Pattern noise, as well as practical noisy labels generated by tracklets and clustering.
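
To make the reported starting label accuracy of 70% concrete, the following is a minimal sketch of how simulated Pair-Flip noise is commonly generated and how label accuracy can be measured against the clean labels. It is not the authors' code and does not implement NALC. The function names `pair_flip` and `label_accuracy`, the 30% noise rate, the 200-class setting, and the flip rule (class c becomes class (c + 1) mod C) are assumptions chosen for illustration; the paper may define its noise protocol differently.

```python
import numpy as np


def pair_flip(labels, num_classes, noise_rate, rng):
    """Hypothetical helper: with probability noise_rate, replace a label c
    by its "pair" class (c + 1) % num_classes; otherwise keep it."""
    labels = np.asarray(labels)
    flip = rng.random(labels.shape[0]) < noise_rate  # which samples get corrupted
    noisy = labels.copy()
    noisy[flip] = (labels[flip] + 1) % num_classes
    return noisy


def label_accuracy(noisy, clean):
    """Fraction of samples whose (possibly noisy) label equals the clean label."""
    return float(np.mean(np.asarray(noisy) == np.asarray(clean)))


rng = np.random.default_rng(0)
# Assumed toy setup: 10,000 samples over 200 classes.
clean = rng.integers(0, 200, size=10_000)
noisy = pair_flip(clean, num_classes=200, noise_rate=0.3, rng=rng)
acc = label_accuracy(noisy, clean)
print(f"label accuracy after pair-flip noise: {acc:.3f}")
```

With a 30% noise rate the printed accuracy should land close to 0.70, which corresponds to the uncorrected starting point quoted in the abstract. A label correction method would be evaluated by recomputing `label_accuracy` on its corrected labels.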
