MSFFAL: Few-Shot Object Detection via Multi-Scale Feature Fusion and Attentive Learning.

Affiliations

Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201210, China.

School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China.

Publication information

Sensors (Basel). 2023 Mar 30;23(7):3609. doi: 10.3390/s23073609.

DOI:10.3390/s23073609

PMID:37050671

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10099036/

Abstract

Few-shot object detection (FSOD) addresses the difficulty of applying traditional detectors in scenarios that lack training samples. Meta-learning methods have attracted attention for their excellent generalization performance. They usually select support features of the same class according to the query labels and use them to weight the query features. However, a model cannot learn to actively identify objects when it only sees same-category support features, and selecting features by label is impractical at test time, when no labels are available. The model's single-scale features also lead to poor performance on small objects. In addition, hard samples in the support branch degrade the backbone's representation of the support features, which in turn affects the feature weighting process. To overcome these problems, we propose a multi-scale feature fusion and attentive learning (MSFFAL) framework for few-shot object detection. We first design a backbone with multi-scale feature fusion and a channel attention mechanism to improve the model's detection accuracy on small objects and its representation of hard support samples. Building on this, we propose an attention loss to replace the feature weighting module. The loss encourages the model to represent objects of the same category consistently across the two branches, which enables active recognition. The model no longer depends on query labels to select features at test time, which streamlines the testing process. Experiments show that MSFFAL outperforms the state of the art (SOTA) by 0.7-7.8% on Pascal VOC and achieves 1.61 times the baseline model's result on small object detection in MS COCO.
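The abstract names a channel attention mechanism in the backbone but does not specify its design. As a rough illustration of the general idea only, the sketch below implements a squeeze-and-excitation style channel gate in plain Python. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the toy input are all assumptions made for this example and may differ from what MSFFAL actually uses.

```python
import math

def channel_attention(feature_map, w1, w2):
    """Reweight the channels of a feature map.

    feature_map: list of C channels, each an H x W list of lists.
    w1, w2: hypothetical weight matrices of a two-layer gating network.
    """
    # Squeeze: global average pooling reduces each channel to one scalar.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_map]
    # Excitation: a bottleneck layer with ReLU, then a layer with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch]
              for ch, g in zip(feature_map, gates)]
    return scaled, gates

# Toy input (assumed): 2 channels of size 2x2 with fixed, made-up weights.
features = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[1.0, 1.0]]      # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]   # 1 hidden unit -> 2 channel gates
scaled, gates = channel_attention(features, w1, w2)
```

Each gate lies strictly between 0 and 1, so the mechanism can only damp or preserve a channel. In a trained network the weights would be learned so that informative channels receive gates close to 1.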
