Qi Di, Hu Jilin, Shen Jianbing
IEEE Trans Neural Netw Learn Syst. 2024 Apr;35(4):5435-5446. doi: 10.1109/TNNLS.2022.3204597. Epub 2024 Apr 4.
Few-shot object detection (FSOD), which detects novel objects from only a few training instances, has recently attracted growing attention. Previous works focus on making the most of the objects' label information. However, they neither consider the structural and semantic information of the image itself nor efficiently resolve the misclassification between data-abundant base classes and data-scarce novel classes. In this article, we propose FSOD with Self-Supervising and Cooperative Classifier ([Formula: see text]) to address these concerns. Specifically, we analyze the performance degradation of novel classes in FSOD and find that false-positive samples are the main reason. Examining these false-positive samples, we further find that the main cause is the misclassification of novel classes as base classes. Thus, we introduce double RoI heads into the existing Fast-RCNN to learn more specific features for novel classes. We also use self-supervised learning (SSL) to learn more structural and semantic information. Finally, we propose a cooperative classifier (CC) with base-novel regularization to maximize the interclass variance between base and novel classes. In experiments, [Formula: see text] outperforms all the latest baselines in most cases on PASCAL VOC and COCO.
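The false-positive analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `confusion_breakdown`, the category names, and the input format are hypothetical. It assumes each detection has already been matched to a ground-truth box upstream (for example by an IoU threshold), so the input is a list of `(ground_truth_label, predicted_label)` pairs, with `None` as the ground-truth label for detections that match no object.

```python
from collections import Counter


def confusion_breakdown(pairs, base_classes, novel_classes):
    """Count detection outcomes by error type (hypothetical helper).

    pairs: iterable of (gt_label, pred_label); gt_label is None when the
           detection matched no ground-truth object (assumed convention).
    """
    counts = Counter()
    for gt, pred in pairs:
        if gt == pred:
            counts["correct"] += 1
        elif gt is None:
            # Detection fired on background.
            counts["background_fp"] += 1
        elif gt in novel_classes and pred in base_classes:
            # The error type the abstract identifies as the main cause.
            counts["novel_as_base"] += 1
        elif gt in base_classes and pred in novel_classes:
            counts["base_as_novel"] += 1
        else:
            # Confusion within the base set or within the novel set.
            counts["other_confusion"] += 1
    return counts
```

Comparing `novel_as_base` against the other error counts gives a rough view of how much of the novel-class error comes from confusion with base classes, which is the observation that motivates the separate RoI heads and the base-novel regularization.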