Extreme R-CNN: Few-Shot Object Detection via Sample Synthesis and Knowledge Distillation.

Authors

Zhang Shenyong, Wang Wenmin, Wang Zhibing, Li Honglei, Li Ruochen, Zhang Shixiong

Affiliations

School of Computer Science and Engineering, Macau University of Science and Technology, Macau 999078, China.

School of Computer Technology, Beijing Institute of Technology, Zhuhai 519088, China.

Publication information

Sensors (Basel). 2024 Dec 7;24(23):7833. doi: 10.3390/s24237833.

DOI: 10.3390/s24237833

PMID: 39686371

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11645051/

Abstract

Traditional object detectors require extensive instance-level annotations for training. Conversely, few-shot object detectors, which are generally fine-tuned using limited data from unknown classes, tend to show biases toward base categories and are susceptible to variations within these unknown samples. To mitigate these challenges, we introduce a Two-Stage Fine-Tuning Approach (TFA) named Extreme R-CNN, designed to operate effectively with extremely limited original samples through the integration of sample synthesis and knowledge distillation. Our approach involves synthesizing new training examples via instance clipping and employing various data-augmentation techniques. We enhance the Faster R-CNN architecture by decoupling the regression and classification components of the Region of Interest (RoI), allowing synthetic samples to train the classification head independently of the object-localization process. Comprehensive evaluations on the Microsoft COCO and PASCAL VOC datasets demonstrate significant improvements over baseline methods. Specifically, on the PASCAL VOC dataset, the average precision for novel categories is enhanced by up to 15 percent, while on the more complex Microsoft COCO benchmark it is enhanced by up to 6.1 percent. Remarkably, in the 1-shot scenario, the AP50 of our model exceeds that of the baseline model in the 10-shot setting within the PASCAL VOC dataset, confirming the efficacy of our proposed method.
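The abstract's sample-synthesis step (instance clipping followed by data augmentation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `clip_instance`, `horizontal_flip`, and `synthesize` are hypothetical, the image is a plain nested list of grayscale values, and a horizontal flip stands in for the unspecified "various data-augmentation techniques". Each synthesized patch carries only a class label, which is consistent with the abstract's point that synthetic samples train the classification head without involving localization.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values (assumption)
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max side exclusive


def clip_instance(image: Image, box: Box) -> Image:
    """Crop the region covered by one annotated bounding box."""
    x0, y0, x1, y1 = box
    height, width = len(image), len(image[0])
    # Clamp the box to the image bounds so slightly-off annotations still work.
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)
    if x0 >= x1 or y0 >= y1:
        raise ValueError("box does not overlap the image")
    return [row[x0:x1] for row in image[y0:y1]]


def horizontal_flip(patch: Image) -> Image:
    """Mirror a patch left to right (one simple augmentation, chosen for illustration)."""
    return [row[::-1] for row in patch]


def synthesize(image: Image, annotations: List[Tuple[Box, str]]) -> List[Tuple[Image, str]]:
    """Turn each annotated instance into two classification samples:
    the clipped patch and its flipped copy, both with the original label."""
    samples = []
    for box, label in annotations:
        patch = clip_instance(image, box)
        samples.append((patch, label))
        samples.append((horizontal_flip(patch), label))
    return samples


if __name__ == "__main__":
    image = [
        [0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
    ]
    for patch, label in synthesize(image, [((1, 0, 3, 2), "cat")]):
        print(label, patch)
```

In a real pipeline the patches would be image tensors and the augmentations would likely include scaling, color jitter, and similar transforms; the sketch only shows how one annotated instance can be multiplied into several label-only training samples.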


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9500/11645051/6267fd91b539/sensors-24-07833-g001.jpg

Similar articles

Extreme R-CNN: Few-Shot Object Detection via Sample Synthesis and Knowledge Distillation.

Sensors (Basel). 2024 Dec 7;24(23):7833. doi: 10.3390/s24237833.

Proposal Distribution Calibration for Few-Shot Object Detection.

IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):1911-1918. doi: 10.1109/TNNLS.2023.3331648. Epub 2025 Jan 7.

ECEA: Extensible Co-Existing Attention for Few-Shot Object Detection.

IEEE Trans Image Process. 2024;33:5564-5576. doi: 10.1109/TIP.2024.3411771. Epub 2024 Oct 4.

Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection.

IEEE Trans Pattern Anal Mach Intell. 2025 Aug;47(8):6496-6514. doi: 10.1109/TPAMI.2025.3560033.

Category Knowledge-Guided Parameter Calibration for Few-Shot Object Detection.

IEEE Trans Image Process. 2023;32:1092-1107. doi: 10.1109/TIP.2023.3239197. Epub 2023 Feb 3.

Expandable-RCNN: toward high-efficiency incremental few-shot object detection.

Front Artif Intell. 2024 Apr 23;7:1377337. doi: 10.3389/frai.2024.1377337. eCollection 2024.

One-shot segmentation of novel white matter tracts via extensive data augmentation and adaptive knowledge transfer.

Med Image Anal. 2023 Dec;90:102968. doi: 10.1016/j.media.2023.102968. Epub 2023 Sep 15.

Localization Distillation for Object Detection.

IEEE Trans Pattern Anal Mach Intell. 2023 Aug;45(8):10070-10083. doi: 10.1109/TPAMI.2023.3248583. Epub 2023 Jun 30.

Decoupled Metric Network for Single-Stage Few-Shot Object Detection.

IEEE Trans Cybern. 2023 Jan;53(1):514-525. doi: 10.1109/TCYB.2022.3149825. Epub 2022 Dec 23.

Synthesizing Knowledge-Enhanced Features for Real-World Zero-Shot Food Detection.

IEEE Trans Image Process. 2024;33:1285-1298. doi: 10.1109/TIP.2024.3360899. Epub 2024 Feb 13.

Cited by

FIAEPI-KD: A novel knowledge distillation approach for precise detection of missing insulators in transmission lines.

PLoS One. 2025 May 30;20(5):e0324524. doi: 10.1371/journal.pone.0324524. eCollection 2025.

References

Few-Shot Object Detection and Viewpoint Estimation for Objects in the Wild.

IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3090-3106. doi: 10.1109/TPAMI.2022.3174072. Epub 2023 Feb 3.

Focal Loss for Dense Object Detection.

IEEE Trans Pattern Anal Mach Intell. 2020 Feb;42(2):318-327. doi: 10.1109/TPAMI.2018.2858826. Epub 2018 Jul 23.

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.

IEEE Trans Pattern Anal Mach Intell. 2017 Jun;39(6):1137-1149. doi: 10.1109/TPAMI.2016.2577031. Epub 2016 Jun 6.
