Extreme R-CNN: Few-Shot Object Detection via Sample Synthesis and Knowledge Distillation.

Authors

Zhang Shenyong, Wang Wenmin, Wang Zhibing, Li Honglei, Li Ruochen, Zhang Shixiong

Affiliations

School of Computer Science and Engineering, Macau University of Science and Technology, Macau 999078, China.

School of Computer Technology, Beijing Institute of Technology, Zhuhai 519088, China.

Publication information

Sensors (Basel). 2024 Dec 7;24(23):7833. doi: 10.3390/s24237833.

Abstract

Traditional object detectors require extensive instance-level annotations for training. Conversely, few-shot object detectors, which are generally fine-tuned using limited data from unknown classes, tend to show biases toward base categories and are susceptible to variations within these unknown samples. To mitigate these challenges, we introduce a Two-Stage Fine-Tuning Approach (TFA) named Extreme R-CNN, designed to operate effectively with extremely limited original samples through the integration of sample synthesis and knowledge distillation. Our approach involves synthesizing new training examples via instance clipping and employing various data-augmentation techniques. We enhance the Faster R-CNN architecture by decoupling the regression and classification components of the Region of Interest (RoI), allowing synthetic samples to train the classification head independently of the object-localization process. Comprehensive evaluations on the Microsoft COCO and PASCAL VOC datasets demonstrate significant improvements over baseline methods. Specifically, on the PASCAL VOC dataset, the average precision for novel categories is enhanced by up to 15 percent, while on the more complex Microsoft COCO benchmark it is enhanced by up to 6.1 percent. Remarkably, in the 1-shot scenario, the AP50 of our model exceeds that of the baseline model in the 10-shot setting within the PASCAL VOC dataset, confirming the efficacy of our proposed method.
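The sample-synthesis step can be illustrated with a toy sketch: each annotated instance is clipped from the image and augmented, giving extra labelled patches that could train a classification head without any localization target. This is a minimal illustration only, not the authors' implementation. The names `clip_instance`, `hflip`, and `synthesize` are hypothetical, the image is a plain nested list, and a horizontal flip stands in for the unspecified "various data-augmentation techniques".

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max exclusive


def clip_instance(image: Image, box: Box) -> Image:
    """Crop the pixels inside one annotated bounding box."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def hflip(patch: Image) -> Image:
    """Horizontal flip; an assumed stand-in for the paper's augmentations."""
    return [row[::-1] for row in patch]


def synthesize(
    image: Image, annotations: List[Tuple[Box, str]]
) -> List[Tuple[Image, str]]:
    """Return (patch, label) pairs: each clipped instance plus a flipped copy.

    The patches carry only a class label, no box target, which mirrors the
    idea of training the classification head independently of localization.
    """
    samples = []
    for box, label in annotations:
        patch = clip_instance(image, box)
        samples.append((patch, label))
        samples.append((hflip(patch), label))
    return samples


# Toy 4x6 image whose pixel at row r, column c has value r * 10 + c.
image = [[r * 10 + c for c in range(6)] for r in range(4)]
samples = synthesize(image, [((1, 1, 4, 3), "cat")])
```

Here one annotated instance yields two training patches. A real pipeline would operate on image tensors and use a richer augmentation set.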


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9500/11645051/6267fd91b539/sensors-24-07833-g001.jpg
