Yang Ze, Zhang Chi, Li Ruibo, Xu Yi, Lin Guosheng
IEEE Trans Image Process. 2023;32:321-334. doi: 10.1109/TIP.2022.3228162. Epub 2022 Dec 21.
Few-shot object detection (FSOD), which aims at learning a generic detector that can adapt to unseen tasks with scarce training samples, has witnessed consistent improvement recently. However, most existing methods ignore efficiency issues, e.g., high computational complexity and slow adaptation speed. Notably, efficiency has become an increasingly important evaluation metric for few-shot techniques due to an emerging trend toward embedded AI. To this end, we present an efficient pretrain-transfer framework (PTF) baseline with no computational increment, which achieves results comparable to previous state-of-the-art (SOTA) methods. Upon this baseline, we devise an initializer named knowledge inheritance (KI) to reliably initialize the novel weights for the box classifier, which effectively facilitates the knowledge transfer process and boosts the adaptation speed. Within the KI initializer, we propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights. Finally, our approach not only achieves SOTA results across three public benchmarks, i.e., PASCAL VOC, COCO and LVIS, but also exhibits high efficiency, with $1.8-100\times $ faster adaptation speed than other methods on the COCO/LVIS benchmarks during few-shot transfer. To the best of our knowledge, this is the first work to consider the efficiency problem in FSOD. We hope to motivate a trend toward the development of powerful yet efficient few-shot techniques. The code is publicly available at https://github.com/Ze-Yang/Efficient-FSOD.
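The abstract does not give the exact ALR formula, but the stated goal is to make the lengths (L2 norms) of the predicted novel classifier weight vectors consistent with those of the pretrained base weights. A minimal sketch of one plausible realization, assuming each novel weight vector is rescaled to the mean norm of the base weights (the function name and the mean-norm target are illustrative assumptions, not the paper's definition):

```python
import numpy as np

def adaptive_length_rescale(novel_weights: np.ndarray,
                            base_weights: np.ndarray) -> np.ndarray:
    """Rescale each predicted novel weight vector so its L2 norm
    matches the average L2 norm of the pretrained base weights.

    novel_weights: (n_novel, d) predicted classifier weights.
    base_weights:  (n_base, d) pretrained base-class weights.
    """
    # Target length: mean norm over all base-class weight vectors.
    target_norm = np.linalg.norm(base_weights, axis=1).mean()
    # Current lengths of the predicted novel vectors, kept as a column
    # so the division broadcasts per row.
    novel_norms = np.linalg.norm(novel_weights, axis=1, keepdims=True)
    # Normalize each novel vector to unit length, then stretch it
    # to the base-weight length scale.
    return novel_weights / novel_norms * target_norm
```

After this step, every novel weight vector has the same norm as a typical base weight, so classification logits for base and novel classes start on a comparable scale at the beginning of few-shot transfer.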