Few-Shot Object Detection and Viewpoint Estimation for Objects in the Wild.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3090-3106. doi: 10.1109/TPAMI.2022.3174072. Epub 2023 Feb 3.

Abstract

Detecting objects and estimating their viewpoints in images are key tasks in 3D scene understanding. Recent approaches have achieved excellent results on very large benchmarks for object detection and viewpoint estimation. However, performance still lags for novel object categories with few samples. In this paper, we tackle the problems of few-shot object detection and few-shot viewpoint estimation. We demonstrate on both tasks the benefits of guiding the network prediction with class-representative features extracted from data in different modalities: image patches for object detection, and aligned 3D models for viewpoint estimation. Despite its simplicity, our method outperforms state-of-the-art methods by a large margin on a range of datasets, including PASCAL and COCO for few-shot object detection, and Pascal3D+ and ObjectNet3D for few-shot viewpoint estimation. Furthermore, for cases where the 3D model is not available, we introduce a simple category-agnostic viewpoint estimation method that exploits geometric similarities and consistent pose labeling across different classes. While it moderately reduces performance, this approach still obtains better results than previous methods in this setting. Finally, for the first time, we tackle the combination of both few-shot tasks on three challenging benchmarks for viewpoint estimation in the wild (ObjectNet3D, Pascal3D+ and Pix3D), showing very promising results.

