Hierarchical Regression and Classification for Accurate Object Detection.

Author information

Cao Jiale, Pang Yanwei, Han Jungong, Li Xuelong

Publication information

IEEE Trans Neural Netw Learn Syst. 2023 May;34(5):2425-2439. doi: 10.1109/TNNLS.2021.3106641. Epub 2023 May 2.

Abstract

Accurate object detection requires both correct classification and high-quality localization. Most current single-shot detectors (SSDs) perform classification and regression simultaneously with a fully convolutional network. Although efficient, this structure has two design flaws that limit detection accuracy. The first is a classification mismatch: at inference, the classification results computed for the default bounding boxes are improperly treated as the results of the regressed bounding boxes. The second is that a single regression step is not sufficient for high-quality localization. To resolve the classification mismatch, we propose a novel reg-offset-cls (ROC) module consisting of three hierarchical steps: regression of the default bounding box, prediction of new feature sampling locations, and classification of the regressed bounding box using more accurate features. For high-quality localization, we stack two ROC modules, so that the output of the first ROC module is the input of the second. In addition, we insert a feature enhanced (FE) module between the two stacked ROC modules to extract more contextual information. Experiments on three datasets (MS COCO, PASCAL VOC, and UAVDT) demonstrate the effectiveness and superiority of our method. Without bells and whistles, the proposed method outperforms state-of-the-art one-stage methods at real-time speed. The source code is available at https://github.com/JialeCao001/HSD.
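
The two ideas in the abstract, the mismatch between default and regressed boxes and the stacking of two regression steps, can be illustrated with a small numeric sketch. This is not the authors' implementation (that is in the repository linked above). The `(cx, cy, w, h)` box format, the SSD-style offset parametrization in `decode`, the fixed 3x3 `sampling_grid`, and all numeric values are assumptions chosen for illustration; the abstract does not specify how the new sampling locations are predicted.

```python
import math

def decode(box, delta):
    """Apply (dx, dy, dw, dh) offsets to a (cx, cy, w, h) box.
    Assumed SSD-style parametrization; the abstract does not state one."""
    cx, cy, w, h = box
    dx, dy, dw, dh = delta
    return (cx + dx * w, cy + dy * h, w * math.exp(dw), h * math.exp(dh))

def iou(a, b):
    """Intersection over union of two (cx, cy, w, h) boxes."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def sampling_grid(box, k=3):
    """Hypothetical stand-in for 'new feature sampling locations':
    a k x k grid of points spread over the box, row by row."""
    cx, cy, w, h = box
    steps = [(i - (k - 1) / 2) / k for i in range(k)]  # k=3: -1/3, 0, 1/3
    return [(cx + sx * w, cy + sy * h) for sy in steps for sx in steps]

# Illustrative values only (not taken from the paper).
default_box = (50.0, 50.0, 40.0, 40.0)
ground_truth = (60.0, 55.0, 40.0, 40.0)
delta_1 = (0.2, 0.1, 0.0, 0.0)     # offsets from the first regression step
delta_2 = (0.05, 0.025, 0.0, 0.0)  # offsets from the second regression step

# Stacked regression: the second step starts from the first step's output.
box_1 = decode(default_box, delta_1)
box_2 = decode(box_1, delta_2)

ious = [iou(b, ground_truth) for b in (default_box, box_1, box_2)]
print("IoU with ground truth (default, step 1, step 2):", ious)

# Classification mismatch: features sampled around the default box are
# centered somewhere other than the box that is finally reported.
print("grid center on default box:  ", sampling_grid(default_box)[4])
print("grid center on regressed box:", sampling_grid(box_1)[4])
```

In this toy setting the overlap with the ground truth grows after each regression step, and the sampling grid moves with the regressed box, so a classifier reading features at the new locations scores the box that is actually output rather than the default one.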
