Pedestrian Detection by Novel Axis-Line Representation and Regression Pattern.

Affiliation

School of Software Engineering, South China University of Technology, Guangzhou 510006, China.

Publication information

Sensors (Basel). 2021 May 11;21(10):3312. doi: 10.3390/s21103312.

Abstract

The pattern of bounding box representation and regression has long been dominant in CNN-based pedestrian detectors. Despite the method's success, it cannot accurately represent location, and introduces unnecessary background information, while pedestrian features are mainly located in axis-line areas. Other object representations, such as corner-pairs, are not easy to obtain by regression because the corners are far from the axis-line and are greatly affected by background features. In this paper, we propose a novel detection pattern, named Axis-line Representation and Regression (ALR), for pedestrian detection in road scenes. Specifically, we design a 3-d axis-line representation for pedestrians and use it as the regression target during network training. A line-box transformation method is also proposed to fit the widely used box-annotations. Meanwhile, we explore the influence of deformable convolution base-offset on detection performance and propose a base-offset initialization strategy to further promote the gain brought by ALR. Notably, the proposed ALR pattern can be introduced into both anchor-based and anchor-free frameworks. We validate the effectiveness of ALR on the Caltech-USA and CityPersons datasets. Experimental results show that our approach outperforms the baseline significantly through simple modifications and achieves competitive accuracy with other methods without bells and whistles.
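The abstract does not spell out the 3-d axis-line representation or the line-box transformation, so the following is only a minimal sketch under stated assumptions. It assumes the axis line is a vertical segment described by three values (centre x, top y, bottom y), and that a box is recovered from it with a fixed width-to-height ratio of 0.41, a convention commonly used for pedestrian annotations on Caltech and CityPersons. The names `box_to_line`, `line_to_box`, and `ASPECT_RATIO` are hypothetical and are not taken from the paper.

```python
from typing import Tuple

# Assumption: pedestrian boxes share one width/height ratio. The value 0.41
# is a common annotation convention, not a value stated in the abstract.
ASPECT_RATIO = 0.41

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), corner format
AxisLine = Tuple[float, float, float]    # (x_center, y_top, y_bottom)


def box_to_line(box: Box) -> AxisLine:
    """Collapse a box onto its vertical central axis (assumed 3-d form)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, y1, y2)


def line_to_box(line: AxisLine, aspect_ratio: float = ASPECT_RATIO) -> Box:
    """Expand an axis line back into a box using the assumed fixed ratio."""
    x_center, y_top, y_bottom = line
    half_width = aspect_ratio * (y_bottom - y_top) / 2.0
    return (x_center - half_width, y_top, x_center + half_width, y_bottom)


if __name__ == "__main__":
    box = (79.5, 50.0, 120.5, 150.0)  # 41 px wide, 100 px tall
    line = box_to_line(box)
    print(line)
    print(line_to_box(line))
```

Under these assumptions the round trip is lossless only for boxes that already have the assumed aspect ratio; for any other box the recovered width differs from the annotated one, while the centre x, top, and bottom are preserved.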

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/371d/8150842/54807fc893f6/sensors-21-03312-g001.jpg

Similar articles

3
Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection.
IEEE Trans Pattern Anal Mach Intell. 2021 Apr;43(4):1452-1459. doi: 10.1109/TPAMI.2020.2974745. Epub 2021 Mar 5.
