School of Software Engineering, South China University of Technology, Guangzhou 510006, China.
Sensors (Basel). 2021 May 11;21(10):3312. doi: 10.3390/s21103312.
Bounding-box representation and regression have long dominated CNN-based pedestrian detectors. Despite its success, a bounding box cannot represent location accurately and introduces unnecessary background information, whereas pedestrian features are concentrated mainly around the axis-line. Other object representations, such as corner pairs, are difficult to obtain by regression because the corners lie far from the axis-line and are strongly affected by background features. In this paper, we propose a novel detection pattern, named Axis-line Representation and Regression (ALR), for pedestrian detection in road scenes. Specifically, we design a 3-d axis-line representation for pedestrians and use it as the regression target during network training. We also propose a line-box transformation method to accommodate the widely used box annotations. In addition, we explore the influence of the deformable convolution base-offset on detection performance and propose a base-offset initialization strategy to further increase the gain brought by ALR. Notably, the proposed ALR pattern can be introduced into both anchor-based and anchor-free frameworks. We validate the effectiveness of ALR on the Caltech-USA and CityPersons datasets. Experimental results show that, with simple modifications, our approach significantly outperforms the baseline and achieves accuracy competitive with other methods without bells and whistles.
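The abstract does not spell out the 3-d axis-line parameterization or the line-box transformation, so the following is only an illustrative sketch under stated assumptions. It assumes the axis-line is a vertical segment described by three numbers (its horizontal position, top, and bottom) and that boxes are recovered with a fixed width-to-height ratio of 0.41, the ratio commonly used to standardize pedestrian boxes in the Caltech and CityPersons benchmarks. The names `AxisLine`, `box_to_line`, `line_to_box`, and `ASPECT_RATIO` are hypothetical and do not come from the paper.

```python
from typing import NamedTuple, Tuple

# Assumed width/height ratio of a pedestrian box; not stated in the abstract.
ASPECT_RATIO = 0.41


class AxisLine(NamedTuple):
    """Hypothetical 3-parameter vertical axis-line of a pedestrian."""
    x: float         # horizontal position of the axis-line
    y_top: float     # top end of the line (head)
    y_bottom: float  # bottom end of the line (feet)


def box_to_line(x1: float, y1: float, x2: float, y2: float) -> AxisLine:
    """Turn a box annotation (x1, y1, x2, y2) into an axis-line target."""
    return AxisLine((x1 + x2) / 2.0, y1, y2)


def line_to_box(line: AxisLine,
                aspect_ratio: float = ASPECT_RATIO
                ) -> Tuple[float, float, float, float]:
    """Recover a box from an axis-line, assuming a fixed aspect ratio."""
    height = line.y_bottom - line.y_top
    half_width = aspect_ratio * height / 2.0
    return (line.x - half_width, line.y_top,
            line.x + half_width, line.y_bottom)
```

Under these assumptions, a box whose width is exactly 0.41 times its height survives the round trip unchanged, while any other box is replaced by one of standardized width centered on the same axis-line.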