

Anchor Generation Optimization and Region of Interest Assignment for Vehicle Detection.

Affiliations

State Key Laboratory of Automotive Simulation and Control, Jilin University, Changchun 130025, China.

Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing 100191, China.

Publication information

Sensors (Basel). 2019 Mar 3;19(5):1089. doi: 10.3390/s19051089.

Abstract

Region proposal network (RPN) based object detection, such as Faster Regions with CNN (Faster R-CNN), has gained considerable attention due to its high accuracy and fast speed. However, it leaves room for improvement in specialized applications such as on-board vehicle detection. The original RPN places multiscale anchors uniformly at every pixel of the last feature map and uses a single pixel of that map to classify each anchor as foreground or background. In the original Faster R-CNN, the receptive field of each pixel in the last feature map is fixed and does not match the anchor size. Hence, for large vehicles the feature sees only part of the object, while for small vehicles it contains too much irrelevant information, which reduces detection accuracy. Furthermore, perspective projection makes the size of a vehicle's bounding box depend on its position in the image, so uniform anchor generation becomes less effective; this reduces both detection accuracy and computing speed. After the region proposal stage, many regions of interest (ROIs) are generated. The ROI pooling layer projects each ROI onto the last feature map and forms a new fixed-size feature map for final classification and box regression. The number of feature map pixels in the projected region can also influence detection performance, but previous works do not control it precisely. This paper optimizes the original Faster R-CNN specifically for on-board vehicle detection to address the problems above. The proposed method is tested on the KITTI dataset, and the results show a significant improvement without extensive parameter tuning or training tricks. The proposed method can also be applied to other objects with obvious foreshortening effects, such as on-board pedestrian detection. Its basic idea does not depend on a concrete implementation, so most deep learning based object detectors with multiscale feature maps can be optimized with it.
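To make the two baseline mechanisms in the abstract concrete, the following is a minimal sketch of uniform anchor generation and of projecting an ROI onto the last feature map. It illustrates the baseline behaviour only, not the optimization the paper proposes. The function names, the stride of 16, the scales (128, 256, 512), the aspect ratios (0.5, 1, 2), and the floor/ceil rounding of the projected box are all assumptions chosen for illustration; none of them is stated in the abstract.

```python
import math


def generate_uniform_anchors(feat_h, feat_w, stride=16,
                             scales=(128, 256, 512),
                             ratios=(0.5, 1.0, 2.0)):
    """Place every scale/ratio combination at every feature-map cell.

    Returns a list of (x1, y1, x2, y2) boxes in input-image pixels.
    All default values are illustrative assumptions.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Centre of this feature-map cell in image coordinates.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    # ratio = height / width; the area stays scale ** 2.
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors


def projected_roi_cells(roi, stride=16):
    """Count the feature-map cells covered by an ROI after projection.

    The floor/ceil quantisation is an assumed rounding rule.
    """
    x1, y1, x2, y2 = roi
    col0, row0 = math.floor(x1 / stride), math.floor(y1 / stride)
    col1, row1 = math.ceil(x2 / stride), math.ceil(y2 / stride)
    return max(col1 - col0, 0) * max(row1 - row0, 0)


if __name__ == "__main__":
    anchors = generate_uniform_anchors(feat_h=2, feat_w=3)
    print(len(anchors))                              # anchors on a 2 x 3 map
    print(projected_roi_cells((0, 0, 32, 32)))       # small, distant vehicle
    print(projected_roi_cells((0, 0, 320, 160)))     # large, nearby vehicle
```

Under these assumptions every cell receives the same nine anchors regardless of where it sits in the image, and a small ROI projects onto only a handful of feature-map cells while a large one covers hundreds. These are the two effects the abstract identifies as limiting accuracy for on-board vehicle detection.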


