

Learning Multilayer Channel Features for Pedestrian Detection.

Publication Information

IEEE Trans Image Process. 2017 Jul;26(7):3210-3220. doi: 10.1109/TIP.2017.2694224. Epub 2017 Apr 26.

Abstract

Pedestrian detection based on the combination of a convolutional neural network (CNN) and traditional handcrafted features (i.e., HOG+LUV) has achieved great success. In general, HOG+LUV are used to generate candidate proposals and a CNN then classifies these proposals. Despite this success, there is still room for improvement. For example, the CNN classifies these proposals using the fully connected layer features, while the proposal scores and the features in the inner layers of the CNN are ignored. In this paper, we propose a unifying framework called multilayer channel features (MCF) to overcome this drawback. It first integrates HOG+LUV with each layer of the CNN into multilayer image channels. Based on the multilayer image channels, a multi-stage cascade AdaBoost is then learned. The weak classifiers in each stage of the multi-stage cascade are learned from the image channels of the corresponding layer. Experiments are conducted on the Caltech, INRIA, ETH, TUD-Brussels, and KITTI data sets. With the more abundant features, MCF achieves the state of the art on the Caltech pedestrian data set (i.e., 10.40% miss rate). Using the new and accurate annotations, MCF achieves a 7.98% miss rate. As many non-pedestrian detection windows can be quickly rejected by the first few stages, detection is accelerated by 1.43 times. By further eliminating highly overlapped detection windows with lower scores after the first stage, MCF is 4.07 times faster with negligible performance loss.
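For intuition, below is a minimal sketch (not the authors' code) of the multi-stage cascade scoring idea described in the abstract: each stage scores a detection window with boosted weak classifiers over the image channels of its own layer (stage 0: HOG+LUV, later stages: CNN layers), the score accumulates across stages, and a window is rejected early when its running score falls below a stage threshold. The classes Stump, CascadeStage, and classify_window, the feature indexing, and the thresholds are all illustrative assumptions; real feature extraction and AdaBoost training are omitted.

```python
# Hypothetical sketch of a multi-stage soft cascade over per-layer feature channels.
import numpy as np

class Stump:
    """Decision stump: a weak classifier thresholding one channel feature."""
    def __init__(self, feature_index, threshold, weight):
        self.feature_index = feature_index
        self.threshold = threshold
        self.weight = weight

    def score(self, features):
        return self.weight if features[self.feature_index] > self.threshold else -self.weight

class CascadeStage:
    """One cascade stage, tied to the image channels of one layer."""
    def __init__(self, stumps, reject_threshold):
        self.stumps = stumps
        self.reject_threshold = reject_threshold

    def score(self, layer_features):
        return sum(s.score(layer_features) for s in self.stumps)

def classify_window(stages, layer_features_fn):
    """Accumulate stage scores; return None as soon as a stage rejects the window.

    layer_features_fn(k) is assumed to return the window's feature vector computed
    from layer k's channels (k = 0: HOG+LUV, k >= 1: CNN layers).
    """
    total = 0.0
    for k, stage in enumerate(stages):
        total += stage.score(layer_features_fn(k))
        if total < stage.reject_threshold:
            return None          # early rejection: costlier later layers are never computed
    return total                 # accepted: final detection score

# Toy usage with random features and hand-set stumps (illustration only).
rng = np.random.default_rng(0)
stages = [
    CascadeStage([Stump(0, 0.5, 1.0), Stump(3, 0.2, 0.5)], reject_threshold=-0.5),
    CascadeStage([Stump(1, 0.0, 1.0)], reject_threshold=0.0),
]
features_per_layer = {k: rng.standard_normal(8) for k in range(len(stages))}
print(classify_window(stages, lambda k: features_per_layer[k]))
```

The early-exit loop is what yields the reported speedups: most non-pedestrian windows exit after the cheap HOG+LUV stage, so the CNN-layer stages only run on the surviving candidates.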

