
Too Far to See? Not Really! - Pedestrian Detection With Scale-Aware Localization Policy.

Publication Information

IEEE Trans Image Process. 2018 Aug;27(8):3703-3715. doi: 10.1109/TIP.2018.2818018.

Abstract

A major bottleneck of pedestrian detection lies in the sharp performance deterioration in the presence of small-size pedestrians that are relatively far from the camera. Motivated by the observation that pedestrians at disparate spatial scales exhibit distinct visual appearances, we propose in this paper an active pedestrian detector that explicitly operates over multiple-layer neuronal representations of the input still image. More specifically, convolutional neural networks, namely ResNet and Faster R-CNN, are exploited to provide a rich and discriminative hierarchy of feature representations, as well as initial pedestrian proposals. Each pedestrian observation of a distinct size can be best characterized by the ResNet feature representation at a certain layer of this hierarchy. Meanwhile, initial pedestrian proposals are obtained with the Faster R-CNN techniques, i.e., a region proposal network and a follow-up region-of-interest pooling layer attached right after the specific ResNet convolutional layer of interest, which jointly predict the bounding-box proposals' locations and categories (i.e., pedestrian or not). These proposals serve as input to our active detector: for each initial pedestrian proposal, a sequence of coordinate transformation actions is carried out to determine its proper x-y 2D location and its layer of feature representation, or the proposal is eventually terminated as background. Empirically, our approach produces lower overall detection errors on widely used benchmarks and works particularly well on far-scale pedestrians. For example, compared with the 60.51% log-average miss rate of the state-of-the-art MS-CNN on far-scale pedestrians (those below 80 pixels in bounding-box height) of the Caltech benchmark, our approach attains a miss rate of 41.85%, a notable reduction of 18.66 percentage points.
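The abstract only sketches the control loop at a high level: each Faster R-CNN proposal is repeatedly shifted in the x-y plane or re-assigned to a different ResNet feature layer until it is accepted as a pedestrian or rejected as background. The code below is a minimal, illustrative sketch of such a loop, not the authors' implementation; the action set (ACTIONS), the helpers roi_feature and refine_proposal, the (feature map, stride) pyramid format, and the policy_net callable are all assumptions of ours, since the paper's action space and training procedure are not given here.

```python
# Illustrative sketch of a scale-aware localization policy (hypothetical names).
import numpy as np

ACTIONS = ["left", "right", "up", "down", "finer_layer", "coarser_layer",
           "accept", "reject"]

def roi_feature(feature_pyramid, box, layer):
    """Average-pool the features under `box` (image coordinates) at one layer."""
    fmap, stride = feature_pyramid[layer]          # (H, W, C) map and its stride
    x0, y0, x1, y1 = (np.asarray(box, dtype=float) / stride).astype(int)
    h, w = fmap.shape[:2]
    x0, x1 = np.clip([x0, x1], 0, w - 1)
    y0, y1 = np.clip([y0, y1], 0, h - 1)
    return fmap[y0:y1 + 1, x0:x1 + 1].mean(axis=(0, 1))

def refine_proposal(box, feature_pyramid, policy_net, step=4, max_steps=10):
    """Refine one initial proposal with a sequence of discrete actions."""
    box = np.asarray(box, dtype=float)             # [x0, y0, x1, y1]
    layer = len(feature_pyramid) // 2              # start at a mid-level layer
    for _ in range(max_steps):
        feat = roi_feature(feature_pyramid, box, layer)
        action = ACTIONS[int(np.argmax(policy_net(feat)))]
        if action == "accept":                     # keep as a pedestrian
            return box, layer, "pedestrian"
        if action == "reject":                     # terminate as background
            return box, layer, "background"
        if action in ("left", "right"):            # shift the box horizontally
            dx = -step if action == "left" else step
            box = box + [dx, 0, dx, 0]
        elif action in ("up", "down"):             # shift the box vertically
            dy = -step if action == "up" else step
            box = box + [0, dy, 0, dy]
        elif action == "finer_layer":              # pool from a finer layer
            layer = max(layer - 1, 0)
        else:                                      # "coarser_layer"
            layer = min(layer + 1, len(feature_pyramid) - 1)
    return box, layer, "pedestrian"                # fall back after max_steps

# Toy usage: a 3-level pyramid and a random policy stand in for real ResNet
# features and a trained policy network.
rng = np.random.default_rng(0)
pyramid = [(rng.standard_normal((128 // s, 160 // s, 8)), s) for s in (4, 8, 16)]
random_policy = lambda feat: rng.standard_normal(len(ACTIONS))
print(refine_proposal([40, 30, 60, 90], pyramid, random_policy))
```

A real system would replace the average-pooling shorthand with proper RoI pooling over ResNet features and learn policy_net, e.g., with a reinforcement-learning objective rewarding better localization, rather than using the random stand-in above.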

