IEEE Trans Pattern Anal Mach Intell. 2018 Apr;40(4):973-986. doi: 10.1109/TPAMI.2017.2700460. Epub 2017 May 2.
Encouraged by the recent progress in pedestrian detection, we investigate the gap between current state-of-the-art methods and the "perfect single frame detector". We enable our analysis by creating a human baseline for pedestrian detection (over the Caltech pedestrian dataset). After manually clustering the frequent errors of a top detector, we characterise both localisation and background-versus-foreground errors. To address localisation errors, we study the impact of training annotation noise on detector performance and show that we can improve results even with only a small portion of sanitised training data. To address background/foreground discrimination, we study convnets for pedestrian detection and discuss which factors affect their performance. Beyond our in-depth analysis, we report top performance on the Caltech pedestrian dataset and provide a new sanitised set of training and test annotations.