Department of Electrical Engineering, California Institute of Technology, MC 136-93, 1200 E. California Blvd., Pasadena, CA 91125, USA.
IEEE Trans Pattern Anal Mach Intell. 2012 Apr;34(4):743-61. doi: 10.1109/TPAMI.2011.155.
Pedestrian detection is a key problem in computer vision, with several applications that have the potential to positively impact quality of life. In recent years, the number of approaches to detecting pedestrians in monocular images has grown steadily. However, multiple data sets and widely varying evaluation protocols are used, making direct comparisons difficult. To address these shortcomings, we perform an extensive evaluation of the state of the art in a unified framework. We make three primary contributions: 1) we put together a large, well-annotated, and realistic monocular pedestrian detection data set and study the statistics of the size, position, and occlusion patterns of pedestrians in urban scenes, 2) we propose a refined per-frame evaluation methodology that allows us to carry out probing and informative comparisons, including measuring performance in relation to scale and occlusion, and 3) we evaluate the performance of sixteen pretrained state-of-the-art detectors across six data sets. Our study allows us to assess the state of the art and provides a framework for gauging future efforts. Our experiments show that despite significant progress, performance still has much room for improvement. In particular, detection is disappointing at low resolutions and for partially occluded pedestrians.
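The abstract does not spell out the per-frame evaluation protocol, so the following is only an illustrative sketch of what such an evaluation might look like. It assumes a common convention: each detection is matched to at most one ground-truth box, greedily in order of descending score, and a match requires an intersection-over-union above 0.5. The names `iou`, `match_frame`, and `evaluate`, the `(x, y, width, height)` box layout, and the 0.5 threshold are assumptions made for this example and are not taken from the text above.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x, y, width, height)
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)


def match_frame(gt: List[Box], dets: List[Detection],
                thr: float = 0.5) -> Tuple[int, int, int]:
    """Return (true_positives, false_positives, misses) for one frame."""
    matched = [False] * len(gt)
    tp = fp = 0
    # Higher-scoring detections get first pick of the ground-truth boxes.
    for box, _score in sorted(dets, key=lambda d: d[1], reverse=True):
        best, best_iou = -1, thr
        for i, g in enumerate(gt):
            if not matched[i]:
                overlap = iou(box, g)
                if overlap > best_iou:
                    best, best_iou = i, overlap
        if best >= 0:
            matched[best] = True
            tp += 1
        else:
            fp += 1
    return tp, fp, matched.count(False)


def evaluate(frames: List[Tuple[List[Box], List[Detection]]]) -> Tuple[float, float]:
    """Aggregate over frames; return (miss_rate, false_positives_per_image)."""
    tp = fp = miss = 0
    for gt, dets in frames:
        t, f, m = match_frame(gt, dets)
        tp, fp, miss = tp + t, fp + f, miss + m
    n_gt = tp + miss
    miss_rate = miss / n_gt if n_gt else 0.0
    fppi = fp / len(frames) if frames else 0.0
    return miss_rate, fppi


if __name__ == "__main__":
    frames = [
        ([(0, 0, 10, 20)], [((0, 0, 10, 20), 0.9), ((50, 50, 10, 20), 0.4)]),
        ([(5, 5, 10, 20)], []),
    ]
    print(evaluate(frames))
```

Counting errors per frame rather than per classified window means a detector is judged on full images, which is what makes results comparable across detectors and data sets under one protocol.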