Video Processing and Understanding Laboratory (VPULab), Universidad Autónoma de Madrid, 28049 Madrid, Spain.
Sensors (Basel). 2018 Dec 11;18(12):4385. doi: 10.3390/s18124385.
Finding optimal parametrizations for people detectors is a complicated task due to the large number of parameters and the high variability of application scenarios. In this paper, we propose a framework to adapt and improve any detector automatically in multi-camera scenarios where people are observed from various viewpoints. By accurately transferring detector results between camera viewpoints and by self-correlating these transferred results, the best configuration for each detector-viewpoint pair is identified online without requiring any additional manually labeled ground truth apart from that used for the offline training of the detection model. In this paper, the configuration is the confidence detection threshold present in every people detector, which is a critical parameter affecting detection performance. The experimental results demonstrate that the proposed framework improves the performance of four different state-of-the-art detectors (DPM, ACF, Faster R-CNN, and YOLO9000) whose Optimal Fixed Thresholds (OFTs) have been determined and fixed at training time using standard datasets.
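As a rough illustration of the idea summarized above, the following Python sketch picks a per-camera detection threshold by maximizing agreement with detections transferred from other viewpoints. It is not the method of the paper: the function names (project_to_ground, agreement, select_threshold), the use of ground-plane homographies to transfer detections, the F1-style agreement score, and the matching radius are all assumptions made for this example. It also assumes the other cameras' detections are already available as ground-plane points.

```python
import numpy as np


def project_to_ground(points, H):
    """Map Nx2 image points (e.g. feet positions) to the ground plane
    using a 3x3 homography H (assumed known from camera calibration)."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ np.asarray(H, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]


def agreement(own, others, radius):
    """F1-style agreement between two sets of ground-plane points.
    A point counts as matched if some point of the other set lies
    within `radius`. This is a stand-in for a correlation measure."""
    if len(own) == 0 or len(others) == 0:
        return 0.0
    dists = np.linalg.norm(own[:, None, :] - others[None, :, :], axis=2)
    precision = np.mean(dists.min(axis=1) <= radius)
    recall = np.mean(dists.min(axis=0) <= radius)
    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))


def select_threshold(own_pts, own_scores, H_own, other_ground,
                     candidates, radius=0.5):
    """Return the candidate threshold (and its agreement score) whose
    surviving detections best agree with the transferred detections."""
    ground = project_to_ground(own_pts, H_own)
    scores = np.asarray(own_scores, dtype=float)
    other_ground = np.asarray(other_ground, dtype=float).reshape(-1, 2)
    best_t, best_a = candidates[0], -1.0
    for t in candidates:
        a = agreement(ground[scores >= t], other_ground, radius)
        if a > best_a:
            best_t, best_a = t, a
    return best_t, best_a
```

In this sketch no labeled ground truth is consulted when choosing the threshold; the only supervision comes from the other viewpoints. A fuller version would presumably adapt the thresholds of all cameras jointly rather than treating the other cameras' detections as fixed.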