Liu Chunsheng, Guo Yu, Li Shuang, Chang Faliang
School of Control Science and Engineering, Shandong University, Ji'nan 250061, China.
Sensors (Basel). 2019 Jun 13;19(12):2671. doi: 10.3390/s19122671.
The You Only Look Once (YOLO) deep network detects objects quickly and with high precision, and it has been applied successfully to many detection problems. Its main shortcoming is that it usually cannot achieve high precision on small objects in high-resolution images. To overcome this problem, we propose an effective region proposal extraction method for the YOLO network, forming a complete detection structure named ACF-PR-YOLO, and we use cyclist detection to demonstrate the method. Unlike most region proposal methods, which use the generated proposals directly for classification or regression, we generate large potential regions that contain objects and pass them to the subsequent deep network. The proposed ACF-PR-YOLO structure has three main parts. First, we propose a region proposal extraction method based on aggregated channel features (ACF), called the ACF-based region proposal (ACF-PR) method. In ACF-PR, ACF is first used to extract candidates quickly, and then a bounding-box merging and extending method combines the candidate boxes into suitable region proposals for the subsequent YOLO network. Second, we design a suitable YOLO network for fine detection within the region proposals generated by ACF-PR. Third, we design a post-processing step in which the YOLO results are mapped back into the original image to output the detection and localization results. Experiments on the Tsinghua-Daimler Cyclist Benchmark, which contains high-resolution images and complex scenes, show that the proposed method outperforms the other representative detection methods tested in average precision: it exceeds YOLOv3 by 13.69% and SSD by 25.27% in average precision.
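The merge-and-extend step and the final mapping back to the original image can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the merging rule, the extension amount, or the order of the two operations, so the overlap-based merge, the `margin` fraction, the merge-then-extend order, and all function names below are assumptions made for illustration only.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2) in pixel coordinates, with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes intersect or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def union(a: Box, b: Box) -> Box:
    """Smallest box that encloses both inputs."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_boxes(boxes: List[Box]) -> List[Box]:
    """Assumed rule: repeatedly fuse overlapping boxes until none overlap."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        out: List[Box] = []
        for box in merged:
            for i, kept in enumerate(out):
                if overlaps(box, kept):
                    out[i] = union(box, kept)
                    changed = True
                    break
            else:
                out.append(box)
        merged = out
    return merged


def extend_box(box: Box, margin: float, img_w: int, img_h: int) -> Box:
    """Assumed rule: grow each side by `margin` times the box size, clipped to the image."""
    x1, y1, x2, y2 = box
    dx = (x2 - x1) * margin
    dy = (y2 - y1) * margin
    return (max(0.0, x1 - dx), max(0.0, y1 - dy),
            min(float(img_w), x2 + dx), min(float(img_h), y2 + dy))


def propose_regions(candidates: List[Box], margin: float,
                    img_w: int, img_h: int) -> List[Box]:
    """Merge the candidate boxes, then extend each merged box into a larger region."""
    return [extend_box(b, margin, img_w, img_h) for b in merge_boxes(candidates)]


def map_to_original(det: Box, region: Box) -> Box:
    """Shift a detection from region-local coordinates into original-image coordinates."""
    ox, oy = region[0], region[1]
    return (det[0] + ox, det[1] + oy, det[2] + ox, det[3] + oy)


if __name__ == "__main__":
    # Hypothetical candidate boxes standing in for ACF detector output.
    candidates = [(100, 100, 140, 180), (130, 120, 170, 200), (1000, 400, 1030, 460)]
    regions = propose_regions(candidates, margin=0.5, img_w=2048, img_h=1024)
    print(regions)
    # A hypothetical detection found inside the first region crop.
    print(map_to_original((10, 20, 30, 60), regions[0]))
```

In this sketch each region would be cropped from the image and passed to the fine detector, and every detection is shifted by the top-left corner of its region. If the crop is resized before detection, the detection coordinates would also need to be rescaled before the shift, which the sketch omits.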