School of Computer, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023, China.
School of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023, China.
Comput Intell Neurosci. 2022 Jul 14;2022:6106853. doi: 10.1155/2022/6106853. eCollection 2022.
Images captured by UAVs (unmanned aerial vehicles) contain small pedestrian targets whose key information is lost after repeated downsampling, a problem existing methods struggle to overcome. We propose an improved YOLOv4 model for pedestrian detection and counting in UAV images, named YOLO-CC. We use a lightweight YOLOv4 for pedestrian detection, replacing the backbone with CSPDarknet-34 and fusing two feature layers with an FPN (Feature Pyramid Network). We expand the receptive field with multiscale convolutions on the high-level feature map and generate a crowd density map by feature dimension reduction. By embedding density map generation into the network for end-to-end training, our model effectively improves detection and counting accuracy and focuses feature extraction on small targets. Our experiments demonstrate that YOLO-CC achieves an AP 21.76 points higher than the original YOLOv4 on the VisDrone2021-counting data set while also running faster.
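The abstract describes a density-map branch built from multiscale convolutions over a high-level feature map followed by feature dimension reduction. The sketch below is a minimal, hypothetical PyTorch illustration of that idea, not the authors' released code: the module name DensityMapHead, the kernel sizes (1, 3, 5), and the channel widths are assumptions chosen only to make the structure concrete.

```python
import torch
import torch.nn as nn

class DensityMapHead(nn.Module):
    """Illustrative density-map branch: parallel multiscale convolutions
    over a high-level feature map, then 1x1 dimension reduction to a
    single-channel crowd density map (layer sizes are assumptions)."""

    def __init__(self, in_channels: int = 256):
        super().__init__()
        # Parallel branches with different kernel sizes enlarge the
        # effective receptive field (hypothetical choice of 1/3/5 kernels).
        self.branches = nn.ModuleList([
            nn.Conv2d(in_channels, 64, kernel_size=k, padding=k // 2)
            for k in (1, 3, 5)
        ])
        # 1x1 convolutions reduce the concatenated features to one channel.
        self.reduce = nn.Sequential(
            nn.Conv2d(64 * 3, 64, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, kernel_size=1),
            nn.ReLU(inplace=True),  # density values are non-negative
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        multiscale = torch.cat([branch(feat) for branch in self.branches], dim=1)
        return self.reduce(multiscale)


if __name__ == "__main__":
    # A high-level feature map such as one FPN output (batch, C, H, W).
    feat = torch.randn(1, 256, 40, 40)
    density = DensityMapHead(in_channels=256)(feat)
    # The predicted pedestrian count is the sum (integral) of the density map.
    print(density.shape, float(density.sum()))
```

In this kind of design, the density head shares the detector's backbone features, so training it end to end alongside the detection losses pushes the shared feature extractor toward small-target evidence, which is the effect the abstract attributes to embedding density map generation into the network.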