School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, China.
College of Electronics and Information Engineering, Tongji University, Shanghai 200092, China.
Sensors (Basel). 2022 Aug 10;22(16):5980. doi: 10.3390/s22165980.
Despite the rapid development of pedestrian detection algorithms, a good balance between detection accuracy and efficiency remains elusive, because the low computing power of edge GPUs limits the number of model parameters. To address this issue, we propose YOLOv4-TP-Tiny, based on the YOLOv4 model, which mainly comprises two modules: two-dimensional attention (TA) and pedestrian-based feature extraction (PFM). First, we integrate the TA mechanism into the backbone network, which increases the network's attention to the visible area of pedestrians and improves detection accuracy. Then, the PFM replaces the original spatial pyramid pooling (SPP) structure in YOLOv4 to yield the YOLOv4-TP algorithm, which adapts to pedestrians of different sizes and so achieves higher detection accuracy. To maintain detection speed, we replace standard convolution with a ghost network that incorporates the TA mechanism, producing more feature maps with fewer parameters. We also construct a one-way multi-scale feature fusion structure to replace the down-sampling process, thereby reducing network parameters and yielding the YOLOv4-TP-Tiny model. Experimental results show that YOLOv4-TP-Tiny achieves 58.3% AP at 31 FPS on the WiderPerson pedestrian dataset. Under the same hardware conditions and on the same dataset, YOLOv4-tiny achieves 55.9% AP at 29 FPS.
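To illustrate why a ghost network can produce feature maps with fewer parameters, the following minimal sketch compares the weight count of a standard convolution with that of a GhostNet-style module. The abstract does not give layer sizes or module settings, so the channel counts, the kernel size `k`, the ratio `s`, and the cheap-operation kernel size `d` are all hypothetical values chosen for illustration. The function names are likewise illustrative, and the TA mechanism is not modeled.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, s=2, d=3):
    """Weights in a GhostNet-style module (bias ignored).

    A primary k x k convolution produces c_out // s intrinsic maps.
    Each intrinsic map then yields s - 1 extra "ghost" maps through a
    cheap d x d depthwise convolution. s and d are assumed values.
    """
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by s in this sketch")
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k      # ordinary convolution
    cheap = intrinsic * (s - 1) * d * d     # depthwise "ghost" maps
    return primary + cheap


# Hypothetical layer: 256 input channels, 512 output channels, 3 x 3 kernel.
standard = conv_params(256, 512, 3)
ghost = ghost_params(256, 512, 3)
print(standard, ghost)
```

With these assumed settings the ghost module needs roughly half the weights of the standard convolution while emitting the same number of output channels, which is the general effect the abstract attributes to the ghost network.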