Department of Physical Education, Gansu Agricultural University, Lanzhou 730070, China.
College of Information Science and Technology, Gansu Agricultural University, Lanzhou 730070, China.
Sensors (Basel). 2024 May 10;24(10):3036. doi: 10.3390/s24103036.
Traditional human pose recognition methods face numerous challenges in practical applications, including dense targets, severe edge occlusion, limited application scenarios, complex backgrounds, and poor recognition accuracy when targets are occluded. To address these challenges, this paper proposes an improved YOLO-Pose algorithm for human pose estimation. The improvements fall into four parts. First, lightweight GhostNet modules are introduced into the Backbone of the YOLO-Pose model to reduce its parameter count and computational cost, making it suitable for deployment on unmanned aerial vehicles (UAVs). Second, the ACmix attention mechanism is integrated into the Neck to improve detection speed during object judgment and localization. Third, keypoint prediction in the Head is refined with a coordinate attention mechanism, significantly improving keypoint localization accuracy. Finally, the loss function and confidence function are improved to enhance the model's robustness. Experimental results show that the improved model achieves an mAP50 of 95.58% and an mAP50-95 of 69.54%, surpassing the original model, while using 14.6 M fewer parameters. It processes an image in 19.9 ms, improvements of 30% and 39.5% over the original model. Comparisons with other algorithms, including Faster R-CNN, SSD, YOLOv4, and YOLOv7, show varying degrees of performance improvement.
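The parameter savings from swapping standard convolutions for GhostNet modules can be illustrated with a back-of-the-envelope count. The sketch below follows the general Ghost-module design (a primary convolution producing a fraction of the output channels, plus cheap depthwise operations generating the remaining "ghost" feature maps); the specific kernel sizes and ratio `s` are illustrative assumptions, not values taken from this paper.

```python
def conv_params(c_in, c_out, k):
    # Weight count of a standard k x k convolution (bias omitted).
    return c_in * c_out * k * k

def ghost_module_params(c_in, c_out, k=3, d=3, s=2):
    """Approximate weight count of a Ghost module.

    A primary k x k convolution produces c_out // s "intrinsic"
    channels; cheap d x d depthwise operations then generate the
    remaining (s - 1) * (c_out // s) "ghost" channels from them.
    Values of k, d, and s here are illustrative defaults.
    """
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k    # ordinary convolution
    cheap = intrinsic * (s - 1) * d * d   # depthwise "cheap" ops
    return primary + cheap

# For a typical 64 -> 128 channel, 3x3 layer:
standard = conv_params(64, 128, 3)        # 73728 weights
ghost = ghost_module_params(64, 128)      # 37440 weights, roughly half
```

With the common ratio `s = 2`, the module needs roughly half the weights of the convolution it replaces, which is how GhostNet shrinks the Backbone enough for UAV deployment.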
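The coordinate attention used to refine keypoints in the Head factorizes spatial attention into separate height and width directions, so each keypoint's position can be emphasized along both axes. The numpy sketch below is a minimal, single-feature-map illustration of that idea (direction-aware pooling, a shared transform, and per-direction sigmoid gates); the weight shapes `w1`, `w2_h`, `w2_w` are assumptions for the sketch, not the paper's actual layer configuration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coordinate_attention(x, w1, w2_h, w2_w):
    """Simplified coordinate attention on a (C, H, W) feature map.

    x    : (C, H, W) input features
    w1   : (C_mid, C) shared transform (stands in for a 1x1 conv)
    w2_h : (C, C_mid) weights producing the height attention
    w2_w : (C, C_mid) weights producing the width attention
    """
    C, H, W = x.shape
    # Direction-aware pooling: average along width and along height.
    pool_h = x.mean(axis=2)                       # (C, H)
    pool_w = x.mean(axis=1)                       # (C, W)
    # Shared transform on the concatenated descriptors, with ReLU.
    y = np.concatenate([pool_h, pool_w], axis=1)  # (C, H + W)
    y = np.maximum(w1 @ y, 0.0)                   # (C_mid, H + W)
    y_h, y_w = y[:, :H], y[:, H:]
    # Per-direction attention gates in (0, 1).
    a_h = sigmoid(w2_h @ y_h)                     # (C, H)
    a_w = sigmoid(w2_w @ y_w)                     # (C, W)
    # Re-weight the input along both spatial directions.
    return x * a_h[:, :, None] * a_w[:, None, :]
```

Because each gate lies in (0, 1), the output preserves the input's shape while attenuating positions away from the attended row and column, which is what sharpens keypoint localization.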