Enhanced human pose estimation using YOLOv8 with Integrated SimDLKA attention mechanism and DCIOU loss function: Analysis of human body behavior and posture.

Authors

Xu Xunqian, Wu Tao, Du Zhongbao, Rong Hui, Wang Siwen, Li Shue, Chen Dakai

Affiliations

School of Transportation and Civil Engineering, Nantong University, Nantong, China.

Nantong Highway Development Center, Nantong, China.

Publication information

PLoS One. 2025 May 7;20(5):e0318578. doi: 10.1371/journal.pone.0318578. eCollection 2025.

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12057905/

Abstract

Pose estimation is a central task in human motion analysis. Traditional detection algorithms are time-consuming and labor-intensive, and they fall short in accuracy and objectivity. To address these issues, we propose an improved pose estimation algorithm based on the YOLOv8 framework. We incorporate a novel attention mechanism, SimDLKA, into the original YOLOv8 model to strengthen its ability to focus selectively on input data, thereby improving its decoupling and flexibility. In the feature fusion module of YOLOv8, we replace the original Bottleneck module with the SimDLKA module and integrate it with the C2F module to form the C2F-SimDLKA structure, which fuses global semantics more effectively, especially for medium to large targets. Furthermore, we introduce a new loss function, DCIOU, based on the YOLOv8 loss function, to improve the forward propagation of model training. Results indicate that the new loss function lowers the loss value by 3-5 compared with other loss functions. Additionally, we independently constructed a large-scale pose estimation dataset, HP, using various data augmentation strategies, and also used the open-source COCO and MPII datasets for model training. Experimental results demonstrate that, compared with the original YOLOv8, the improved algorithm increases the mAP on the pose estimation dataset by 2.7% and the average frame rate by approximately 3 frames per second. This method provides a valuable reference for pose estimation.
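The abstract does not give the DCIOU formula, so it cannot be reproduced here. As a point of reference only, the sketch below computes the standard Complete IoU (CIoU) loss, which the stock YOLOv8 box-regression loss is built on and which a modified IoU loss would presumably be compared against. The function names `ciou` and `ciou_loss` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration; this is not the authors' DCIOU.

```python
import math


def ciou(box_a, box_b, eps=1e-7):
    """Standard Complete IoU for two boxes given as (x1, y1, x2, y2).

    Illustrative baseline only; NOT the DCIOU loss from the paper.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # Plain IoU: intersection area over union area.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = wa * ha + wb * hb - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both inputs.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centers.
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return iou - (rho2 / c2 + alpha * v)


def ciou_loss(pred_box, target_box):
    """Loss is 1 - CIoU: near 0 for a perfect match, above 1 for distant boxes."""
    return 1.0 - ciou(pred_box, target_box)
```

Unlike plain IoU, the center-distance term keeps the loss informative even when the predicted and target boxes do not overlap, which is the property that IoU-loss variants generally build on.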

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05c4/12057905/773eb58bed97/pone.0318578.g001.jpg
