KSL-POSE: A Real-Time 2D Human Pose Estimation Method Based on Modified YOLOv8-Pose Framework.

Affiliation

School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China.

Publication information

Sensors (Basel). 2024 Sep 26;24(19):6249. doi: 10.3390/s24196249.

DOI:10.3390/s24196249
PMID:39409288
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11479243/
Abstract

Two-dimensional human pose estimation aims to equip computers with the ability to accurately recognize human keypoints and comprehend their spatial context within media content. However, the accuracy of real-time human pose estimation diminishes when processing images with occluded body parts or overlapping individuals. To address these issues, we propose a method based on the YOLO framework. We integrate the convolutional concepts of Kolmogorov-Arnold Networks (KANs) by introducing non-linear activation functions that enhance the feature extraction capabilities of the convolutional kernels. Moreover, to improve the detection of small target keypoints, we integrate the cross-stage partial (CSP) approach and use the small object enhance pyramid (SOEP) module for feature integration. We also incorporate a layered shared convolution with batch normalization detection head (LSCB), consisting of multiple shared convolutional layers and batch normalization layers, to enable cross-stage feature fusion and address the low utilization of model parameters. Given the structure and purpose of the proposed model, we name it KSL-POSE. Compared to the baseline model YOLOv8l-POSE, KSL-POSE increases the average detection accuracy by 1.5% on the public MS COCO 2017 data set. The model also demonstrates competitive performance on the CrowdPOSE data set, supporting its generalization ability.
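
The KAN-style convolution mentioned in the abstract can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: it only shows the general idea of replacing each scalar kernel weight with a learnable univariate function. The names `KanTap` and `kan_conv1d` are hypothetical, and Gaussian bumps are swapped in for the B-spline basis that KANs normally use, purely to keep the example dependency-free.

```python
import math


def silu(x):
    """SiLU base activation, x * sigmoid(x), written to avoid overflow."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def rbf_basis(x, centers, width):
    """Gaussian bumps standing in for a B-spline basis (an assumption)."""
    return [math.exp(-((x - c) / width) ** 2) for c in centers]


class KanTap:
    """One kernel position: a learnable function phi(x) instead of a weight.

    phi(x) = w_base * silu(x) + sum_j coeffs[j] * basis_j(x)
    """

    def __init__(self, w_base, coeffs, centers, width=1.0):
        if len(coeffs) != len(centers):
            raise ValueError("coeffs and centers must have the same length")
        self.w_base = w_base
        self.coeffs = list(coeffs)
        self.centers = list(centers)
        self.width = width

    def __call__(self, x):
        basis = rbf_basis(x, self.centers, self.width)
        learned = sum(c * b for c, b in zip(self.coeffs, basis))
        return self.w_base * silu(x) + learned


def kan_conv1d(signal, taps):
    """Slide the kernel over the signal without padding.

    An ordinary convolution computes sum_j w_j * x[i + j]; here each
    product is replaced by the non-linear function taps[j](x[i + j]).
    """
    k = len(taps)
    return [
        sum(taps[j](signal[i + j]) for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


if __name__ == "__main__":
    centers = [-1.0, 0.0, 1.0]
    kernel = [
        KanTap(1.0, [0.2, -0.1, 0.3], centers),
        KanTap(0.5, [0.0, 0.4, 0.0], centers),
    ]
    print(kan_conv1d([0.0, 1.0, 2.0, -1.0], kernel))
```

In a trained network the `w_base` and `coeffs` values would be learned parameters; here they are arbitrary constants chosen only to make the sketch runnable.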
