HPnet: Hybrid Parallel Network for Human Pose Estimation.

Affiliation

School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China.

Publication information

Sensors (Basel). 2023 Apr 30;23(9):4425. doi: 10.3390/s23094425.

DOI: 10.3390/s23094425
PMID: 37177628
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10181615/
Abstract

Hybrid models that combine convolution and transformer modules achieve impressive performance on human pose estimation. However, existing hybrid models for human pose estimation, which typically stack self-attention modules after convolution, are prone to mutual conflict. This conflict forces one type of module to dominate in these sequential hybrid models. Consequently, performance on higher-precision keypoint localization is not consistent with overall performance. To alleviate this mutual conflict, we developed a hybrid parallel network by parallelizing the self-attention modules and the convolution modules, which helps leverage their complementary capabilities effectively. The parallel network ensures that the self-attention branch tends to model long-range dependencies to enhance the semantic representation, while the local sensitivity of the convolution branch simultaneously contributes to high-precision localization. To further mitigate the conflict, we proposed a cross-branch attention module that gates the features generated by both branches along the channel dimension. The hybrid parallel network achieves 75.6% and 75.4% on the COCO validation and test-dev sets, respectively, and achieves consistent performance on both higher-precision localization and overall performance. The experiments show that our hybrid parallel network is on par with state-of-the-art human pose estimation models.
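
The abstract describes gating the features of the two branches along the channel dimension but does not give the module's exact form. The following is a minimal sketch of one way such a gate could work, not the authors' implementation: each branch is pooled to one scalar per channel, a sigmoid turns the two descriptors into a per-channel weight, and the branch outputs are blended with that weight. The function names `channel_means` and `cross_branch_gate`, the parameters `w_conv`, `w_attn`, and `bias` (stand-ins for learned weights), and the toy inputs are all assumptions made for illustration.

```python
import math


def channel_means(feat):
    """Global average pooling: one scalar per channel of a C x H x W nested list."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feat]


def cross_branch_gate(conv_feat, attn_feat, w_conv=1.0, w_attn=-1.0, bias=0.0):
    """Blend two branch outputs with a per-channel sigmoid gate.

    w_conv, w_attn, and bias are hypothetical stand-ins for learned parameters.
    Returns (fused, gates); fused has the same C x H x W shape as the inputs.
    """
    pooled_conv = channel_means(conv_feat)
    pooled_attn = channel_means(attn_feat)
    # One gate value in (0, 1) per channel, computed from both branches.
    gates = [
        1.0 / (1.0 + math.exp(-(w_conv * pc + w_attn * pa + bias)))
        for pc, pa in zip(pooled_conv, pooled_attn)
    ]
    # Convex combination per channel: g * conv + (1 - g) * attention.
    fused = [
        [[g * c + (1.0 - g) * a for c, a in zip(c_row, a_row)]
         for c_row, a_row in zip(c_ch, a_ch)]
        for g, c_ch, a_ch in zip(gates, conv_feat, attn_feat)
    ]
    return fused, gates


# Toy inputs: 2 channels of 2 x 2 features from each branch (made-up values).
conv_feat = [[[2.0, 2.0], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]]]
attn_feat = [[[1.0, 3.0], [3.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
fused, gates = cross_branch_gate(conv_feat, attn_feat)
print(gates)
```

Because the gate is a convex combination, every fused value lies between the corresponding values of the two branches, so neither branch can be amplified beyond its own output. A trained module would learn the gate parameters instead of using the fixed defaults shown here.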




