

Single-view multi-human pose estimation by attentive cross-dimension matching

Authors

Tian Wei, Gao Zhong, Tan Dayi

Affiliation

Institute of Intelligent Vehicles, School of Automotive Studies, Tongji University, Shanghai, China.

Publication

Front Neurosci. 2023 Jul 19;17:1201088. doi: 10.3389/fnins.2023.1201088. eCollection 2023.

DOI:10.3389/fnins.2023.1201088
PMID:37539382
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10394227/
Abstract

Vision-based human pose estimation has been widely applied in tasks such as augmented reality, action recognition and human-machine interaction. Current approaches favor the keypoint detection-based paradigm, as it eases learning by circumventing the highly non-linear problem of directly regressing keypoint coordinates. However, in this paradigm each keypoint is predicted from only a small surrounding region of a Gaussian-like heatmap, wasting the information in the remaining regions and even limiting model optimization. In this paper, we design a new k-block multi-person pose estimation architecture with a voting mechanism over the entire heatmap to simultaneously infer keypoints and their uncertainties. To further improve keypoint estimation, this architecture leverages the SMPL 3D human body model and iteratively mines human body structure information to correct the pose estimated from a single image. In experiments on the 3DPW dataset, it improves the state-of-the-art performance by about 8 mm on the MPJPE metric and 5 mm on the PA-MPJPE metric. Furthermore, its real-time capability makes multi-person pose estimation practical in complex scenarios.
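The voting idea described in the abstract — letting every heatmap pixel contribute to the keypoint estimate instead of only a small peak region, which yields both a coordinate and an uncertainty — can be illustrated with a minimal sketch. This is not the paper's implementation; `vote_keypoint` and the softmax/soft-argmax formulation are illustrative assumptions.

```python
import numpy as np

def vote_keypoint(heatmap):
    """Treat every pixel of a heatmap as a vote for the keypoint location:
    softmax-normalized scores form a distribution whose mean gives the
    coordinate and whose covariance gives an uncertainty estimate.
    Illustrative sketch only, not the architecture from the paper."""
    h, w = heatmap.shape
    # Normalize scores into a probability distribution over all pixels.
    p = np.exp(heatmap - heatmap.max())
    p /= p.sum()
    ys, xs = np.mgrid[0:h, 0:w]
    # Expected (soft-argmax) coordinate: every pixel contributes a vote.
    mx, my = (p * xs).sum(), (p * ys).sum()
    # The spread of the votes serves as a per-keypoint uncertainty.
    cxx = (p * (xs - mx) ** 2).sum()
    cyy = (p * (ys - my) ** 2).sum()
    cxy = (p * (xs - mx) * (ys - my)).sum()
    cov = np.array([[cxx, cxy], [cxy, cyy]])
    return (mx, my), cov

# A sharp peak at (x=12, y=7) yields a mean near that pixel.
hm = np.zeros((32, 32))
hm[7, 12] = 10.0
(mu_x, mu_y), cov = vote_keypoint(hm)
```

A diffuse heatmap produces the same mean but a larger covariance, which is what makes the uncertainty output useful for downstream correction.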

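The reported gains are on MPJPE and PA-MPJPE, which are standard 3D pose metrics. As a reference, they can be computed as below — a minimal sketch of the standard definitions (mean per-joint Euclidean error, with PA-MPJPE first removing the best similarity transform via Procrustes alignment), not code from the paper.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error: average Euclidean distance between
    predicted and ground-truth joints (the paper reports millimetres)."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

def pa_mpjpe(pred, gt):
    """Procrustes-aligned MPJPE: the optimal scale, rotation and
    translation of the prediction onto the ground truth is removed
    before measuring the error."""
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    p, g = pred - mu_p, gt - mu_g
    # Optimal rotation via SVD of the cross-covariance (Kabsch/Procrustes).
    u, s, vt = np.linalg.svd(p.T @ g)
    r = vt.T @ u.T
    # Fix a possible reflection so r is a proper rotation.
    if np.linalg.det(r) < 0:
        vt[-1] *= -1
        s[-1] *= -1
        r = vt.T @ u.T
    scale = s.sum() / (p ** 2).sum()
    aligned = scale * p @ r.T + mu_g
    return mpjpe(aligned, gt)
```

Because PA-MPJPE discounts global pose, it isolates articulation error; that is why papers typically report both numbers.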

Figures (PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c9/10394227/b401dbb4f249/fnins-17-1201088-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c9/10394227/4e1a3f5ce35c/fnins-17-1201088-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c9/10394227/8620a6cbfacb/fnins-17-1201088-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c9/10394227/9c01bdb6a1e8/fnins-17-1201088-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c9/10394227/e77416f63c6e/fnins-17-1201088-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c9/10394227/f41fbf3bc5c2/fnins-17-1201088-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c9/10394227/64421b4eeca6/fnins-17-1201088-g0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c9/10394227/c655136d18b4/fnins-17-1201088-g0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c9/10394227/db2788d0128f/fnins-17-1201088-g0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c9/10394227/a9ac0ab8c84c/fnins-17-1201088-g0010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c9/10394227/ff8876404880/fnins-17-1201088-g0011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d2c9/10394227/2f8f38ebfd37/fnins-17-1201088-g0012.jpg

Similar articles

1. Robust 6-DoF Pose Estimation under Hybrid Constraints.
   Sensors (Basel). 2022 Nov 12;22(22):8758. doi: 10.3390/s22228758.
2. Estimating Ground Reaction Forces from Two-Dimensional Pose Data: A Biomechanics-Based Comparison of AlphaPose, BlazePose, and OpenPose.
   Sensors (Basel). 2022 Dec 21;23(1):78. doi: 10.3390/s23010078.
3. 6-D Object Pose Estimation Based on Point Pair Matching for Robotic Grasp Detection.
   IEEE Trans Neural Netw Learn Syst. 2025 Jul;36(7):11902-11916. doi: 10.1109/TNNLS.2024.3442433.
4. PVNet: Pixel-Wise Voting Network for 6DoF Object Pose Estimation.
   IEEE Trans Pattern Anal Mach Intell. 2022 Jun;44(6):3212-3223. doi: 10.1109/TPAMI.2020.3047388.
5. Head Pose Estimation through Keypoints Matching between Reconstructed 3D Face Model and 2D Image.
   Sensors (Basel). 2021 Mar 6;21(5):1841. doi: 10.3390/s21051841.
6. DetPoseNet: Improving Multi-Person Pose Estimation via Coarse-Pose Filtering.
   IEEE Trans Image Process. 2022;31:2782-2795. doi: 10.1109/TIP.2022.3161081.
7. Multi-Person Pose Estimation Using an Orientation and Occlusion Aware Deep Learning Network.
   Sensors (Basel). 2020 Mar 12;20(6):1593. doi: 10.3390/s20061593.
8. Pose Mask: A Model-Based Augmentation Method for 2D Pose Estimation in Classroom Scenes Using Surveillance Images.
   Sensors (Basel). 2022 Oct 30;22(21):8331. doi: 10.3390/s22218331.
9. Regression-Based Camera Pose Estimation through Multi-Level Local Features and Global Features.
   Sensors (Basel). 2023 Apr 18;23(8):4063. doi: 10.3390/s23084063.
