A Survey of the State of the Art in Monocular 3D Human Pose Estimation: Methods, Benchmarks, and Challenges

Authors

Guo Yan, Gao Tianhan, Dong Aoshuang, Jiang Xinbei, Zhu Zichen, Wang Fuxin

Affiliation

Software College, Northeastern University, Shenyang 110004, China.

Publication information

Sensors (Basel). 2025 Apr 10;25(8):2409. doi: 10.3390/s25082409.

DOI: 10.3390/s25082409
PMID: 40285099
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12031093/

Abstract

Three-dimensional human pose estimation (3D HPE) from monocular RGB cameras is a fundamental yet challenging task in computer vision, forming the basis of a wide range of applications such as action recognition, metaverse, self-driving, and healthcare. Recent advances in deep learning have significantly propelled the field, particularly with the incorporation of state-space models (SSMs) and diffusion models. However, systematic reviews that comprehensively cover these emerging techniques remain limited. This survey contributes to the literature by providing the first comprehensive analysis of recent innovative approaches, featuring diffusion models and SSMs within 3D HPE. It categorizes and analyzes various techniques, highlighting their strengths, limitations, and notable innovations. Additionally, it provides a detailed overview of commonly employed datasets and evaluation metrics. Furthermore, this survey offers an in-depth discussion on key challenges, particularly depth ambiguity and occlusion issues arising from single-view setups, thoroughly reviewing effective solutions proposed in recent studies. Finally, current applications and promising avenues for future research are highlighted to guide and inspire ongoing innovation in the area, with emerging trends such as integrating large language models (LLMs) to provide semantic priors and prompt-based supervision for improved 3D pose estimation.
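
The depth ambiguity mentioned in the abstract can be illustrated with a minimal sketch. It assumes an idealized pinhole camera and a hypothetical three-joint pose, neither of which comes from the survey. The abstract also does not name specific metrics, so the error function here, mean per-joint position error (MPJPE), is included as an assumption because it is widely used in 3D HPE benchmarks. The sketch slides each joint along its own camera ray, which changes the 3D pose but leaves the 2D projection unchanged.

```python
import math


def project(joints_3d, focal=1.0):
    """Pinhole projection of camera-space (x, y, z) joints to 2D (u, v)."""
    return [(focal * x / z, focal * y / z) for x, y, z in joints_3d]


def mpjpe(pred, target):
    """Mean per-joint position error: average Euclidean distance over joints."""
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


# Hypothetical three-joint pose in camera coordinates (metres); illustrative only.
pose_a = [(0.0, 0.0, 4.0), (0.2, -0.5, 4.0), (0.4, -0.9, 4.2)]

# Move each joint along its camera ray by a different factor.
# The 3D positions change, but x/z and y/z (the image coordinates) do not.
scales = [1.0, 1.25, 0.75]
pose_b = [(s * x, s * y, s * z) for s, (x, y, z) in zip(scales, pose_a)]

proj_a = project(pose_a)
proj_b = project(pose_b)

# True when both poses land on the same pixels (up to floating-point rounding).
same_2d = all(
    math.isclose(ua, ub, abs_tol=1e-12) and math.isclose(va, vb, abs_tol=1e-12)
    for (ua, va), (ub, vb) in zip(proj_a, proj_b)
)

# 3D error between the two poses, in metres.
error_3d = mpjpe(pose_b, pose_a)

print("identical 2D projection:", same_2d)
print("MPJPE between the two 3D poses (m):", error_3d)
```

Two poses that differ substantially in 3D are indistinguishable from a single view in this setup, which is why monocular methods typically rely on learned priors or temporal context rather than geometry alone.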
