Top-Down System for Multi-Person 3D Absolute Pose Estimation from Monocular Videos.

Affiliations

Université Clermont-Auvergne, CNRS, Mines de Saint-Étienne, Clermont-Auvergne-INP, LIMOS, 63000 Clermont-Ferrand, France.

Alqualsadi Research Team, Rabat IT Center, ENSIAS, Mohammed V University in Rabat, Rabat 10112, Morocco.

Publication information

Sensors (Basel). 2022 May 28;22(11):4109. doi: 10.3390/s22114109.

Abstract

Two-dimensional (2D) multi-person pose estimation and three-dimensional (3D) root-relative pose estimation from a monocular RGB camera have made significant progress recently. Yet real-world applications require depth estimation and the ability to determine the distances between people in a scene, so it is necessary to recover the 3D absolute poses of several people. This remains a challenge when the scene is captured from a single viewpoint. Furthermore, previously proposed systems typically require a significant amount of resources and memory. To overcome these restrictions, we propose a real-time framework for multi-person 3D absolute pose estimation from a monocular camera, which integrates a human detector, a 2D pose estimator, a 3D root-relative pose reconstructor, and a root depth estimator in a top-down manner. The proposed system, called Root-GAST-Net, is based on modified versions of the GAST-Net and RootNet networks. The efficiency of Root-GAST-Net is demonstrated through quantitative and qualitative evaluations on two benchmark datasets, Human3.6M and MuPoTS-3D. On all evaluated metrics, our results on the MuPoTS-3D dataset outperform the current state of the art by a significant margin, and the system runs in real time at 15 fps on an Nvidia GeForce GTX 1080.
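To illustrate how a root depth estimate and a root-relative pose can be combined into an absolute pose, here is a minimal sketch. It is not the Root-GAST-Net implementation: it assumes a standard pinhole camera model with known intrinsics, and every function name, parameter, and numeric value in it is hypothetical and chosen only for illustration.

```python
import math
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def backproject_root(u: float, v: float, depth: float,
                     fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Back-project the root joint's pixel (u, v) at a given depth.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels. The result is in camera coordinates,
    in the same unit as `depth`.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def absolute_pose(root_relative: List[Point3D], root_abs: Point3D) -> List[Point3D]:
    """Shift every root-relative joint by the absolute root position."""
    rx, ry, rz = root_abs
    return [(jx + rx, jy + ry, jz + rz) for jx, jy, jz in root_relative]


def root_distance(a: Point3D, b: Point3D) -> float:
    """Euclidean distance between two people's absolute root positions."""
    return math.dist(a, b)


# Hypothetical intrinsics and per-person estimates (depths in millimetres).
FX, FY, CX, CY = 1000.0, 1000.0, 960.0, 540.0
root_a = backproject_root(960.0, 540.0, 3000.0, FX, FY, CX, CY)
root_b = backproject_root(1160.0, 540.0, 3000.0, FX, FY, CX, CY)

# A two-joint toy pose: the root itself and one joint 500 mm above it
# (image y points down, so "above" is negative y).
pose_a = absolute_pose([(0.0, 0.0, 0.0), (0.0, -500.0, 0.0)], root_a)
gap = root_distance(root_a, root_b)
```

Once each person's root is placed in camera coordinates, the distance between people mentioned in the abstract reduces to a distance between root positions, which is why a separate root depth estimate matters in this kind of pipeline.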

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f8cc/9185275/6326233257e9/sensors-22-04109-g001.jpg
