Deng Xiaoming, Zuo Dexin, Zhang Yinda, Cui Zhaopeng, Cheng Jian, Tan Ping, Chang Liang, Pollefeys Marc, Fanello Sean, Wang Hongan
IEEE Trans Pattern Anal Mach Intell. 2023 Jan;45(1):932-945. doi: 10.1109/TPAMI.2022.3159725. Epub 2022 Dec 5.
3D hand pose estimation is a challenging problem in computer vision due to the high degrees of freedom of articulated hand motion and large viewpoint variation. As a consequence, similar poses observed from different views can appear dramatically different, so view-independent features are required to achieve state-of-the-art performance. In this paper, we investigate the impact of view-independent features on 3D hand pose estimation from a single depth image and propose a novel recurrent neural network for 3D hand pose estimation, in which a cascaded 3D pose-guided alignment strategy is designed for view-independent feature extraction and a recurrent hand pose module models the dependencies among the sequence of aligned features. In particular, our cascaded pose-guided 3D alignments are performed in 3D space in a coarse-to-fine fashion: first, hand joints are predicted and globally transformed into a canonical reference frame; second, the palm of the hand is detected and aligned; third, local transformations are applied to the fingers to refine the final predictions. The proposed recurrent hand pose module operates on the aligned 3D representation, extracting pose-aware features and iteratively refining the estimated hand pose. The recurrent module can be used both for single-view estimation and for sequence-based estimation with 3D hand pose tracking. Experiments show that, with a simple yet efficient alignment strategy and network architecture, our method improves on the state of the art by a large margin on popular benchmarks.
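To make the coarse-to-fine, pose-guided alignment concrete, the sketch below is a minimal NumPy illustration of the general idea, not the authors' implementation: the predicted palm joints drive a rigid (Kabsch) alignment of the input into a canonical, view-independent frame, and the pose is then re-estimated and refined iteratively in that frame. The canonical joint template, the palm joint indices, and the estimate_pose callback are assumptions introduced purely for illustration.

    # Minimal sketch of pose-guided 3D alignment with iterative refinement.
    # Not the paper's code; PALM_IDX, the canonical template, and
    # estimate_pose are hypothetical placeholders.
    import numpy as np

    PALM_IDX = [0, 1, 5, 9, 13, 17]   # assumed wrist + finger-base joint indices

    def rigid_align(src, dst):
        """Kabsch algorithm: rotation R and translation t minimizing ||R @ p + t - q||."""
        src_c, dst_c = src.mean(0), dst.mean(0)
        H = (src - src_c).T @ (dst - dst_c)
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        t = dst_c - R @ src_c
        return R, t

    def cascaded_alignment(points, joints_init, canonical_joints, estimate_pose, n_refine=2):
        """Align the hand to a canonical frame and iteratively refine the pose.

        points           : (N, 3) depth point cloud of the hand
        joints_init      : (21, 3) initial joint prediction in camera space
        canonical_joints : (21, 3) joints of a canonical hand template
        estimate_pose    : callable mapping aligned points -> refined (21, 3) joints
        """
        joints = joints_init
        for _ in range(n_refine):
            # Use the current palm prediction to move the hand into the
            # canonical, view-independent reference frame.
            R, t = rigid_align(joints[PALM_IDX], canonical_joints[PALM_IDX])
            aligned_pts = points @ R.T + t
            # Re-estimate joints from the view-normalized input (stands in for
            # the recurrent pose module), then map them back to camera space.
            joints_canon = estimate_pose(aligned_pts)
            joints = (joints_canon - t) @ R             # inverse rigid transform
        return joints

In this toy version a single callback plays the role of the pose estimator at every stage; the paper's cascade additionally applies per-finger local transformations after the palm alignment, and its recurrent module shares this refine-and-realign loop across iterations and across frames for tracking.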