Salman Sartaj Ahmed, Zakir Ali, Takahashi Hiroki
Department of Informatics, Graduate School of Informatics and Engineering, The University of Electro-Communications, Tokyo 182-8585, Japan.
Artificial Intelligence Exploration Research Center/Meta-Networking Research Center, The University of Electro-Communications, Tokyo 182-8585, Japan.
Sensors (Basel). 2023 Nov 10;23(22):9088. doi: 10.3390/s23229088.
In the field of computer vision, hand pose estimation (HPE) has attracted significant attention from researchers, especially in human-computer interaction (HCI) and virtual reality (VR). Despite advances in 2D HPE, challenges persist due to hand dynamics and occlusions. Accurate extraction of hand features, such as edges, textures, and unique patterns, is crucial for improving HPE. To address these challenges, we propose SDFPoseGraphNet, a novel framework that combines the strengths of the VGG-19 architecture with spatial attention (SA), enabling more refined extraction of deep feature maps from hand images. By incorporating a Pose Graph Model (PGM), the network adaptively processes these feature maps to provide tailored pose estimations. First Inference Module (FIM) potentials, together with adaptively learned parameters, feed into the PGM's final pose estimate. With its end-to-end trainable design, SDFPoseGraphNet is optimized across all components, ensuring enhanced precision in hand pose estimation. Our proposed model outperforms existing state-of-the-art methods, improving average precision by 7.49% over the Convolutional Pose Machine (CPM) and by 3.84% over the Adaptive Graphical Model Network (AGMN).
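To make the described pipeline concrete, the sketch below shows one plausible way to wire a VGG-19 feature extractor to a spatial-attention gate and a per-keypoint heatmap head. This is a minimal illustration, not the authors' implementation: the attention design, the 21-keypoint count, the layer sizes, and the FIM stand-in (a 1x1 convolution producing unary potentials for a downstream PGM) are all assumptions for demonstration.

```python
# Minimal sketch (assumed architecture, not the paper's released code):
# VGG-19 backbone -> spatial attention over feature maps -> per-keypoint
# heatmaps standing in for the First Inference Module (FIM) potentials.
# The Pose Graph Model (PGM) message-passing stage is omitted here.
import torch
import torch.nn as nn
from torchvision.models import vgg19

NUM_KEYPOINTS = 21  # assumption: 21 hand keypoints, as in common hand datasets


class SpatialAttention(nn.Module):
    """Computes a single-channel attention map and reweights the features."""

    def __init__(self, in_channels: int):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = torch.sigmoid(self.conv(x))  # (B, 1, H, W) attention weights
        return x * attn                     # broadcast reweighting over channels


class SDFPoseGraphNetSketch(nn.Module):
    def __init__(self, num_keypoints: int = NUM_KEYPOINTS):
        super().__init__()
        # VGG-19 convolutional stack as the deep feature extractor (512 channels out).
        self.backbone = vgg19(weights=None).features
        self.attention = SpatialAttention(512)
        # FIM stand-in: one heatmap per keypoint, used as unary potentials by the PGM.
        self.fim = nn.Conv2d(512, num_keypoints, kernel_size=1)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(images)   # (B, 512, H/32, W/32)
        feats = self.attention(feats)   # spatially reweighted feature maps
        return self.fim(feats)          # per-keypoint heatmaps


if __name__ == "__main__":
    model = SDFPoseGraphNetSketch()
    out = model(torch.randn(1, 3, 256, 256))
    print(out.shape)  # torch.Size([1, 21, 8, 8])
```

Because every stage here is a differentiable module, the whole stack can be trained end to end with a single heatmap loss, which mirrors the end-to-end trainable design claimed for SDFPoseGraphNet.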