Zhu Chenfei, Hu Boce, Chen Jiawei, Ai Xupeng, Agrawal Sunil K
Department of Mechanical Engineering, Columbia University, New York, NY 10027, USA.
Department of Rehabilitation Medicine, Columbia University, New York, NY 10027, USA.
Bioengineering (Basel). 2023 Jan 17;10(2):126. doi: 10.3390/bioengineering10020126.
Hand pose estimation (HPE) plays an important role in the functional assessment of the hand and in potential rehabilitation. Predicting the pose of the hand conveniently and accurately during functional tasks remains a challenge, which limits the application of HPE. In this paper, we propose a novel architecture of a shifted attention regression network (SARN) to perform HPE. Given a depth image, SARN first predicts the spatial relationships between points in the depth image and a group of hand keypoints that determine the pose of the hand. Then, SARN uses these spatial relationships to infer the 3D position of each hand keypoint. To verify the effectiveness of the proposed method, we conducted experiments on three open-source datasets of 3D hand poses: NYU, ICVL, and MSRA. The proposed method achieved state-of-the-art performance, with mean errors of 7.32 mm, 5.91 mm, and 7.17 mm at the hand keypoints, i.e., the mean Euclidean distance between the predicted and ground-truth hand keypoint positions. Additionally, to test the feasibility of SARN in hand movement recognition, a hand movement dataset of 26K depth images from 17 healthy subjects was constructed based on the finger tapping test, an important component of neurological exams administered to Parkinson's patients. Each image was annotated with the tips of the index finger and the thumb. For this dataset, the proposed method achieved a mean error of 2.99 mm at the hand keypoints and comparable performance on three task-specific metrics: the distance, velocity, and acceleration of the relative movement of the two fingertips. Results on the open-source datasets demonstrated the effectiveness of the proposed method, and results on our finger tapping dataset validated its potential for applications in functional task characterization.
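The two evaluation quantities named in the abstract, the mean Euclidean keypoint error and the distance/velocity/acceleration of the relative fingertip movement, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code; the function names and the assumed frame rate (`fps`, not stated in the abstract) are hypothetical, and the derivatives are taken by simple finite differences.

```python
import numpy as np

def mean_keypoint_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth keypoints.
    pred, gt: (N, K, 3) arrays of N frames x K keypoints (same units, e.g. mm)."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

def fingertip_kinematics(index_tip, thumb_tip, fps=30.0):
    """Distance, velocity, and acceleration of the relative movement of the
    index fingertip and thumb tip. Each trajectory is an (N, 3) array; fps is
    an assumed capture frame rate (not given in the abstract)."""
    dist = np.linalg.norm(index_tip - thumb_tip, axis=-1)  # (N,) per-frame distance
    vel = np.gradient(dist, 1.0 / fps)                     # first derivative (finite differences)
    acc = np.gradient(vel, 1.0 / fps)                      # second derivative
    return dist, vel, acc
```

For example, a reported mean error of 2.99 mm corresponds to `mean_keypoint_error` evaluated over all annotated fingertip positions in the finger tapping dataset.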