IEEE Trans Neural Syst Rehabil Eng. 2024;32:2864-2872. doi: 10.1109/TNSRE.2024.3438436. Epub 2024 Aug 12.
Hand function assessments in a clinical setting are critical for upper limb rehabilitation after spinal cord injury (SCI) but may not accurately reflect performance in an individual's home environment. When paired with computer vision models, egocentric videos from wearable cameras provide an opportunity for remote hand function assessment during real activities of daily living (ADLs). This study demonstrates the use of computer vision models to predict clinical hand function assessment scores from egocentric video. SlowFast, MViT, and MaskFeat models were trained and validated on a custom SCI dataset containing a variety of ADLs carried out in a simulated home environment. The dataset was annotated with clinical hand function assessment scores using an adapted scale applicable to a wide range of object interactions. An accuracy of 0.551±0.139, mean absolute error (MAE) of 0.517±0.184, and F1 score of 0.547±0.151 were achieved on the 5-class classification task. An accuracy of 0.724±0.135, MAE of 0.290±0.140, and F1 score of 0.733±0.144 were achieved on a consolidated 3-class classification task. This approach demonstrates, for the first time, the prediction of hand function assessment scores from egocentric video after SCI.
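As a minimal sketch of how the three reported metrics could be computed for an ordinal 5-class task, the snippet below evaluates accuracy, MAE, and macro-averaged F1 from integer class labels. The labels and the `evaluate` helper are hypothetical illustrations, not the study's evaluation code, and the abstract does not specify whether F1 was macro-averaged.

```python
def evaluate(y_true, y_pred, n_classes=5):
    """Hypothetical helper: accuracy, MAE, and macro F1 for integer
    class labels 0..n_classes-1 (an assumed label encoding)."""
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # MAE treats the classes as ordinal scores, so near-misses cost less.
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    f1_per_class = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if (tp + fp) else 0.0
        rec = tp / (tp + fn) if (tp + fn) else 0.0
        f1_per_class.append(2 * prec * rec / (prec + rec) if (prec + rec) else 0.0)
    macro_f1 = sum(f1_per_class) / n_classes
    return accuracy, mae, macro_f1
```

Consolidating 5 classes into 3 (the mapping is not given in the abstract) would simply remap the labels before calling the same helper.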