School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China.
The Engineering Research Center of Intelligent Theranostics Technology and Instruments, Ministry of Education, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, People's Republic of China.
J Neural Eng. 2024 Apr 9;21(2). doi: 10.1088/1741-2552/ad39a5.
Recent studies have shown that integrating inertial measurement unit (IMU) signals with surface electromyographic (sEMG) signals can greatly improve hand gesture recognition (HGR) performance in applications such as prosthetic control and rehabilitation training. However, current deep learning models for multimodal HGR struggle with invasive modal fusion, complex feature extraction from heterogeneous signals, and limited inter-subject model generalization. To address these challenges, this study aims to develop an end-to-end, inter-subject transferable model that uses non-invasively fused sEMG and acceleration (ACC) data.
The proposed non-invasive modal fusion-transformer (NIMFT) model uses 1D convolutional neural network-based patch embedding for local information extraction and employs a multi-head cross-attention (MCA) mechanism to non-invasively integrate sEMG and ACC signals, stabilizing the variability induced by sEMG. The proposed architecture undergoes detailed ablation studies after hyperparameter tuning. Transfer learning is performed by fine-tuning a pre-trained model on a new subject, and the fine-tuned model is compared with the subject-specific model. Additionally, the performance of NIMFT is compared with that of state-of-the-art fusion models.
The NIMFT model achieved recognition accuracies of 93.91%, 91.02%, and 95.56% on the three action sets in the Ninapro DB2 dataset. The proposed embedding method and MCA outperformed the traditional invasive modal fusion transformer by 2.01% (embedding) and 1.23% (fusion), respectively. Compared with subject-specific models, the fine-tuned model showed a highest average accuracy improvement of 2.26%, reaching a final accuracy of 96.13%. Moreover, the NIMFT model outperformed the latest modal fusion models of similar scale in accuracy, recall, precision, and F1-score.
NIMFT is a novel end-to-end HGR model that uses a non-invasive MCA mechanism to integrate long-range intermodal information effectively. Compared with recent modal fusion models, it performs better in inter-subject experiments and, through transfer learning, offers higher training efficiency and accuracy than subject-specific approaches.
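The two building blocks named in the abstract, patch embedding and multi-head cross-attention, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the window length, the channel counts, the patch length, the model width, the random weights, and the choice of sEMG tokens as queries with ACC tokens as keys and values are all assumptions made for illustration. The patch embedding is assumed to be non-overlapping (kernel size equal to stride), in which case the 1D convolution reduces to one shared linear map over each flattened patch.

```python
import numpy as np


def patch_embed(x, weight, bias, patch_len):
    """Non-overlapping patch embedding (assumed: kernel size == stride).

    x: (time, channels); weight: (patch_len * channels, d_model).
    Each patch of patch_len time steps is flattened and linearly projected.
    """
    n_patches = x.shape[0] // patch_len
    patches = x[: n_patches * patch_len].reshape(n_patches, patch_len * x.shape[1])
    return patches @ weight + bias


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_cross_attention(q_tokens, kv_tokens, wq, wk, wv, wo, n_heads):
    """Queries from one modality attend over keys/values from the other."""
    n_q, d_model = q_tokens.shape
    n_kv = kv_tokens.shape[0]
    d_head = d_model // n_heads
    # Project, then split the model dimension into heads: (heads, tokens, d_head).
    q = (q_tokens @ wq).reshape(n_q, n_heads, d_head).transpose(1, 0, 2)
    k = (kv_tokens @ wk).reshape(n_kv, n_heads, d_head).transpose(1, 0, 2)
    v = (kv_tokens @ wv).reshape(n_kv, n_heads, d_head).transpose(1, 0, 2)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, n_q, n_kv)
    attn = softmax(scores, axis=-1)
    heads = attn @ v                                      # (heads, n_q, d_head)
    merged = heads.transpose(1, 0, 2).reshape(n_q, d_model)
    return merged @ wo, attn


# Illustrative shapes only (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
d_model, n_heads, patch_len = 32, 4, 20
semg = rng.standard_normal((200, 12))  # hypothetical sEMG window: 200 samples, 12 channels
acc = rng.standard_normal((200, 36))   # hypothetical ACC window: 200 samples, 36 channels

semg_tokens = patch_embed(
    semg, rng.standard_normal((patch_len * 12, d_model)) * 0.1, np.zeros(d_model), patch_len
)
acc_tokens = patch_embed(
    acc, rng.standard_normal((patch_len * 36, d_model)) * 0.1, np.zeros(d_model), patch_len
)

wq, wk, wv, wo = (rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(4))
fused, attn = multi_head_cross_attention(semg_tokens, acc_tokens, wq, wk, wv, wo, n_heads)
print(fused.shape, attn.shape)  # (10, 32) (4, 10, 10)
```

In this sketch each modality keeps its own embedding, and the two streams interact only through the attention weights rather than being concatenated at the input. That is one plausible reading of "non-invasive" fusion; the abstract does not spell out the exact mechanism.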