A Multimodal Multilevel Converged Attention Network for Hand Gesture Recognition With Hybrid sEMG and A-Mode Ultrasound Sensing.

Publication Information

IEEE Trans Cybern. 2023 Dec;53(12):7723-7734. doi: 10.1109/TCYB.2022.3204343. Epub 2023 Nov 29.

Abstract

Gesture recognition based on surface electromyography (sEMG) has been widely used in the field of human-machine interaction (HMI). However, sEMG has limitations, such as a low signal-to-noise ratio and insensitivity to fine finger movements, so we consider adding A-mode ultrasound (AUS) to enhance recognition performance. To explore the influence of multisource sensing data on gesture recognition and to better integrate the features of the different modalities, we propose a multimodal multilevel converged attention network (MMCANet) model for multisource signals composed of sEMG and AUS. The proposed model extracts hidden features of the AUS signal with a convolutional neural network (CNN). Meanwhile, a CNN-LSTM (long short-term memory network) hybrid structure extracts spatial-temporal features from the sEMG signal. The two types of CNN features from AUS and sEMG are then spliced and passed to a transformer encoder, which fuses the information and interacts with the sEMG features to produce hybrid features. Finally, the classification results are output by fully connected layers. Attention mechanisms are used to adjust the weights of the feature channels. We compared MMCANet's feature extraction and classification performance with that of manually extracted sEMG-AUS features classified by four traditional machine-learning (ML) algorithms; recognition accuracy increased by at least 5.15%. In addition, we evaluated deep-learning (DL) methods with a CNN on single modalities. The experimental results showed that the proposed model improved accuracy by 14.31% and 3.80% over the CNN method with sEMG alone and AUS alone, respectively. Compared with several state-of-the-art fusion techniques, our method also achieved better results.
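
The abstract describes a data flow of a CNN branch for AUS, a CNN-LSTM branch for sEMG, splicing of the two CNN-level features into a transformer encoder, channel attention over the fused features, and a fully connected classifier. The PyTorch sketch below is a rough, hypothetical reconstruction of that pipeline from the abstract alone; all layer sizes, channel counts, window lengths, the specific attention gate, and the number of gesture classes are assumptions, not details taken from the paper.

```python
# Minimal sketch of an MMCANet-style pipeline, reconstructed only from the
# abstract. All sizes and the attention gate are illustrative assumptions.
import torch
import torch.nn as nn


class MMCANetSketch(nn.Module):
    def __init__(self, n_emg_channels=8, n_aus_channels=4, n_classes=10,
                 feat_dim=128):
        super().__init__()
        # CNN branch: hidden features from the A-mode ultrasound (AUS) signal.
        self.aus_cnn = nn.Sequential(
            nn.Conv1d(n_aus_channels, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(16),
            nn.Flatten(),
            nn.Linear(64 * 16, feat_dim),
        )
        # sEMG branch: CNN front end followed by an LSTM, giving both a
        # CNN-level feature (for fusion) and a temporal sEMG feature.
        self.emg_conv = nn.Sequential(
            nn.Conv1d(n_emg_channels, 64, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        self.emg_cnn_head = nn.Sequential(
            nn.AdaptiveAvgPool1d(16),
            nn.Flatten(),
            nn.Linear(64 * 16, feat_dim),
        )
        self.emg_lstm = nn.LSTM(input_size=64, hidden_size=feat_dim,
                                batch_first=True)
        # Transformer encoder fuses the spliced AUS/sEMG CNN features.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(encoder_layer, num_layers=2)
        # Channel attention (a squeeze-and-excitation style gate) re-weights
        # the fused feature channels; the paper's exact mechanism may differ.
        self.channel_attn = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim // 4),
            nn.ReLU(),
            nn.Linear(feat_dim // 4, 2 * feat_dim),
            nn.Sigmoid(),
        )
        # Fully connected classifier over the hybrid features.
        self.classifier = nn.Linear(3 * feat_dim, n_classes)

    def forward(self, emg, aus):
        # emg: (B, n_emg_channels, T_emg); aus: (B, n_aus_channels, T_aus)
        aus_feat = self.aus_cnn(aus)                         # (B, feat_dim)
        emg_conv = self.emg_conv(emg)                        # (B, 64, T_emg)
        emg_cnn_feat = self.emg_cnn_head(emg_conv)           # (B, feat_dim)
        _, (h_n, _) = self.emg_lstm(emg_conv.transpose(1, 2))
        emg_temporal_feat = h_n[-1]                          # (B, feat_dim)

        # Splice the two CNN-level features and fuse them with the
        # transformer encoder as a two-token sequence.
        spliced = torch.stack([aus_feat, emg_cnn_feat], dim=1)  # (B, 2, D)
        fused = self.fusion(spliced).flatten(1)                 # (B, 2*D)
        fused = fused * self.channel_attn(fused)                # channel weights

        # Let the fused features interact with the temporal sEMG features
        # to form the hybrid representation, then classify.
        hybrid = torch.cat([fused, emg_temporal_feat], dim=1)   # (B, 3*D)
        return self.classifier(hybrid)


# Illustrative usage with random windows (shapes are assumptions).
if __name__ == "__main__":
    model = MMCANetSketch()
    emg = torch.randn(2, 8, 200)   # 8 sEMG channels, 200 samples per window
    aus = torch.randn(2, 4, 1000)  # 4 AUS transducers, 1000 samples per frame
    print(model(emg, aus).shape)   # -> torch.Size([2, 10])
```

Treating the two branch outputs as a short token sequence is just one plausible way to realize the "splice and fuse" step; the published model may interleave the modalities at finer feature levels.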
