CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, People's Republic of China.
Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, People's Republic of China.
J Neural Eng. 2021 Jan 25;18(1). doi: 10.1088/1741-2552/abca14.
Silent speech recognition (SSR) based on surface electromyography (sEMG) is an attractive non-acoustic modality of human-machine interfaces that convert neuromuscular electrophysiological signals into computer-readable textual messages. The speaking process involves complex neuromuscular activities spanning a large area over the facial and neck muscles, so the locations of the sEMG electrodes considerably affect the performance of an SSR system. However, most previous studies used only a limited number of electrodes that were placed empirically without prior quantitative analysis, resulting in uncertain and unreliable SSR outcomes.
In this study, high-density sEMG was proposed to provide a full representation of the articulatory muscle activities so that the optimal electrode configuration for SSR could be systematically explored. A total of 120 closely spaced electrodes were placed on the facial and neck muscles to collect high-density sEMG signals for classifying ten digits (0-9) silently spoken in both English and Chinese. The sequential forward selection algorithm was adopted to explore the optimal electrode configurations.
The results showed that the classification accuracy increased rapidly and then saturated as the number of selected electrodes increased from 1 to 120. Using only ten optimal electrodes achieved a classification accuracy of 86% for English and 94% for Chinese, whereas as many as 40 non-optimized electrodes were required to obtain comparable accuracies. Also, the optimally selected electrodes seemed to be mostly distributed on the neck rather than the facial region, and more electrodes were required for English recognition to achieve the same accuracy.
The findings of this study can provide useful guidelines about electrode placement for developing a clinically feasible SSR system and implementing a promising approach of human-machine interface, especially for patients with speaking difficulties.
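The sequential forward selection step named in the abstract can be illustrated with a minimal sketch. The abstract does not state the features, classifier, or scoring procedure used, so everything below is an assumption for illustration: `sequential_forward_selection`, `toy_accuracy`, and the per-channel weights are hypothetical names and values, and the toy score merely stands in for a cross-validated classification accuracy computed from the selected electrodes.

```python
from typing import Callable, List, Sequence, Tuple


def sequential_forward_selection(
    n_channels: int,
    score: Callable[[Sequence[int]], float],
    n_select: int,
) -> Tuple[List[int], List[float]]:
    """Greedily pick channels one at a time.

    At each step, the channel whose addition gives the highest score
    is added to the selected set. Returns the chosen channel indices
    (in selection order) and the score after each addition.
    """
    selected: List[int] = []
    history: List[float] = []
    remaining = set(range(n_channels))
    while remaining and len(selected) < n_select:
        # Sorting makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=lambda ch: score(selected + [ch]))
        selected.append(best)
        remaining.remove(best)
        history.append(score(selected))
    return selected, history


# Hypothetical per-channel "informativeness" values (not from the study).
TOY_WEIGHTS = [0.1, 0.5, 0.3, 0.05, 0.4, 0.2]


def toy_accuracy(channels: Sequence[int]) -> float:
    """Placeholder score with diminishing returns as channels are added.

    In a real pipeline this would be replaced by, for example, the
    cross-validated accuracy of a classifier trained on these channels.
    """
    miss = 1.0
    for ch in channels:
        miss *= 1.0 - TOY_WEIGHTS[ch]
    return 1.0 - miss


chosen, curve = sequential_forward_selection(len(TOY_WEIGHTS), toy_accuracy, 3)
print(chosen)  # [1, 4, 2]
print(curve)   # score after each added channel; gains shrink at each step
```

With a real accuracy function in place of `toy_accuracy`, the returned score curve is the kind of accuracy-versus-electrode-count relationship the abstract describes as rising quickly and then saturating.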