
Decoding Silent Speech Based on High-Density Surface Electromyogram Using Spatiotemporal Neural Network.

Author information

Chen Xi, Zhang Xu, Chen Xiang, Chen Xun

Publication information

IEEE Trans Neural Syst Rehabil Eng. 2023;31:2069-2078. doi: 10.1109/TNSRE.2023.3266299. Epub 2023 Apr 26.

Abstract

Finer-grained decoding at the phoneme or syllable level is a key technology for continuous recognition of silent speech based on surface electromyogram (sEMG). This paper aims to develop a novel syllable-level decoding method for continuous silent speech recognition (SSR) using a spatio-temporal end-to-end neural network. In the proposed method, the high-density sEMG (HD-sEMG) was first converted into a series of feature images, and then a spatio-temporal end-to-end neural network was applied to extract discriminative feature representations and to achieve syllable-level decoding. The effectiveness of the proposed method was verified with HD-sEMG data recorded by four 64-channel electrode arrays placed over the facial and laryngeal muscles of fifteen subjects subvocalizing 33 Chinese phrases consisting of 82 syllables. The proposed method outperformed the benchmark methods, achieving the highest phrase classification accuracy (97.17 ± 1.53%) and a lower character error rate (3.11 ± 1.46%). This study provides a promising way of decoding sEMG towards SSR, with great potential for applications in instant communication and remote control.
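The abstract reports a character error rate without defining it. The sketch below shows one common definition, assumed here and not confirmed by the paper: the total Levenshtein edit distance between decoded and reference character sequences, divided by the total number of reference characters. The function names and the two example phrases are hypothetical and are not taken from the study's 33-phrase corpus.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    # prev[j] holds the distance between the first i-1 items of ref
    # and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(references, hypotheses):
    """Assumed CER definition: total edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    return total_edits / total_chars


# Hypothetical phrases for illustration only.
references = ["打开电视", "关闭空调"]
hypotheses = ["打开电视", "关闭空"]  # one character dropped in the second phrase
cer = character_error_rate(references, hypotheses)
print(f"CER = {cer:.3f}")
```

With syllable-level decoding of Chinese, each decoded syllable maps to one character, so a measure of this kind penalizes every inserted, deleted, or substituted syllable rather than counting a whole phrase as simply right or wrong.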

