Silent Speech Decoding Using Spectrogram Features Based on Neuromuscular Activities.

Author information

Wang You, Zhang Ming, Wu RuMeng, Gao Han, Yang Meng, Luo Zhiyuan, Li Guang

Affiliations

State Key Laboratory of Industrial Control Technology, Institute of Cyber Systems and Control, Zhejiang University, Hangzhou 310027, China.

Department of Computer Science and Technology, School of Mechanical Electronic and Information Engineering, China University of Mining and Technology, Beijing 100083, China.

Publication information

Brain Sci. 2020 Jul 11;10(7):442. doi: 10.3390/brainsci10070442.

Abstract

Silent speech decoding is a novel application of the Brain-Computer Interface (BCI) based on articulatory neuromuscular activities, which reduces the difficulty of data acquisition and processing. In this paper, we investigate spatial features and decoders that can be used to recognize the neuromuscular signals. Surface electromyography (sEMG) data are recorded from human subjects during mimed speech. Specifically, we propose to apply transfer learning and deep learning by transforming the sEMG data into spectrograms, which carry rich time-domain and frequency-domain information and are regarded as channel-interactive. For transfer learning, an Xception model pre-trained on a large image dataset is used for feature generation. Three deep learning methods, Multi-Layer Perceptron, Convolutional Neural Network and bidirectional Long Short-Term Memory, are then trained on the extracted features and evaluated on recognizing the articulatory muscles' movements for our word set. The proposed decoders successfully recognized the silent speech, and bidirectional Long Short-Term Memory achieved the best accuracy of 90%, outperforming the other two methods. The experimental results demonstrate the validity of spectrogram features and deep learning algorithms.
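The first step of the pipeline, turning an sEMG channel into a spectrogram, can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration, since the abstract does not state them: the sampling rate `fs`, the window length `win_len`, the hop size `hop`, and the synthetic sine wave standing in for a recorded channel. The sketch uses a naive short-time Fourier transform in pure Python to stay self-contained; in practice a library routine such as `scipy.signal.spectrogram` would be the more usual choice.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def spectrogram(signal, fs, win_len=64, hop=32):
    """Naive STFT power spectrogram of a single channel.

    Returns (freqs, times, power), where power[k][t] is the squared
    magnitude at frequency bin k and time frame t.
    win_len and hop are illustrative defaults, not values from the paper.
    """
    window = hann(win_len)
    n_bins = win_len // 2 + 1  # non-negative frequencies only
    starts = range(0, len(signal) - win_len + 1, hop)
    power = [[0.0] * len(starts) for _ in range(n_bins)]
    for t, s in enumerate(starts):
        # Taper the frame to reduce spectral leakage.
        frame = [signal[s + i] * window[i] for i in range(win_len)]
        for k in range(n_bins):
            acc = sum(
                frame[i] * cmath.exp(-2j * math.pi * k * i / win_len)
                for i in range(win_len)
            )
            power[k][t] = abs(acc) ** 2
    freqs = [k * fs / win_len for k in range(n_bins)]
    times = [(s + win_len / 2) / fs for s in starts]
    return freqs, times, power


# Synthetic stand-in for one sEMG channel: a 125 Hz sine sampled at an
# assumed 1000 Hz (both values are hypothetical).
fs = 1000
signal = [math.sin(2 * math.pi * 125 * n / fs) for n in range(256)]

freqs, times, power = spectrogram(signal, fs)

# Strongest frequency bin in the first time frame.
peak_bin = max(range(len(freqs)), key=lambda k: power[k][0])
peak_freq = freqs[peak_bin]
```

Each channel would yield one such time-frequency array, and the resulting image-like arrays are the kind of input a pre-trained image model can consume for feature generation. How the channels are combined into a single image is not specified in the abstract, so it is left out of this sketch.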

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1529/7407985/3b14e8baf228/brainsci-10-00442-g001.jpg
