
Interpretation of a deep analysis of speech imagery features extracted by a capsule neural network.

Affiliations

Tecnológico Nacional de México/IT Chihuahua, Av. Tecnológico 2909, Chihuahua, 31310, Chihuahua, Mexico.


Publication information

Comput Biol Med. 2023 Jun;159:106909. doi: 10.1016/j.compbiomed.2023.106909. Epub 2023 Apr 14.

Abstract

Speech imagery has been successfully employed in developing Brain-Computer Interfaces because it is a novel mental strategy that generates brain activity more intuitively than evoked potentials or motor imagery. There are many methods to analyze speech imagery signals, but those based on deep neural networks achieve the best results. However, more research is necessary to understand the properties and features that describe imagined phonemes and words. In this paper, we analyze the statistical properties of speech imagery EEG signals from the KaraOne dataset to design a method that classifies imagined phonemes and words. With this analysis, we propose a Capsule Neural Network that categorizes speech imagery patterns into bilabial, nasal, consonant-vowel, and the vowels /iy/ and /uw/. The method is called Capsules for Speech Imagery Analysis (CapsK-SI). The input of CapsK-SI is a set of statistical features of EEG speech imagery signals. The architecture of the Capsule Neural Network is composed of a convolution layer, a primary capsule layer, and a class capsule layer. The average accuracy reached is 90.88%±7 for bilabial, 90.15%±8 for nasal, 94.02%±6 for consonant-vowel, 89.70%±8 for word-phoneme, 94.33%± for /iy/ vowel, and 94.21%±3 for /uw/ vowel detection. Finally, with the activity vectors of the CapsK-SI capsules, we generated brain maps to represent brain activity in the production of bilabial, nasal, and consonant-vowel signals.
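The class capsules described above output activity vectors whose length encodes the presence of a class, which is what makes the brain maps possible. The following is a minimal sketch of the standard capsule mechanics (squash nonlinearity and dynamic routing by agreement); the capsule dimensions and class counts are toy assumptions for illustration, not the configuration of CapsK-SI.

```python
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    """Capsule squash nonlinearity: shrinks short vectors toward zero and
    long vectors toward unit length while preserving orientation."""
    sq = np.sum(v ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * v / np.sqrt(sq + eps)

def route(u_hat, n_iter=3):
    """Dynamic routing by agreement between primary and class capsules.
    u_hat: (n_primary, n_class, dim) prediction vectors from the primary layer."""
    n_primary, n_class, dim = u_hat.shape
    b = np.zeros((n_primary, n_class))  # routing logits
    for _ in range(n_iter):
        # coupling coefficients: softmax over class capsules per primary capsule
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)
        s = (c[..., None] * u_hat).sum(axis=0)   # weighted sum per class capsule
        v = squash(s)                            # class-capsule activity vectors
        b += (u_hat * v[None]).sum(axis=-1)      # reward agreement with output
    return v

# Toy usage: 8 primary capsules routing to 2 class capsules with 4-D vectors
# (e.g. bilabial vs. non-bilabial in a binary detection setting).
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(8, 2, 4))
v = route(u_hat)
lengths = np.linalg.norm(v, axis=-1)  # vector length ~ class presence, in [0, 1)
```

In a full model, `u_hat` would come from a learned transform of the primary-capsule outputs, which themselves follow the convolution layer over the statistical EEG features; the activity vectors `v` are what the paper projects back into brain maps.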

