
Feasibility of decoding covert speech in ECoG with a Transformer trained on overt speech.

Affiliations

Department of Electronic and Information Engineering, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo, 184-8588, Japan.

Department of Neurosurgery, Juntendo University School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo, 113-8421, Japan.

Publication information

Sci Rep. 2024 May 20;14(1):11491. doi: 10.1038/s41598-024-62230-9.

Abstract

Several attempts at speech brain-computer interfacing (BCI) have been made to decode phonemes, sub-words, words, or sentences from invasive measurements such as the electrocorticogram (ECoG), recorded during auditory speech perception, overt speech, or imagined (covert) speech. Decoding sentences from covert speech is a challenging task. Sixteen epilepsy patients with intracranially implanted electrodes participated in this study, and ECoG was recorded during overt and covert speech of eight Japanese sentences, each consisting of three tokens. In particular, a Transformer neural network model was applied to decode text sentences from covert speech; the model was trained on ECoG obtained during overt speech. We first examined the proposed Transformer model using the same task for training and testing, and then evaluated the model's performance when trained on the overt task and used to decode covert speech. The Transformer model trained on covert speech achieved an average token error rate (TER) of 46.6% for decoding covert speech, whereas the model trained on overt speech achieved a TER of 46.3%. Therefore, the challenge of collecting training data for covert speech can be addressed by using overt speech. Covert speech decoding performance can be improved further by employing multiple overt speech recordings.
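The token error rate (TER) reported above is the standard edit-distance metric: the minimum number of token insertions, deletions, and substitutions needed to turn the decoded sentence into the reference, normalized by the reference length. A minimal sketch of that computation is shown below; the example token sequences are hypothetical and not taken from the study's stimulus set.

```python
def token_error_rate(reference, hypothesis):
    """TER: Levenshtein edit distance between token sequences,
    normalized by the number of reference tokens."""
    r, h = reference, hypothesis
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                       # deletions only
    for j in range(len(h) + 1):
        d[0][j] = j                       # insertions only
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(r)][len(h)] / len(r)

# Each sentence in the study consists of three tokens; these
# particular tokens are invented for illustration.
ref = ["kyou", "wa", "hareru"]
hyp = ["kyou", "wa", "kumori"]
print(token_error_rate(ref, hyp))  # one substitution out of three tokens
```

A TER near 46% therefore means that, on average, slightly under half of the three tokens per sentence had to be corrected to match the reference.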
