Mugler Emily M, Patton James L, Flint Robert D, Wright Zachary A, Schuele Stephan U, Rosenow Joshua, Shih Jerry J, Krusienski Dean J, Slutzky Marc W
Bioengineering, University of Illinois at Chicago, 851 S. Morgan Street, Chicago, IL 60607, USA.
J Neural Eng. 2014 Jun;11(3):035015. doi: 10.1088/1741-2560/11/3/035015. Epub 2014 May 19.
Although brain-computer interfaces (BCIs) can be used in several different ways to restore communication, communicative BCI has not approached the rate or efficiency of natural human speech. Electrocorticography (ECoG) has precise spatiotemporal resolution that enables recording of brain activity distributed over a wide area of cortex, such as during speech production. In this study, we sought to decode elements of speech production using ECoG.
Using ECoG in four subjects, we investigated words that together contain the entire set of phonemes in the General American accent. Using a linear classifier, we evaluated the degree to which individual phonemes within each word could be correctly identified from the cortical signal.
Classification accuracy reached up to 36% across all phonemes and up to 63% for a single phoneme. Further, misclassified phonemes followed the articulatory organization described in the phonology literature, which aided classification of whole words. Precise temporal alignment to phoneme onset was crucial for classification success.
We identified specific spatiotemporal features that aid classification, which could guide future applications. Word identification was equivalent to information transfer rates as high as 3.0 bits/s (33.6 words/min), supporting pursuit of speech articulation for BCI control.
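
For readers unfamiliar with how an information transfer rate relates to classification accuracy, the following is a minimal sketch using the Wolpaw formula that is common in the BCI literature. The abstract does not state which ITR definition the authors used, so the formula choice, the function names, and the example inputs are all assumptions for illustration and do not reproduce the study's calculation.

```python
import math


def bits_per_selection(n_classes: int, accuracy: float) -> float:
    """Wolpaw bits per selection for an n-class choice.

    Assumes equally likely classes and errors spread uniformly over the
    wrong classes (assumptions of the formula, not claims about the study).
    """
    if n_classes < 2:
        raise ValueError("n_classes must be at least 2")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    bits = math.log2(n_classes)
    # Each term is skipped when its probability is zero (0 * log 0 := 0).
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_classes - 1))
    return bits


def itr_bits_per_second(n_classes: int, accuracy: float,
                        selections_per_second: float) -> float:
    """Information transfer rate: bits per selection times selection rate."""
    return bits_per_selection(n_classes, accuracy) * selections_per_second


if __name__ == "__main__":
    # Hypothetical inputs: a 4-way choice, 90% accuracy, one selection/second.
    print(itr_bits_per_second(4, 0.9, 1.0))
```

Under this definition, accuracy at chance level (1/N for N classes) yields zero bits, and perfect accuracy yields log2(N) bits per selection.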