Vivian L. Smith Department of Neurosurgery, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX 77030, United States of America.
Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030, United States of America.
J Neural Eng. 2023 Aug 14;20(4). doi: 10.1088/1741-2552/ace9fb.
Speech production relies on a widely distributed brain network. However, research and development of speech brain-computer interfaces (speech-BCIs) has typically focused on decoding speech only from superficial subregions readily accessible by subdural grid arrays, typically placed over the sensorimotor cortex. Alternatively, stereo-electroencephalography (sEEG) enables access to distributed brain regions using multiple depth electrodes with lower surgical risks, especially in patients with brain injuries resulting in aphasia and other speech disorders.

To investigate the decoding potential of widespread electrode coverage across multiple cortical sites, we used a naturalistic continuous speech production task. We obtained sEEG recordings from eight participants while they read sentences aloud. We trained linear classifiers to decode distinct speech components (articulatory components and phonemes) solely from broadband gamma activity and evaluated decoding performance using nested five-fold cross-validation.

We achieved average classification accuracies of 18.7% across 9 places of articulation (e.g. bilabials, palatals), 26.5% across 5 manner-of-articulation (MOA) labels (e.g. affricates, fricatives), and 4.81% across 38 phonemes. The highest classification accuracies achieved with a single large dataset were 26.3% for place of articulation, 35.7% for MOA, and 9.88% for phonemes. Electrodes contributing high decoding power were distributed across multiple sulcal and gyral sites in both dominant and non-dominant hemispheres, including ventral sensorimotor, inferior frontal, superior temporal, and fusiform cortices.
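The broadband gamma features used for decoding are conventionally obtained by band-pass filtering and taking the analytic amplitude. The sketch below illustrates that standard approach on a synthetic single-channel trace; the 70-150 Hz band, sampling rate, and filter order are illustrative assumptions, not the authors' exact pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

# Stand-in sEEG trace (not real data): 2 s of noise at an assumed 1 kHz rate.
fs = 1000.0
t = np.arange(0, 2.0, 1 / fs)
signal = np.random.default_rng(0).standard_normal(t.size)

# Band-pass filter to an assumed broadband gamma range (70-150 Hz).
b, a = butter(4, [70 / (fs / 2), 150 / (fs / 2)], btype="bandpass")
gamma = filtfilt(b, a, signal)

# The Hilbert transform yields the instantaneous amplitude envelope,
# which would then be windowed and averaged to form classifier features.
envelope = np.abs(hilbert(gamma))
print(envelope.shape)
```

Zero-phase filtering (`filtfilt`) is typically preferred here so the envelope is not delayed relative to the speech labels.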
Rather than finding a distinct cortical locus for each speech component, we observed neural correlates of both articulatory and phonetic components in multiple hubs of a widespread language production network.

These results reveal distributed cortical representations whose activity can enable decoding of speech components during continuous speech with this minimally invasive recording method, elucidating language neurobiology and identifying neural targets for future speech-BCIs.
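The evaluation scheme named in the abstract, a linear classifier scored with nested five-fold cross-validation, can be sketched as follows. This is a minimal illustration on random stand-in features and labels (five classes, mirroring the MOA labels); the classifier choice and hyperparameter grid are assumptions, not the study's exact configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, GridSearchCV, cross_val_score

# Hypothetical stand-in data: trials x broadband-gamma features,
# with one of 5 manner-of-articulation labels per trial.
rng = np.random.default_rng(0)
n_trials, n_features, n_classes = 200, 32, 5
X = rng.standard_normal((n_trials, n_features))
y = rng.integers(0, n_classes, size=n_trials)

# Inner loop: tune the regularization strength of a linear classifier.
inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0]},
    cv=inner_cv,
)

# Outer loop: estimate generalization accuracy on held-out folds,
# so hyperparameter selection never sees the test fold.
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(clf, X, y, cv=outer_cv)
print(f"nested 5-fold accuracy: {scores.mean():.3f}")
```

With random labels, accuracy hovers near the 20% chance level for five classes, which is the appropriate baseline against which the reported 26.5% MOA accuracy should be read.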