Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai 200040, China.
National Center for Neurological Disorders, Shanghai 200052, China.
Sci Adv. 2023 Jun 9;9(23):eadh0478. doi: 10.1126/sciadv.adh0478.
Recent studies have demonstrated the feasibility of speech brain-computer interfaces (BCIs) as a clinically valid treatment for restoring speech in patients with communication disorders who speak nontonal languages. However, speech BCIs for tonal languages are challenging because producing lexical tones requires additional precise control of laryngeal movements. The decoding model should therefore emphasize features from tone-related cortical areas. Here, we designed a modularized multistream neural network that directly synthesizes tonal language speech from intracranial recordings. Inspired by neuroscience findings, the network decoded lexical tones and base syllables independently via parallel streams of neural network modules. Speech was synthesized by combining the tonal syllable labels with nondiscriminant speech neural activity. Compared with commonly used baseline models, our proposed models achieved higher performance with modest training data and computational costs. These findings suggest a potential strategy for approaching tonal language speech restoration.
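The parallel-stream idea in the abstract, with one stream decoding the lexical tone and another decoding the base syllable before the two labels are combined, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the label sets, the feature sizes, the random weights, and the use of a single linear layer with softmax per stream stand in for the neural network modules, whose actual architecture the abstract does not describe. The synthesis step that combines labels with nondiscriminant neural activity is not modeled.

```python
import math
import random

# Hypothetical label sets; the paper's actual tone and syllable inventory is not given here.
TONES = ["tone1", "tone2", "tone3", "tone4"]
SYLLABLES = ["ma", "mi"]


def linear_stream(features, weights, bias):
    """One decoding stream: a linear layer followed by a numerically stable softmax."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_params(n_out, n_in, rng):
    """Untrained placeholder weights; a real model would learn these from recordings."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def decode_tonal_syllable(tone_features, syllable_features, tone_params, syllable_params):
    """Run the two streams independently, then combine their predicted labels."""
    tone_probs = linear_stream(tone_features, *tone_params)
    syllable_probs = linear_stream(syllable_features, *syllable_params)
    tone = TONES[tone_probs.index(max(tone_probs))]
    syllable = SYLLABLES[syllable_probs.index(max(syllable_probs))]
    return syllable, tone, tone_probs, syllable_probs


# Usage with synthetic features standing in for neural activity from
# tone-related and syllable-related electrodes (sizes 8 and 12 are arbitrary).
rng = random.Random(0)
tone_params = random_params(len(TONES), 8, rng)
syllable_params = random_params(len(SYLLABLES), 12, rng)
tone_features = [rng.random() for _ in range(8)]
syllable_features = [rng.random() for _ in range(12)]

syllable, tone, tone_probs, syllable_probs = decode_tonal_syllable(
    tone_features, syllable_features, tone_params, syllable_params
)
print(syllable, tone)
```

Each stream sees only its own feature set, so the tone decoder can draw on tone-related channels without competing with the syllable decoder for model capacity. This is the design motivation the abstract gives for the modular layout.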