Iterative alignment discovery of speech-associated neural activity.

Author information

Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, United States of America.

Department of Neurology, Johns Hopkins Medicine, Baltimore, MD 21287, United States of America.

Publication information

J Neural Eng. 2024 Aug 28;21(4):046056. doi: 10.1088/1741-2552/ad663c.

Abstract

Brain-computer interfaces (BCIs) have the potential to preserve or restore speech in patients with neurological disorders that weaken the muscles involved in speech production. However, successful training of low-latency speech synthesis and recognition models requires alignment of neural activity with intended phonetic or acoustic output with high temporal precision. This is particularly challenging in patients who cannot produce audible speech, as ground truth with which to pinpoint neural activity synchronized with speech is not available. In this study, we present a new iterative algorithm for neural voice activity detection (nVAD) called iterative alignment discovery dynamic time warping (IAD-DTW) that integrates DTW into the loss function of a deep neural network (DNN). The algorithm is designed to discover the alignment between a patient's electrocorticographic (ECoG) neural responses and their attempts to speak during collection of data for training BCI decoders for speech synthesis and recognition. To demonstrate the effectiveness of the algorithm, we tested its accuracy in predicting the onset and duration of acoustic signals produced by able-bodied patients with intact speech undergoing short-term diagnostic ECoG recordings for epilepsy surgery. We simulated a lack of ground truth by randomly perturbing the temporal correspondence between neural activity and an initial single estimate for all speech onsets and durations. We examined the model's ability to overcome these perturbations to estimate ground truth. IAD-DTW showed no notable degradation (<1% absolute decrease in accuracy) in performance in these simulations, even in the case of maximal misalignments between speech and silence. IAD-DTW is computationally inexpensive and can be easily integrated into existing DNN-based nVAD approaches, as it pertains only to the final loss computation. This approach makes it possible to train speech BCI algorithms using ECoG data from patients who are unable to produce audible speech, including those with Locked-In Syndrome.
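
To make the role of DTW in the abstract more concrete, the following is a minimal sketch of classic dynamic time warping between a frame-wise sequence of predicted speech probabilities and an initial, possibly misaligned, estimate of the speech/silence labels. It is not the authors' IAD-DTW implementation: the abstract states only that DTW is integrated into the DNN loss, and it does not give the cost function, the network, or the iteration scheme. The function names `dtw_align` and `realign_labels`, the absolute-difference cost, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def dtw_align(pred, target):
    """Classic DTW between two 1-D sequences (absolute-difference cost).

    Returns the total alignment cost and the warping path as a list of
    (pred_index, target_index) pairs. Illustrative only; not IAD-DTW itself.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    n, m = len(pred), len(target)

    # Pairwise local cost between every predicted frame and every label frame.
    local = np.abs(pred[:, None] - target[None, :])

    # Accumulated cost with a one-cell border of infinities.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )

    # Backtrack from the end of both sequences to the start.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        ]
        _, i, j = min(steps, key=lambda s: s[0])
        path.append((i - 1, j - 1))
    path.reverse()
    return acc[n, m], path


def realign_labels(pred, target):
    """Warp the label estimate onto the time axis of the predictions."""
    _, path = dtw_align(pred, target)
    warped = np.zeros(len(pred))
    for i, j in path:
        warped[i] = target[j]  # the last matched label frame wins
    return warped


# Toy data (hypothetical): the model output suggests speech in frames 6-11,
# while the initial label estimate places speech in frames 2-7.
probs = np.array([0.1] * 6 + [0.9] * 6 + [0.1] * 4)
initial_estimate = np.array([0.0] * 2 + [1.0] * 6 + [0.0] * 8)
print(realign_labels(probs, initial_estimate))
```

In an iterative scheme of the general kind the abstract describes, warped labels like these could serve as updated training targets for the next pass, so that the label estimate and the model improve together. How IAD-DTW actually couples the alignment to the loss is specified in the paper, not here.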

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9eb1/11351572/77d2b7bb0dd1/jnead663cf1_hr.jpg
