Objective speech intelligibility prediction using a deep learning model with continuous speech-evoked cortical auditory responses.

Author Information

Na Youngmin, Joo Hyosung, Trang Le Thi, Quan Luong Do Anh, Woo Jihwan

Affiliations

Department of Biomedical Engineering, University of Ulsan, Ulsan, South Korea.

Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea.

Publication Information

Front Neurosci. 2022 Aug 18;16:906616. doi: 10.3389/fnins.2022.906616. eCollection 2022.

Abstract

Auditory prostheses provide an opportunity for the rehabilitation of hearing-impaired patients. Speech intelligibility can be used to estimate the extent to which an auditory prosthesis improves the user's speech comprehension. Although behavioral speech intelligibility testing is the gold standard, its subjectiveness limits precise evaluation. Here, we used a convolutional neural network to predict speech intelligibility from electroencephalography (EEG). Sixty-four-channel EEGs were recorded from 87 adult participants with normal hearing. Sentences spectrally degraded by 2-, 3-, 4-, 5-, and 8-channel vocoders were used to create relatively low speech intelligibility conditions, and a Korean sentence recognition test was administered. The speech intelligibility scores were divided into 41 discrete levels ranging from 0 to 100% in steps of 2.5%; three scores (30.0, 37.5, and 40.0%) were not collected. Two speech features, the speech temporal envelope (ENV) and phoneme (PH) onsets, were used to extract continuous-speech EEG responses for speech intelligibility prediction. The deep learning model was trained on a dataset of event-related potentials (ERPs), or on correlation coefficients between the ERPs and the ENV, between the ERPs and the PH onsets, or between the ERPs and the product of PH and ENV (PHENV). The speech intelligibility prediction accuracies were 97.33% (ERP), 99.42% (ENV), 99.55% (PH), and 99.91% (PHENV). The models were interpreted using the occlusion sensitivity approach. According to the occlusion sensitivity maps, the informative electrodes of the ENV model were located over the occipital area, whereas those of the phoneme-based models (PH and PHENV) were located over language-processing areas. Of the models tested, the PHENV model achieved the best speech intelligibility prediction accuracy and may facilitate clinical prediction of speech intelligibility with a more comfortable speech intelligibility test.
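
As a rough illustration of the feature-construction step described in the abstract (not the authors' published code), the Python sketch below builds a speech temporal envelope (ENV), a phoneme-onset impulse train (PH), and their product (PHENV), then computes a Pearson correlation between each feature and a single EEG channel. The sampling rates, the Hilbert-transform envelope, the impulse-train encoding of phoneme onsets, and all function names are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the authors' code): construct ENV, PH, and
# PHENV speech features and correlate them with one EEG channel.
import numpy as np
from scipy.signal import hilbert, resample


def speech_envelope(speech, fs_audio, fs_eeg):
    """Broadband temporal envelope (ENV), downsampled to the EEG sampling rate."""
    env = np.abs(hilbert(speech))                    # Hilbert envelope (assumed method)
    n_out = int(len(speech) * fs_eeg / fs_audio)
    return resample(env, n_out)


def phoneme_onset_train(onset_times_s, n_samples, fs_eeg):
    """Impulse train (PH) with a unit pulse at each phoneme onset time."""
    ph = np.zeros(n_samples)
    idx = np.round(np.asarray(onset_times_s) * fs_eeg).astype(int)
    ph[idx[idx < n_samples]] = 1.0
    return ph


def feature_eeg_correlation(eeg, feature):
    """Pearson correlation coefficient between one EEG channel and a speech feature."""
    return np.corrcoef(eeg, feature)[0, 1]


if __name__ == "__main__":
    fs_audio, fs_eeg = 16_000, 64                    # assumed sampling rates
    rng = np.random.default_rng(0)
    speech = rng.standard_normal(fs_audio * 5)       # 5 s of placeholder audio
    eeg = rng.standard_normal(fs_eeg * 5)            # one placeholder EEG channel

    env = speech_envelope(speech, fs_audio, fs_eeg)
    ph = phoneme_onset_train([0.3, 0.9, 1.4, 2.2, 3.1], len(env), fs_eeg)
    phenv = ph * env                                 # PHENV: product of PH and ENV

    for name, feat in [("ENV", env), ("PH", ph), ("PHENV", phenv)]:
        print(name, feature_eeg_correlation(eeg, feat))
```

In the study, correlation coefficients of this kind (computed per electrode and listening condition) served as inputs to a convolutional neural network that classified the discrete speech intelligibility levels; the network was then interpreted with occlusion sensitivity maps.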
