Objective speech intelligibility prediction using a deep learning model with continuous speech-evoked cortical auditory responses.

Author Information

Na Youngmin, Joo Hyosung, Trang Le Thi, Quan Luong Do Anh, Woo Jihwan

Affiliations

Department of Biomedical Engineering, University of Ulsan, Ulsan, South Korea.

Department of Electrical, Electronic and Computer Engineering, University of Ulsan, Ulsan, South Korea.

Publication Information

Front Neurosci. 2022 Aug 18;16:906616. doi: 10.3389/fnins.2022.906616. eCollection 2022.

Abstract

Auditory prostheses provide an opportunity for the rehabilitation of hearing-impaired patients. Speech intelligibility can be used to estimate the extent to which an auditory prosthesis improves the user's speech comprehension. Although behavioral speech intelligibility testing is the gold standard, its subjectiveness limits precise evaluation. Here, we used a convolutional neural network to predict speech intelligibility from electroencephalography (EEG). Sixty-four-channel EEGs were recorded from 87 adult participants with normal hearing. Sentences spectrally degraded by 2-, 3-, 4-, 5-, and 8-channel vocoders were used to create relatively low speech intelligibility conditions, and a Korean sentence recognition test was administered. The speech intelligibility scores were divided into 41 discrete levels ranging from 0 to 100% in steps of 2.5%; three scores (30.0, 37.5, and 40.0%) were not collected. Two speech features, the speech temporal envelope (ENV) and phoneme (PH) onsets, were used to extract continuous-speech EEG responses for speech intelligibility prediction. The deep learning model was trained on a dataset of event-related potentials (ERPs), or on correlation coefficients between the ERPs and the ENV, between the ERPs and the PH onsets, or between the ERPs and the product of PH and ENV (PHENV). The speech intelligibility prediction accuracies were 97.33% (ERP), 99.42% (ENV), 99.55% (PH), and 99.91% (PHENV). The models were interpreted using the occlusion sensitivity approach. According to the occlusion sensitivity maps, the informative electrodes of the ENV model were located over the occipital area, whereas those of the phoneme-based models (PH and PHENV) were located over language-processing areas. Of the models tested, the PHENV model achieved the best speech intelligibility prediction accuracy and may facilitate clinical prediction of speech intelligibility with a more comfortable speech intelligibility test.
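
As a rough illustration of the feature-construction step described in the abstract (not the authors' published code), the Python sketch below builds a speech temporal envelope (ENV), a phoneme-onset impulse train (PH), and their product (PHENV), then computes a Pearson correlation between each feature and a single EEG channel. The sampling rates, the Hilbert-transform envelope, the impulse-train encoding of phoneme onsets, and all function names are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the authors' code): construct ENV, PH, and
# PHENV speech features and correlate them with one EEG channel.
import numpy as np
from scipy.signal import hilbert, resample


def speech_envelope(speech, fs_audio, fs_eeg):
    """Broadband temporal envelope (ENV), downsampled to the EEG sampling rate."""
    env = np.abs(hilbert(speech))                    # Hilbert envelope (assumed method)
    n_out = int(len(speech) * fs_eeg / fs_audio)
    return resample(env, n_out)


def phoneme_onset_train(onset_times_s, n_samples, fs_eeg):
    """Impulse train (PH) with a unit pulse at each phoneme onset time."""
    ph = np.zeros(n_samples)
    idx = np.round(np.asarray(onset_times_s) * fs_eeg).astype(int)
    ph[idx[idx < n_samples]] = 1.0
    return ph


def feature_eeg_correlation(eeg, feature):
    """Pearson correlation coefficient between one EEG channel and a speech feature."""
    return np.corrcoef(eeg, feature)[0, 1]


if __name__ == "__main__":
    fs_audio, fs_eeg = 16_000, 64                    # assumed sampling rates
    rng = np.random.default_rng(0)
    speech = rng.standard_normal(fs_audio * 5)       # 5 s of placeholder audio
    eeg = rng.standard_normal(fs_eeg * 5)            # one placeholder EEG channel

    env = speech_envelope(speech, fs_audio, fs_eeg)
    ph = phoneme_onset_train([0.3, 0.9, 1.4, 2.2, 3.1], len(env), fs_eeg)
    phenv = ph * env                                 # PHENV: product of PH and ENV

    for name, feat in [("ENV", env), ("PH", ph), ("PHENV", phenv)]:
        print(name, feature_eeg_correlation(eeg, feat))
```

In the study, correlation coefficients of this kind (computed per electrode and listening condition) served as inputs to a convolutional neural network that classified the discrete speech intelligibility levels; the network was then interpreted with occlusion sensitivity maps.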
