Improving the Intelligibility of Speech for Simulated Electric and Acoustic Stimulation Using Fully Convolutional Neural Networks.

Publication information

IEEE Trans Neural Syst Rehabil Eng. 2021;29:184-195. doi: 10.1109/TNSRE.2020.3042655. Epub 2021 Feb 26.

DOI:10.1109/TNSRE.2020.3042655

Abstract

Combined electric and acoustic stimulation (EAS) has demonstrated better speech recognition than a conventional cochlear implant (CI) and yields satisfactory performance under quiet conditions. However, when noise is present, both the electric signal and the acoustic signal may be distorted, resulting in poor recognition performance. To suppress noise effects, speech enhancement (SE) is a necessary unit in EAS devices. Recently, a time-domain speech enhancement algorithm based on fully convolutional neural networks (FCN) with a short-time objective intelligibility (STOI)-based objective function (termed FCN(S) for short) has received increasing attention due to its simple structure and its effectiveness in restoring clean speech signals from noisy counterparts. With evidence showing the benefits of FCN(S) for normal speech, this study sets out to assess its ability to improve the intelligibility of EAS simulated speech. Objective evaluations and listening tests were conducted to examine the performance of FCN(S) in improving the speech intelligibility of normal and vocoded speech in noisy environments. The experimental results show that, compared with the traditional minimum mean-square-error SE method and the deep denoising autoencoder SE method, FCN(S) obtains larger gains in speech intelligibility for normal as well as vocoded speech. This study, being the first to evaluate deep learning SE approaches for EAS, confirms that FCN(S) is an effective SE approach that may potentially be integrated into an EAS processor to benefit users in noisy environments.

Similar articles

A Deep Denoising Autoencoder Approach to Improving the Intelligibility of Vocoded Speech in Cochlear Implant Simulation.

IEEE Trans Biomed Eng. 2017 Jul;64(7):1568-1578. doi: 10.1109/TBME.2016.2613960. Epub 2016 Sep 27.

Effects of Additional Low-Pass-Filtered Speech on Listening Effort for Noise-Band-Vocoded Speech in Quiet and in Noise.

Ear Hear. 2019 Jan/Feb;40(1):3-17. doi: 10.1097/AUD.0000000000000587.

Speech Perception With Combined Electric-Acoustic Stimulation: A Simulation and Model Comparison.

Ear Hear. 2015 Nov-Dec;36(6):e314-25. doi: 10.1097/AUD.0000000000000178.

Predicting the intelligibility of vocoded speech.

Ear Hear. 2011 May-Jun;32(3):331-8. doi: 10.1097/AUD.0b013e3181ff3515.

Comparing the effects of reverberation and of noise on speech recognition in simulated electric-acoustic listening.

J Acoust Soc Am. 2012 Jan;131(1):416-23. doi: 10.1121/1.3664101.

A physiologically-inspired model reproducing the speech intelligibility benefit in cochlear implant listeners with residual acoustic hearing.

Hear Res. 2017 Feb;344:50-61. doi: 10.1016/j.heares.2016.10.023. Epub 2016 Nov 9.

Potential Benefits of an Integrated Electric-Acoustic Sound Processor with Children: A Preliminary Report.

J Am Acad Audiol. 2017 Feb;28(2):127-140. doi: 10.3766/jaaa.15133.

Masking release with changing fundamental frequency: Electric acoustic stimulation resembles normal hearing subjects.

Hear Res. 2017 Jul;350:226-234. doi: 10.1016/j.heares.2017.05.004. Epub 2017 May 11.

Speech perception in individuals with auditory neuropathy.

J Speech Lang Hear Res. 2006 Apr;49(2):367-80. doi: 10.1044/1092-4388(2006/029).

Cited by

Deep Learning-Based Speech Enhancement With a Loss Trading Off the Speech Distortion and the Noise Residue for Cochlear Implants.

Front Med (Lausanne). 2021 Nov 8;8:740123. doi: 10.3389/fmed.2021.740123. eCollection 2021.
