Improving the Intelligibility of Speech for Simulated Electric and Acoustic Stimulation Using Fully Convolutional Neural Networks.

Publication information

IEEE Trans Neural Syst Rehabil Eng. 2021;29:184-195. doi: 10.1109/TNSRE.2020.3042655. Epub 2021 Feb 26.

Abstract

Combined electric and acoustic stimulation (EAS) has demonstrated better speech recognition than conventional cochlear implants (CI) and has yielded satisfactory performance under quiet conditions. However, when noise is present, both the electric signal and the acoustic signal may be distorted, resulting in poor recognition performance. To suppress noise effects, speech enhancement (SE) is a necessary unit in EAS devices. Recently, a time-domain speech enhancement algorithm based on fully convolutional neural networks (FCN) with a short-time objective intelligibility (STOI)-based objective function (termed FCN(S) for short) has received increasing attention due to its simple structure and its effectiveness in restoring clean speech signals from noisy counterparts. With evidence showing the benefits of FCN(S) for normal speech, this study sets out to assess its ability to improve the intelligibility of EAS-simulated speech. Objective evaluations and listening tests were conducted to examine the performance of FCN(S) in improving the intelligibility of normal and vocoded speech in noisy environments. The experimental results show that, compared with the traditional minimum mean-square-error SE method and the deep denoising autoencoder SE method, FCN(S) yields larger gains in speech intelligibility for both normal and vocoded speech. This study, being the first to evaluate deep learning SE approaches for EAS, confirms that FCN(S) is an effective SE approach that may potentially be integrated into an EAS processor to benefit users in noisy environments.
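
To make the phrase "time-domain, fully convolutional" concrete, the following is a minimal sketch, not the authors' model. It assumes a single-channel stack of 1-D convolutions with zero padding, a tanh activation after every layer, and random untrained kernels. The names `conv1d_same` and `fcn_forward` are hypothetical, and the STOI-based objective used for training in the paper is not implemented here. The point it illustrates is that a network with only convolutional layers maps a waveform of any length to an output waveform of the same length.

```python
import math
import random


def conv1d_same(signal, kernel):
    """Slide `kernel` over `signal` with zero padding so the output
    has the same number of samples as the input."""
    pad_left = (len(kernel) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - pad_left
            if 0 <= idx < len(signal):  # samples outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def fcn_forward(waveform, kernels):
    """Forward pass through a stack of convolution layers.
    There is no dense layer, so any input length is accepted.
    The tanh after each layer is an assumption made for this sketch."""
    x = list(waveform)
    for kernel in kernels:
        x = [math.tanh(v) for v in conv1d_same(x, kernel)]
    return x


# Hypothetical, untrained weights: three layers, five taps each.
rng = random.Random(0)
kernels = [[rng.uniform(-0.5, 0.5) for _ in range(5)] for _ in range(3)]

# A toy "noisy" waveform: a sinusoid plus Gaussian noise.
noisy = [math.sin(2 * math.pi * 5 * t / 100) + rng.gauss(0, 0.3) for t in range(100)]
enhanced = fcn_forward(noisy, kernels)
```

With trained weights, `enhanced` would be the denoised waveform. Here it only shows the shape-preserving, sample-in/sample-out structure that lets such a model operate directly on raw audio.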
