Convolutional Neural Network-based Speech Enhancement for Cochlear Implant Recipients.

Authors

Mamun Nursadul, Khorram Soheil, Hansen John H L

Affiliation

Cochlear Implant Processing Laboratory, Center for Robust Speech Systems (CRSS-CILab), Department of Electrical & Computer Engineering, The University of Texas at Dallas.

Publication information

Interspeech. 2019 Sep;2019:4265-4269. doi: 10.21437/interspeech.2019-1850.

Abstract

Attempts to develop speech enhancement algorithms that improve speech intelligibility for cochlear implant (CI) users have met with limited success. To improve speech enhancement methods for CI users, we propose to perform speech enhancement in a cochlear filter-bank feature space, a feature set designed specifically for CI users based on CI auditory stimuli. We leverage a convolutional neural network (CNN) to extract both stationary and non-stationary components of environmental acoustics and speech. We propose three CNN architectures: (1) a vanilla CNN that directly generates the enhanced signal; (2) a spectral-subtraction-style CNN (SS-CNN) that first predicts the noise and then generates the enhanced signal by subtracting the predicted noise from the noisy signal; (3) a Wiener-style CNN (Wiener-CNN) that generates an optimal mask for suppressing noise. A significant drawback of the proposed networks is that they introduce considerable delay, which limits their real-time application for CI users. To address this, this study also considers causal variations of these networks. Our experiments show that the proposed networks (in both causal and non-causal forms) achieve significant improvement over existing baseline systems. We also found that the causal Wiener-CNN outperforms the other networks and achieves the best overall envelope coefficient measure (ECM). The proposed algorithms represent a viable option for implementation on the CCi-MOBILE research platform as a pre-processor for CI users in naturalistic environments.
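The three output styles and the causal/non-causal distinction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the flooring of the subtraction at zero, and the sigmoid used to bound the mask are assumptions made for illustration, and the "network outputs" are plain lists standing in for what a trained CNN would produce for one filter-bank channel.

```python
import math

def conv1d_time(frames, kernel, causal):
    """Slide `kernel` along a sequence of per-frame values (one channel).

    causal=True  : output[t] uses only frames[t-k+1 .. t] (left zero-padding),
                   so no look-ahead delay is needed.
    causal=False : the kernel is centred on t and also reads future frames.
    """
    k = len(kernel)
    left = k - 1 if causal else (k - 1) // 2
    right = 0 if causal else k - 1 - left
    padded = [0.0] * left + list(frames) + [0.0] * right
    return [sum(kernel[j] * padded[t + j] for j in range(k))
            for t in range(len(frames))]

def vanilla_output(net_out):
    # Vanilla style: the network output is taken as the enhanced signal.
    return list(net_out)

def ss_output(noisy, noise_est):
    # Spectral-subtraction style: subtract the predicted noise from the
    # noisy input. Flooring at zero is an assumption of this sketch.
    return [max(y - n, 0.0) for y, n in zip(noisy, noise_est)]

def wiener_output(noisy, logits):
    # Wiener style: the network predicts a mask that scales the noisy input.
    # A sigmoid keeps the mask in (0, 1); this choice is an assumption.
    return [y / (1.0 + math.exp(-z)) for y, z in zip(noisy, logits)]

if __name__ == "__main__":
    impulse = [0, 0, 1, 0, 0]
    kernel = [1, 2, 3]
    # Causal: the response never appears before the impulse at t=2.
    print(conv1d_time(impulse, kernel, causal=True))
    # Non-causal: output at t=1 already depends on the frame at t=2.
    print(conv1d_time(impulse, kernel, causal=False))
```

In the causal case the response to an impulse at frame 2 is confined to frames 2 and later, whereas the centred kernel responds at frame 1, which means a real-time system would have to wait for a future frame before producing its output.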
