Convolutional Neural Network-based Speech Enhancement for Cochlear Implant Recipients

Authors

Mamun Nursadul, Khorram Soheil, Hansen John H L

Affiliation

Cochlear Implant Processing Laboratory, Center for Robust Speech Systems (CRSS-CILab), Department of Electrical & Computer Engineering, The University of Texas at Dallas.

Publication

Interspeech. 2019 Sep;2019:4265-4269. doi: 10.21437/interspeech.2019-1850.

PMID:34307643

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8296973/

Abstract

Attempts to develop speech enhancement algorithms that improve speech intelligibility for cochlear implant (CI) users have met with limited success. To improve speech enhancement for CI users, we propose performing it in a cochlear filter-bank feature space, a feature set designed specifically for CI users and based on CI auditory stimuli. We use a convolutional neural network (CNN) to extract both stationary and non-stationary components of environmental acoustics and speech. We propose three CNN architectures: (1) a vanilla CNN that directly generates the enhanced signal; (2) a spectral-subtraction-style CNN (SS-CNN) that first predicts the noise and then generates the enhanced signal by subtracting that noise from the noisy signal; and (3) a Wiener-style CNN (Wiener-CNN) that generates an optimal mask for suppressing noise. An important drawback of the proposed networks is that they introduce considerable delay, which limits their real-time use for CI users. To address this, the study also considers causal variants of these networks. Our experiments show that the proposed networks, in both causal and non-causal forms, achieve significant improvement over existing baseline systems. We also find that the causal Wiener-CNN outperforms the other networks and yields the best overall envelope coefficient measure (ECM). The proposed algorithms represent a viable option for implementation on the CCi-MOBILE research platform as a pre-processor for CI users in naturalistic environments.
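The three output styles and the causal versus non-causal distinction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no layer sizes, kernels, or training details, so a single hand-written 1-D convolution along time stands in for a trained CNN. The function names (`conv_time`, `enhance`), the zero floor after subtraction, and the clipping of the mask to [0, 1] are all assumptions made for illustration.

```python
# Illustrative sketch only. One fixed 1-D convolution along time stands in
# for a trained CNN; all names and numeric choices here are hypothetical.

def conv_time(frames, kernel, causal):
    """Filter each filter-bank channel along the time axis (zero padding).

    frames: list of T frames, each a list of C channel envelope values.
    kernel: list of K taps, K odd.
    causal=True  -> output at frame t uses frames t-(K-1) .. t only.
    causal=False -> kernel is centred, so output at t also needs (K-1)//2
                    future frames, i.e. look-ahead delay.
    """
    T, C, K = len(frames), len(frames[0]), len(kernel)
    offset = K - 1 if causal else (K - 1) // 2
    out = []
    for t in range(T):
        row = []
        for c in range(C):
            acc = 0.0
            for k in range(K):
                src = t + k - offset
                if 0 <= src < T:  # frames outside the signal count as zero
                    acc += kernel[k] * frames[src][c]
            row.append(acc)
        out.append(row)
    return out


def enhance(noisy, net_out, style):
    """Combine the noisy features with a network output in one of three styles."""
    if style == "vanilla":
        # The network output is taken directly as the enhanced signal.
        return net_out
    if style == "ss":
        # The network output is a noise estimate; subtract it from the noisy
        # input. Flooring at zero is an assumption of this sketch.
        return [[max(x - n, 0.0) for x, n in zip(xr, nr)]
                for xr, nr in zip(noisy, net_out)]
    if style == "wiener":
        # The network output is a mask; clipping to [0, 1] is an assumption.
        return [[x * min(max(m, 0.0), 1.0) for x, m in zip(xr, mr)]
                for xr, mr in zip(noisy, net_out)]
    raise ValueError(f"unknown style: {style}")


if __name__ == "__main__":
    noisy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 frames, 2 channels
    taps = [0.25, 0.5, 0.25]
    print(conv_time(noisy, taps, causal=True)[0])   # depends on frame 0 only
    print(conv_time(noisy, taps, causal=False)[0])  # also depends on frame 1
```

In this sketch the causal variant trades context for latency: its output at a given frame never changes when later frames change, which is the property that matters for a real-time pre-processor.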
