Objective intelligibility measurement of reverberant vocoded speech for normal-hearing listeners: Towards facilitating the development of speech enhancement algorithms for cochlear implants.

Affiliation

Department of Electrical and Computer Engineering, Duke University, Durham, North Carolina 27701, USA.

Publication information

J Acoust Soc Am. 2024 Mar 1;155(3):2151-2168. doi: 10.1121/10.0025285.

Abstract

Cochlear implant (CI) recipients often struggle to understand speech in reverberant environments. Speech enhancement algorithms could restore speech perception for CI listeners by removing reverberant artifacts from the CI stimulation pattern. Listening studies, either with cochlear-implant recipients or normal-hearing (NH) listeners using a CI acoustic model, provide a benchmark for speech intelligibility improvements conferred by the enhancement algorithm but are costly and time consuming. To reduce the associated costs during algorithm development, speech intelligibility could be estimated offline using objective intelligibility measures. Previous evaluations of objective measures that considered CIs primarily assessed the combined impact of noise and reverberation and employed highly accurate enhancement algorithms. To facilitate the development of enhancement algorithms, we evaluate twelve objective measures in reverberant-only conditions characterized by a gradual reduction of reverberant artifacts, simulating the performance of an enhancement algorithm during development. Measures are validated against the performance of NH listeners using a CI acoustic model. To enhance compatibility with reverberant CI-processed signals, measure performance was assessed after modifying the reference signal and spectral filterbank. Measures leveraging the speech-to-reverberant ratio, cepstral distance and, after modifying the reference or filterbank, envelope correlation are strong predictors of intelligibility for reverberant CI-processed speech.

Similar articles

Predicting the intelligibility of vocoded speech.
Ear Hear. 2011 May-Jun;32(3):331-8. doi: 10.1097/AUD.0b013e3181ff3515.
