
Improvement of intelligibility of ideal binary-masked noisy speech by adding background noise.

Affiliation

Department of Machine Intelligence, Peking University, Beijing 100871, China.

Publication information

J Acoust Soc Am. 2011 Apr;129(4):2227-36. doi: 10.1121/1.3559707.

Abstract

When a target-speech/masker mixture is processed with the ideal binary mask (IBM), a signal-separation technique, the intelligibility of the target speech improves markedly in both normal-hearing and hearing-impaired listeners. Speech intelligibility can also be improved by filling the gaps in speech with unmodulated broadband noise. This study investigated whether the intelligibility of target speech in an IBM-treated target-speech/masker mixture can be further improved by adding a broadband-noise background. The results show that after the IBM manipulation, which markedly released target speech from speech-spectrum-noise, foreign-speech, or native-speech masking (experiment 1), adding a broadband-noise background at a signal-to-noise ratio of no less than 4 dB significantly improved target-speech intelligibility when the masker was either noise (experiment 2) or speech (experiment 3). The results suggest that because the added noise background makes the silent regions in the time-frequency domain of the IBM-treated mixture shallower, the abrupt transient changes in the mixture are smoothed and the perceived continuity of target-speech components is enhanced, which improves target-speech intelligibility. The findings are useful for advancing computational auditory scene analysis, hearing-aid and cochlear-implant design, and the understanding of speech perception under "cocktail-party" conditions.
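As a rough illustration of the two operations the abstract describes, the sketch below builds a binary time-frequency mask from separately available target and masker signals, then adds a noise background at a chosen signal-to-noise ratio. It is not the authors' procedure. The function names, the frame and hop sizes, the 0 dB local criterion, and the use of white Gaussian noise as a stand-in for the broadband-noise background are all assumptions made for illustration.

```python
import numpy as np

def stft_power(x, frame=256, hop=128):
    """Power spectrogram (frames x bins) from Hann-windowed frames.
    Frame and hop sizes are illustrative assumptions."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def ideal_binary_mask(target, masker, lc_db=0.0, frame=256, hop=128):
    """Return 1.0 where the local target-to-masker ratio exceeds lc_db, else 0.0.
    The 0 dB default criterion is an assumption, not taken from the paper."""
    eps = 1e-12  # avoids division by zero and log of zero
    ratio = (stft_power(target, frame, hop) + eps) / (
        stft_power(masker, frame, hop) + eps
    )
    return (10 * np.log10(ratio) > lc_db).astype(float)

def add_background_noise(signal, snr_db, rng):
    """Add white Gaussian noise (a stand-in for broadband noise) scaled so
    that the signal-to-noise power ratio equals snr_db."""
    noise = rng.standard_normal(len(signal))
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = gain * noise
    return signal + scaled_noise, scaled_noise
```

In a full pipeline the mask would be applied to the mixture's time-frequency representation and the result resynthesized before the noise background is added. That step is omitted here because the abstract does not specify the analysis and synthesis method.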

