Speech intelligibility in background noise with ideal binary time-frequency masking.

Author information

Wang DeLiang, Kjems Ulrik, Pedersen Michael S, Boldt Jesper B, Lunner Thomas

Affiliation

Department of Computer Science & Engineering and Center for Cognitive Science, The Ohio State University, Columbus, Ohio 43210, USA.

Publication information

J Acoust Soc Am. 2009 Apr;125(4):2336-47. doi: 10.1121/1.3083233.

DOI:10.1121/1.3083233

PMID:19354408

Abstract

Ideal binary time-frequency masking is a signal separation technique that retains mixture energy in time-frequency units where local signal-to-noise ratio exceeds a certain threshold and rejects mixture energy in other time-frequency units. Two experiments were designed to assess the effects of ideal binary masking on speech intelligibility of both normal-hearing (NH) and hearing-impaired (HI) listeners in different kinds of background interference. The results from Experiment 1 demonstrate that ideal binary masking leads to substantial reductions in speech-reception threshold for both NH and HI listeners, and the reduction is greater in a cafeteria background than in a speech-shaped noise. Furthermore, listeners with hearing loss benefit more than listeners with normal hearing, particularly for cafeteria noise, and ideal masking nearly equalizes the speech intelligibility performances of NH and HI listeners in noisy backgrounds. The results from Experiment 2 suggest that ideal binary masking in the low-frequency range yields larger intelligibility improvements than in the high-frequency range, especially for listeners with hearing loss. The findings from the two experiments have major implications for understanding speech perception in noise, computational auditory scene analysis, speech enhancement, and hearing aid design.
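The masking rule in the first sentence of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0 dB default threshold, and the toy energy values are all assumptions made for illustration (the abstract says only "a certain threshold"), and the mixture energy is approximated as target plus noise energy, which assumes the two signals are uncorrelated. The mask is "ideal" in the sense that computing it requires the target and noise energies separately, before mixing.

```python
import math


def ideal_binary_mask(target_energy, noise_energy, threshold_db=0.0):
    """Return a 0/1 mask per time-frequency unit.

    A unit gets 1 when its local SNR in dB exceeds threshold_db,
    otherwise 0. The 0 dB default is an assumption, not a value
    taken from the abstract.
    """
    mask = []
    for t_row, n_row in zip(target_energy, noise_energy):
        row = []
        for t, n in zip(t_row, n_row):
            if t <= 0.0:
                keep = False          # no target energy: reject the unit
            elif n <= 0.0:
                keep = True           # target with no noise: retain the unit
            else:
                keep = 10.0 * math.log10(t / n) > threshold_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask


def apply_mask(mixture_energy, mask):
    """Retain mixture energy where the mask is 1 and reject it elsewhere."""
    return [
        [e * m for e, m in zip(e_row, m_row)]
        for e_row, m_row in zip(mixture_energy, mask)
    ]


# Toy 2x2 grid of time-frequency units (hypothetical values).
target = [[4.0, 0.1], [1.0, 2.0]]
noise = [[1.0, 1.0], [1.0, 0.5]]
# Assumption: uncorrelated signals, so the energies simply add.
mixture = [[t + n for t, n in zip(tr, nr)] for tr, nr in zip(target, noise)]

mask = ideal_binary_mask(target, noise)
masked = apply_mask(mixture, mask)
```

In this toy grid only the two units where the target is stronger than the noise are kept. A unit whose local SNR is exactly 0 dB is rejected because the comparison is strict ("exceeds").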
