Department of Computer Science and Engineering, The Ohio State University at Lima, Lima, Ohio 45804, USA.
J Acoust Soc Am. 2013 Mar;133(3):1707-17. doi: 10.1121/1.4789895.
Ideal binary masking is a signal processing technique that separates a desired signal from a mixture by retaining only the time-frequency units in which the signal-to-noise ratio (SNR) exceeds a predetermined threshold. In reverberant conditions, the ideal binary mask can be defined in more than one way, because the early reflections of the target may be treated either as desired signal or as noise. The ideal binary mask may therefore be parameterized by the reflection boundary, a predetermined division point between early and late reflections. A second important parameter is the local SNR threshold used to label each time-frequency unit as either target or background. Two experiments were designed to assess the impact of these two parameters on speech intelligibility with ideal binary masking for normal-hearing listeners in reverberant conditions. Experiment 1 shows that, to achieve intelligibility improvements, the binary mask should preserve only the early reflections. It also shows that the effective SNR should be taken into account when choosing the optimal range of the local threshold. Experiment 2 shows that, with long reverberation times, intelligibility improvements are obtained only when the reflection boundary is 100 ms or less. The experiment also suggests that binary masking can be used for dereverberation.
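The two parameters described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of the largest impulse-response sample as the direct-path position, and the assumption that per-unit target and noise energies are already available from some time-frequency analysis are all illustrative choices. In a reverberant setting, the target energy would presumably come from the speech convolved with the early part of the impulse response, and the noise energy from everything else.

```python
import numpy as np


def split_impulse_response(rir, fs, boundary_ms):
    """Split a room impulse response into early and late parts.

    The reflection boundary is measured from the direct-path sample, taken
    here (as an assumption) to be the sample with the largest magnitude.
    """
    rir = np.asarray(rir, dtype=float)
    direct = int(np.argmax(np.abs(rir)))
    cut = direct + int(round(boundary_ms * 1e-3 * fs))
    early = np.zeros_like(rir)
    late = np.zeros_like(rir)
    early[:cut] = rir[:cut]   # direct sound plus reflections before the boundary
    late[cut:] = rir[cut:]    # reflections at or after the boundary
    return early, late


def ideal_binary_mask(target_energy, noise_energy, threshold_db=0.0):
    """Label each time-frequency unit 1 (target) or 0 (background).

    A unit is retained when its local SNR in dB exceeds threshold_db.
    Both inputs are arrays of per-unit energies with the same shape.
    """
    eps = np.finfo(float).tiny  # avoids log of zero and division by zero
    target_energy = np.asarray(target_energy, dtype=float)
    noise_energy = np.asarray(noise_energy, dtype=float)
    snr_db = 10.0 * np.log10((target_energy + eps) / (noise_energy + eps))
    return (snr_db > threshold_db).astype(np.uint8)
```

Lowering `threshold_db` retains more units, and raising `boundary_ms` moves more reflections from the noise side to the target side; these are the two quantities whose effect on intelligibility the experiments examine.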