Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.
J Acoust Soc Am. 2012 Jul;132(1):317-26. doi: 10.1121/1.4725766.
Stone et al. [J. Acoust. Soc. Am. 130, 2874-2881 (2011)], using vocoder processing, showed that the envelope modulations of a notionally steady noise were more effective than the envelope energy as a masker of speech. Here the same effect is demonstrated using non-vocoded signals. Speech was filtered into 28 channels. A masker centered on each channel was added to the channel signal at a target-to-background ratio of -5 or -10 dB. Maskers were sinusoids or noise bands with a bandwidth of 1/3 or 1 ERB(N) (ERB(N) being the bandwidth of "normal" auditory filters), synthesized with Gaussian (GN) or low-noise (LNN) statistics. To minimize peripheral interactions between maskers, odd-numbered channels were presented to one ear and even-numbered channels to the other. Speech intelligibility was assessed in the presence of each "steady" masker and of the same masker with 100% sinusoidal amplitude modulation (SAM) at 8 Hz. Intelligibility decreased with increasing envelope fluctuation of the maskers. Masking release, the difference in intelligibility between the SAM masker and its "steady" counterpart, increased with bandwidth from near zero to around 50 percentage points for the 1-ERB(N) GN. It is concluded that the sinusoidal and GN maskers behaved primarily as energetic and modulation maskers, respectively.
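As a rough illustration of the stimulus quantities named in the abstract, the following Python sketch computes ERB(N) at a channel center frequency, applies 100% SAM at 8 Hz to a masker, and scales the masker to a given target-to-background ratio. It is not the authors' processing chain. The function names, the use of the Glasberg and Moore (1990) formula for ERB(N), and the choice to set the level on the long-term RMS after modulation are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def erb_n(fc_hz):
    """ERB_N in Hz at centre frequency fc_hz.
    Assumption: Glasberg & Moore (1990) formula, 24.7 * (4.37 * F_kHz + 1)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def sam(carrier, fs, fm=8.0, depth=1.0):
    """Sinusoidal amplitude modulation: carrier * (1 + depth * sin(2*pi*fm*t)).
    depth=1.0 corresponds to 100% modulation; fm=8.0 Hz as in the abstract."""
    t = np.arange(len(carrier)) / fs
    return carrier * (1.0 + depth * np.sin(2.0 * np.pi * fm * t))

def rms(x):
    """Root-mean-square level of a signal."""
    return np.sqrt(np.mean(np.square(x)))

def scale_to_tbr(target, masker, tbr_db):
    """Scale masker so that 20*log10(rms(target)/rms(masker)) equals tbr_db.
    Assumption: level is defined on long-term RMS (hypothetical choice)."""
    gain = rms(target) / (rms(masker) * 10.0 ** (tbr_db / 20.0))
    return masker * gain

def masking_release(pct_correct_sam, pct_correct_steady):
    """Masking release in percentage points, as defined in the abstract."""
    return pct_correct_sam - pct_correct_steady
```

For example, `scale_to_tbr(channel_speech, sam(masker, fs), -5.0)` would yield a modulated masker 5 dB above the channel speech in RMS terms, where `channel_speech` and `masker` are placeholder arrays for one channel's signal and its masker.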