Carlyon R P
MRC Applied Psychology Unit, Cambridge, England.
J Acoust Soc Am. 1996 Jan;99(1):517-24. doi: 10.1121/1.414510.
A series of experiments investigated listeners' ability to encode the fundamental frequency (F0) of a group of harmonics (the "target") in the presence of a second, spectrally overlapping group (the "masker"). Experiment 1a was a sequential F0 discrimination task between two targets, whose F0s were geometrically centered on 210 Hz, in the presence of a 210-Hz masker. The target and the masker were bandpass filtered identically, either from 20 to 1420 Hz ("low-frequency" condition) or from 3900 to 5400 Hz ("high-frequency" condition). In the low-frequency condition the masker affected performance only moderately, regardless of whether it was gated synchronously with each 200-ms target or was turned on 150 ms before and off 150 ms after it. In the high-frequency condition, the synchronous masker also had a moderate effect, but the asynchronous masker reduced performance dramatically. Whatever the masker gating, listeners did not hear the combination of the masker and target in the high-frequency region as a mixture of two complex tones, but experienced a unitary noiselike or "crackle" percept. Experiment 1b showed that the large deterioration seen in the high-frequency condition of experiment 1a could be obtained in the low-frequency condition by reducing the F0 to 62.5 Hz, suggesting that the resolvability of adjacent harmonics was important for the effect. Experiment 2 required listeners to detect a difference in F0 ("delta F0") between two simultaneous groups of components, one filtered in the high region and the other in the low region. Performance was only slightly degraded by a continuous masker filtered in the low region, but was reduced to chance by a masker in the high region. Experiment 3 showed that, as the delta F0 between a masker and a target in the low region increased from 1% to 8%, listeners identified the mixture as sounding progressively "less fused," but this was not the case in the high region.
It is concluded that listeners are poor at extracting the F0s of two groups of unresolved harmonics in the same frequency region. The experiments provide no evidence that listeners can use the leading part of an asynchronous masker to identify its F0 and thereby help extract the target's F0 from the mixture.
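The stimulus arithmetic behind these conditions can be sketched in a few lines. The following is a minimal illustration, not the author's procedure: it assumes ideal brick-wall passbands (the abstract gives only the band edges, not filter slopes), and the helper names `harmonics_in_band` and `geometric_pair`, as well as the 4% F0 difference, are hypothetical choices made for the example. It lists which harmonic numbers fall inside each passband, a rough proxy for how resolvable the components are likely to be, and builds a pair of F0s geometrically centered on 210 Hz.

```python
import math


def harmonics_in_band(f0, lo, hi):
    """Harmonic numbers n such that lo <= n * f0 <= hi.

    Assumes an ideal brick-wall passband; real filters have finite slopes.
    """
    first = max(1, math.ceil(lo / f0))
    last = math.floor(hi / f0)
    return list(range(first, last + 1))


def geometric_pair(center, delta):
    """Two F0s whose geometric mean is `center` and whose ratio is 1 + delta."""
    root = math.sqrt(1.0 + delta)
    return center / root, center * root


# Band edges taken from the abstract (Hz).
LOW_BAND = (20.0, 1420.0)
HIGH_BAND = (3900.0, 5400.0)

conditions = [
    ("low band, F0 = 210 Hz", 210.0, LOW_BAND),
    ("high band, F0 = 210 Hz", 210.0, HIGH_BAND),
    ("low band, F0 = 62.5 Hz", 62.5, LOW_BAND),
]

for label, f0, band in conditions:
    numbers = harmonics_in_band(f0, *band)
    print(f"{label}: harmonics {numbers[0]} to {numbers[-1]}")

# Illustrative 4% F0 difference; the abstract does not state the values used.
f0_lower, f0_upper = geometric_pair(210.0, 0.04)
print(f"F0 pair centered on 210 Hz: {f0_lower:.2f} Hz and {f0_upper:.2f} Hz")
```

Under these assumptions, the 210-Hz complex contributes only low-numbered harmonics to the low band and only high-numbered ones to the high band, while lowering the F0 to 62.5 Hz places many more, and higher-numbered, harmonics inside the low band.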