Middlebrooks J C, Green D M
Department of Neuroscience, University of Florida, Gainesville 32610.
Annu Rev Psychol. 1991;42:135-59. doi: 10.1146/annurev.ps.42.020191.001031.
In keeping with our promise earlier in this review, we summarize here the process by which we believe spatial cues are used for localizing a sound source in a free-field listening situation. We believe it entails two parallel processes:

1. The azimuth of the source is determined using differences in interaural time or interaural intensity, whichever is present. Wightman and colleagues (1989) believe the low-frequency temporal information is dominant if both are present.

2. The elevation of the source is determined from spectral shape cues. The received sound spectrum, as modified by the pinna, is in effect compared with a stored set of directional transfer functions. These are actually the spectra of a nearly flat source heard at various elevations. The elevation that corresponds to the best-matching transfer function is selected as the locus of the sound. Pinnae are similar enough between people that certain general rules (e.g. Blauert's boosted bands or Butler's covert peaks) can describe this process.

Head motion is probably not a critical part of the localization process, except in cases where time permits a very detailed assessment of location, in which case one tries to localize the source by turning the head toward the putative location. Sound localization is only moderately more precise when the listener points directly toward the source. The process is not analogous to localizing a visual source on the fovea of the retina. Thus, head motion provides only a moderate increase in localization accuracy.

Finally, current evidence does not support the view that auditory motion perception is anything more than detection of changes in static location over time.
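The elevation process described above, in which the received spectrum is compared with stored directional transfer functions and the best match is selected, can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and does not come from the review: the five-band spectra in `TEMPLATES` are invented toy values, the names `correlation` and `estimate_elevation` are hypothetical, and Pearson correlation is only one plausible way to score a "best match" (the review does not specify a matching metric).

```python
import math


def correlation(a, b):
    """Pearson correlation between two equal-length spectra (lists of dB values)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a perfectly flat spectrum carries no shape information
    return cov / math.sqrt(var_a * var_b)


def estimate_elevation(received_db, templates_db):
    """Return the elevation (degrees) whose stored template best matches the input."""
    return max(
        templates_db,
        key=lambda elevation: correlation(received_db, templates_db[elevation]),
    )


# Hypothetical toy templates: gain in dB in five frequency bands for a
# nearly flat source at three elevations. These numbers are invented.
TEMPLATES = {
    -30: [0.0, 2.0, -4.0, 1.0, 0.0],
    0: [0.0, 3.0, 1.0, -6.0, 2.0],
    30: [1.0, 0.0, 4.0, 2.0, -8.0],
}

# A received spectrum shaped like the +30 degree template, but 5 dB louder
# overall and slightly perturbed in one band.
received = [gain + 5.0 for gain in TEMPLATES[30]]
received[1] += 0.5

print(estimate_elevation(received, TEMPLATES))  # prints 30
```

Correlation is used here because it ignores overall level, so the match depends on spectral shape rather than loudness, which is in the spirit of a "spectral shape cue". Whether listeners use anything like this measure is not something the sketch can speak to.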