Theunissen F E, Doupe A J
Sloan Center for Theoretical Neuroscience and Keck Center for Integrative Neuroscience, Department of Physiology and Psychiatry, University of California, San Francisco, San Francisco, California 94143-0444, USA.
J Neurosci. 1998 May 15;18(10):3786-802. doi: 10.1523/JNEUROSCI.18-10-03786.1998.
Complex vocalizations, such as human speech and birdsong, are characterized by their elaborate spectral and temporal structure. Because auditory neurons of the zebra finch forebrain nucleus HVc respond extremely selectively to a particular complex sound, the bird's own song (BOS), we analyzed the spectral and temporal requirements of these neurons by measuring their responses to systematically degraded versions of the BOS. These synthetic songs were based exclusively on the set of amplitude envelopes obtained from a decomposition of the original sound into frequency bands and preserved the acoustical structure present in the original song with varying degrees of spectral versus temporal resolution, which depended on the width of the frequency bands. Although excessive temporal or spectral degradation eliminated responses, HVc neurons responded well to degraded synthetic songs with time-frequency resolutions of approximately 5 msec or 200 Hz. By comparing this neuronal time-frequency tuning with the time-frequency scales that best represented the acoustical structure in zebra finch song, we concluded that HVc neurons are more sensitive to temporal than to spectral cues. Furthermore, neuronal responses to synthetic songs were indistinguishable from those to the original BOS only when the amplitude envelopes of these songs were represented with 98% accuracy. That level of precision was equivalent to preserving the relative time-varying phase across frequency bands with resolutions finer than 2 msec. Spectral and temporal information are well known to be extracted by the peripheral auditory system, but this study demonstrates how precisely these cues must be preserved for the full response of high-level auditory neurons sensitive to learned vocalizations.