Center for Neural Science, New York University, New York, New York 10003, USA.
J Neurophysiol. 2013 Sep;110(5):1190-204. doi: 10.1152/jn.00645.2012. Epub 2013 Jun 12.
Animal communication sounds contain spectrotemporal fluctuations that provide powerful cues for detection and discrimination. Human perception of speech is influenced by both spectral and temporal acoustic features but depends most critically on envelope information. To investigate the neural coding principles underlying the perception of communication sounds, we explored how disrupting the spectral or temporal content of five different gerbil call types affects neural responses in the primary auditory cortex (AI) of awake gerbils. The vocalizations were impoverished spectrally by reduction to 4 or 16 channels of band-passed noise. For this acoustic manipulation, a neuron's average firing rate did not carry sufficient information to distinguish between call types. In contrast, the discharge patterns of individual AI neurons reliably categorized vocalizations composed of only four spectral bands with the appropriate natural token. The pooled responses of small populations of AI cells classified spectrally disrupted and natural calls with an accuracy that paralleled human performance on an analogous speech task. To assess whether the discharge pattern was robust to temporal perturbations of an individual call, vocalizations were disrupted by time-reversing segments of variable duration. For this acoustic manipulation, cortical neurons were relatively insensitive to short reversal lengths. Consistent with human perception of speech, these results indicate that the stable representation of communication sounds in AI depends more on sensitivity to slow temporal envelopes than on spectral detail.
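The temporal manipulation described in the abstract, time-reversing segments of variable duration, can be illustrated with a minimal sketch. The abstract does not give the segment durations, sampling rate, or any smoothing applied at segment boundaries, so the function name `reverse_segments`, its parameters, and the hard segment edges below are illustrative assumptions rather than the authors' procedure.

```python
def reverse_segments(samples, sample_rate_hz, segment_ms):
    """Return a copy of `samples` in which each consecutive segment of
    `segment_ms` milliseconds is played backwards.

    Segment order is preserved; only the samples inside each segment are
    reversed. A final segment shorter than the others is reversed as-is.
    Hypothetical helper: boundaries are hard cuts with no ramping.
    """
    # Segment length in samples, at least one sample long.
    seg_len = max(1, round(sample_rate_hz * segment_ms / 1000))
    out = []
    for start in range(0, len(samples), seg_len):
        segment = samples[start:start + seg_len]
        out.extend(reversed(segment))
    return out


# Toy example: 7 samples at 1 kHz, reversed in 3 ms (3-sample) segments.
toy = [1, 2, 3, 4, 5, 6, 7]
locally_reversed = reverse_segments(toy, sample_rate_hz=1000, segment_ms=3)
```

With very short segments the output approaches the original waveform, and with a segment as long as the call the whole call is reversed, which is why reversal length serves as a graded measure of temporal disruption.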