Center for Neural Science, New York University, New York, New York 10003
J Neurosci. 2023 Jan 4;43(1):93-112. doi: 10.1523/JNEUROSCI.1616-21.2022. Epub 2022 Nov 15.
Animal communication sounds exhibit complex temporal structure because of the amplitude fluctuations that comprise the sound envelope. In human speech, envelope modulations drive synchronized activity in auditory cortex (AC), which correlates strongly with comprehension (Giraud and Poeppel, 2012; Peelle and Davis, 2012; Haegens and Zion Golumbic, 2018). Studies of envelope coding in single neurons, performed in nonhuman animals, have focused on periodic amplitude modulation (AM) stimuli and use response metrics that are not easy to juxtapose with data from humans. In this study, we sought to bridge these fields. Specifically, we looked directly at the temporal relationship between stimulus envelope and spiking, and we assessed whether the apparent diversity across neurons' AM responses contributes to the population representation of speech-like sound envelopes. We gathered responses from single neurons to vocoded speech stimuli and compared them to sinusoidal AM responses in AC of alert, freely moving Mongolian gerbils of both sexes. While AC neurons displayed heterogeneous tuning to AM rate, their temporal dynamics were stereotyped. Preferred response phases accumulated near the onsets of sinusoidal AM periods for slower rates (<8 Hz), and an overrepresentation of amplitude edges was apparent in population responses to both sinusoidal AM and vocoded speech envelopes. Crucially, this encoding bias imparted a decoding benefit: a classifier could discriminate vocoded speech stimuli using summed population activity, while higher-frequency modulations required a more sophisticated decoder that tracked spiking responses from individual cells. Together, our results imply that the envelope structure relevant to parsing an acoustic stream could be read out from a distributed, redundant population code.

Animal communication sounds have rich temporal structure and are often produced in extended sequences, including the syllabic structure of human speech. Although the auditory cortex (AC) is known to play a crucial role in representing speech syllables, the contribution of individual neurons remains uncertain. Here, we characterized the representations of both simple, amplitude-modulated sounds and complex, speech-like stimuli within a broad population of cortical neurons, and we found an overrepresentation of amplitude edges. Thus, a phasic, redundant code in auditory cortex can provide a mechanistic explanation for segmenting acoustic streams like human speech.
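
To make the notion of a "preferred response phase" concrete, the following is a minimal illustrative sketch, not the analysis used in the study. It assumes that spike times are measured in seconds from the start of a sinusoidal AM stimulus, that phase 0 marks the start of each modulation period, and that preferred phase is taken as the circular mean of spike phases (with the resultant length, often called vector strength, as a measure of phase locking). The function name `preferred_phase` and the synthetic spike train are hypothetical.

```python
import cmath
import math
import random


def preferred_phase(spike_times, am_rate_hz):
    """Circular-mean phase (radians, in [0, 2*pi)) and resultant length (0..1).

    Assumption: phase 0 corresponds to the start of each AM period, and
    spike_times are in seconds relative to stimulus onset.
    """
    spike_times = list(spike_times)
    if not spike_times:
        # No spikes: phase is undefined, phase locking is zero.
        return float("nan"), 0.0
    # Map each spike onto the unit circle at its phase within the AM cycle.
    unit_vectors = (
        cmath.exp(1j * 2.0 * math.pi * am_rate_hz * t) for t in spike_times
    )
    resultant = sum(unit_vectors) / len(spike_times)
    phase = cmath.phase(resultant) % (2.0 * math.pi)
    return phase, abs(resultant)


# Synthetic (made-up) spike train: one spike per cycle of a 4 Hz modulation,
# landing about 10% of the way into each period with 10 ms Gaussian jitter.
random.seed(0)
am_rate_hz = 4.0
period = 1.0 / am_rate_hz
n_cycles = 200
spike_times = [
    (k + 0.1) * period + random.gauss(0.0, 0.01) for k in range(n_cycles)
]

phase, strength = preferred_phase(spike_times, am_rate_hz)
print(f"preferred phase: {phase:.2f} rad, resultant length: {strength:.2f}")
```

In this toy case the recovered phase sits close to the start of the period, which is the kind of clustering the abstract describes for slower AM rates. Real recordings would also need choices about response windows and significance testing that are not specified here.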