Department of Electrical and Computer Engineering, The Center for Language and Speech Processing, Johns Hopkins University, Baltimore, Maryland, United States of America.
PLoS Comput Biol. 2013;9(3):e1002982. doi: 10.1371/journal.pcbi.1002982. Epub 2013 Mar 28.
The processing characteristics of neurons in the central auditory system are directly shaped by, and reflect, the statistics of natural acoustic environments, but the principles that govern the relationship between natural sound ensembles and the responses observed in neurophysiological studies remain unclear. In particular, accumulating evidence suggests the presence of a code based on sustained neural firing rates, where central auditory neurons exhibit strong, persistent responses to their preferred stimuli. Such a strategy can indicate the presence of ongoing sounds, is involved in parsing complex auditory scenes, and may play a role in matching neural dynamics to varying time scales in acoustic signals. In this paper, we describe a computational framework for exploring the influence of a code based on sustained firing rates on the shape of the spectro-temporal receptive field (STRF), a linear kernel that maps a spectro-temporal acoustic stimulus to the instantaneous firing rate of a central auditory neuron. We demonstrate the emergence of richly structured STRFs that capture the structure of natural sounds over a wide range of timescales, and show how the emergent ensembles resemble those commonly reported in physiological studies. Furthermore, we compare ensembles that optimize a sustained firing code with ensembles that optimize a sparse code, another widely considered coding strategy, and suggest that the resulting population responses are not mutually exclusive. Finally, we demonstrate how the emergent ensembles contour the high-energy spectro-temporal modulations of natural sounds, forming a discriminative representation that captures the full range of modulation statistics that characterize natural sound ensembles. These findings have direct implications for our understanding of how sensory systems encode the informative components of natural stimuli and potentially facilitate multi-sensory integration.
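To make the STRF definition above concrete, the following is a minimal sketch of a linear kernel applied to a spectrogram, assuming the common discrete form r(t) = sum over f and tau of h(f, tau) * s(f, t - tau). The function name `strf_response`, the array layout (frequency by time), and the zero-padding of the stimulus history are illustrative assumptions, not details taken from the paper, and the sketch does not implement the sustained-firing optimization itself.

```python
import numpy as np

def strf_response(spectrogram, strf):
    """Illustrative linear STRF response (hypothetical helper, not the authors' code).

    spectrogram: array of shape (n_freq, n_time), the stimulus s(f, t).
    strf:        array of shape (n_freq, n_lags), the kernel h(f, tau),
                 where column 0 is the zero-lag (most recent) tap.
    Returns an array of shape (n_time,) with
        r(t) = sum_f sum_tau h(f, tau) * s(f, t - tau).
    """
    spectrogram = np.asarray(spectrogram, dtype=float)
    strf = np.asarray(strf, dtype=float)
    n_freq, n_time = spectrogram.shape
    n_freq_k, n_lags = strf.shape
    if n_freq != n_freq_k:
        raise ValueError("spectrogram and STRF must share the frequency axis")

    # Assumption: the stimulus is silent (zero) before t = 0, so the filter is causal.
    padded = np.concatenate([np.zeros((n_freq, n_lags - 1)), spectrogram], axis=1)

    rate = np.zeros(n_time)
    for lag in range(n_lags):
        # Columns of `padded` holding s(f, t - lag) for t = 0 .. n_time - 1.
        start = n_lags - 1 - lag
        delayed = padded[:, start:start + n_time]
        # Sum over frequency for this lag and accumulate.
        rate += strf[:, lag] @ delayed
    return rate
```

Under these assumptions, a single impulse in one frequency channel yields a response that traces out that channel's row of the kernel, one lag per time step. This is a quick sanity check that the lag indexing is causal.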