Larsen Erik, Cedolin Leonardo, Delgutte Bertrand
Eaton-Peabody Laboratory, Massachusetts Eye and Ear Infirmary, Boston, MA, USA.
J Neurophysiol. 2008 Sep;100(3):1301-19. doi: 10.1152/jn.01361.2007. Epub 2008 Jul 16.
Pitch differences between concurrent sounds are important cues used in auditory scene analysis and also play a major role in music perception. To investigate the neural codes underlying these perceptual abilities, we recorded from single fibers in the cat auditory nerve in response to two concurrent harmonic complex tones with missing fundamentals and equal-amplitude harmonics. We investigated the efficacy of rate-place and interspike-interval codes to represent both pitches of the two tones, which had fundamental frequency (F0) ratios of 15/14 or 11/9. We relied on the principle of scaling invariance in cochlear mechanics to infer the spatiotemporal response patterns to a given stimulus from a series of measurements made in a single fiber as a function of F0. Templates created by a peripheral auditory model were used to estimate the F0s of double complex tones from the inferred distribution of firing rate along the tonotopic axis. This rate-place representation was accurate for F0s ≳ 900 Hz. Surprisingly, rate-based F0 estimates were accurate even when the two-tone mixture contained no resolved harmonics, so long as some harmonics were resolved prior to mixing. We also extended methods used previously for single complex tones to estimate the F0s of concurrent complex tones from interspike-interval distributions pooled over the tonotopic axis. The interval-based representation was accurate for F0s ≲ 900 Hz, where the two-tone mixture contained no resolved harmonics. Together, the rate-place and interval-based representations allow accurate pitch perception for concurrent sounds over the entire range of human voice and cat vocalizations.
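As an illustration of the stimulus class described in the abstract, the following minimal sketch synthesizes two concurrent harmonic complex tones with missing fundamentals and equal-amplitude harmonics at an F0 ratio of 15/14. Only the F0 ratios (15/14 and 11/9), the missing fundamentals, and the equal amplitudes come from the abstract. The lower F0 of 840 Hz, the harmonic range 2 to 12, sine starting phase, the 50 ms duration, the 48 kHz sampling rate, and the function names are all assumptions chosen for illustration, not the parameters used in the study.

```python
import math


def harmonic_complex(f0, harmonics, duration_s, sample_rate):
    """Sum equal-amplitude sine-phase harmonics of f0 (phase is an assumption)."""
    n_samples = int(round(duration_s * sample_rate))
    return [
        sum(math.sin(2 * math.pi * h * f0 * t / sample_rate) for h in harmonics)
        for t in range(n_samples)
    ]


def double_complex(f0_low, ratio_num, ratio_den, harmonics,
                   duration_s=0.05, sample_rate=48000):
    """Mix two complex tones whose F0s stand in the ratio ratio_num/ratio_den."""
    f0_high = f0_low * ratio_num / ratio_den
    low = harmonic_complex(f0_low, harmonics, duration_s, sample_rate)
    high = harmonic_complex(f0_high, harmonics, duration_s, sample_rate)
    return f0_high, [a + b for a, b in zip(low, high)]


# Assumption: harmonics 2..12, so the fundamental (harmonic 1) is omitted.
HARMONICS = range(2, 13)

# Assumption: lower F0 of 840 Hz; the 15/14 ratio is taken from the abstract.
f0_high, mixture = double_complex(840.0, 15, 14, HARMONICS)
```

Passing `11, 9` instead of `15, 14` gives the wider of the two F0 separations mentioned in the abstract.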