Modulation spectra of natural sounds and ethological theories of auditory processing.

Authors

Singh Nandini C, Theunissen Frédéric E

Affiliation

Department of Psychology and Neuroscience Institute, University of California, Berkeley, 3210 Tolman Hall, Berkeley, California 94720-1650, USA.

Publication

J Acoust Soc Am. 2003 Dec;114(6 Pt 1):3394-411. doi: 10.1121/1.1624067.

Abstract

The modulation statistics of natural sound ensembles were analyzed by calculating the probability distributions of the amplitude envelope of the sounds and their time-frequency correlations given by the modulation spectra. These modulation spectra were obtained by calculating the two-dimensional Fourier transform of the autocorrelation matrix of the sound stimulus in its spectrographic representation. Since temporal bandwidth and spectral bandwidth are conjugate variables, it is shown that the joint modulation spectrum of sound occupies a restricted space: sounds cannot have rapid temporal and spectral modulations simultaneously. Within this restricted space, it is shown that natural sounds have a characteristic signature. Natural sounds, in general, are low-passed, showing most of their modulation energy for low temporal and spectral modulations. Animal vocalizations and human speech are further characterized by the fact that most of the spectral modulation power is found only for low temporal modulation. Similarly, the distribution of the amplitude envelopes also exhibits characteristic shapes for natural sounds, reflecting the high probability of epochs with no sound, systematic differences across frequencies, and a relatively uniform distribution for the log of the amplitudes for vocalizations. It is postulated that the auditory system as well as engineering applications may exploit these statistical properties to obtain an efficient representation of behaviorally relevant sounds. To test such a hypothesis we show how to create synthetic sounds with first and second order envelope statistics identical to those found in natural sounds.
