Dept. of Biomedical Sciences, City University of Hong Kong, Hong Kong, Hong Kong.
Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom.
PLoS One. 2021 Jun 23;16(6):e0238960. doi: 10.1371/journal.pone.0238960. eCollection 2021.
Sounds such as "running water" and "buzzing bees" belong to a class of sounds that arise as the collective result of many similar acoustic events, known as "sound textures". A recent psychoacoustic study reported that natural-sounding textures can be synthesized from white noise by imposing statistical features, such as marginals and correlations, computed from the outputs of cochlear models responding to the textures. These outputs are the envelopes of bandpass filter responses, the "cochlear envelopes". This suggests that the perceptual qualities of many natural sounds derive directly from such statistical features, and raises the question of how these features are distributed in the acoustic environment. To address this question, we collected a corpus of 200 sound textures from public online sources and analyzed the distributions of the textures' marginal statistics (mean, variance, skew, and kurtosis), cross-frequency correlations, and modulation power statistics. A principal component analysis of these parameters revealed a great deal of redundancy among them. For example, just two marginal principal components, which can be thought of as measuring the sparseness or burstiness of a texture, capture as much as 64% of the variance of the 128-dimensional marginal parameter space, while the first two principal components of the cochlear correlations capture as much as 88% of the variance in the 496 correlation parameters. Knowledge of the statistical distributions documented here may help guide the choice of acoustic stimuli with high ecological validity in future research.
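The analysis summarized above can be illustrated with a minimal sketch. It computes per-band marginal statistics and pairwise cross-band correlations from envelope arrays, then estimates how much variance each principal component explains across a set of textures. Several things here are assumptions rather than the authors' method: the function names are hypothetical, random noise stands in for real cochlear envelopes, plain (unnormalized) moments are used, and features are standardized before the PCA. The band count of 32 is inferred from the parameter counts in the abstract (4 × 32 = 128 marginals, 32 × 31 / 2 = 496 correlations) and is not stated there.

```python
import numpy as np


def marginal_stats(env):
    """Per-band mean, variance, skewness and kurtosis.

    env: array of shape (n_bands, n_samples) holding envelope values.
    Returns a flat vector of length 4 * n_bands.
    """
    mu = env.mean(axis=1)
    centered = env - mu[:, None]
    var = (centered ** 2).mean(axis=1)
    skew = (centered ** 3).mean(axis=1) / var ** 1.5
    kurt = (centered ** 4).mean(axis=1) / var ** 2
    return np.concatenate([mu, var, skew, kurt])


def cross_band_correlations(env):
    """Pearson correlations between all distinct pairs of bands.

    Returns the upper triangle (diagonal excluded) of the correlation
    matrix, i.e. n_bands * (n_bands - 1) / 2 values.
    """
    corr = np.corrcoef(env)
    rows, cols = np.triu_indices(env.shape[0], k=1)
    return corr[rows, cols]


def explained_variance_ratio(features):
    """Fraction of variance captured by each principal component.

    features: array of shape (n_textures, n_params).
    Assumption: each parameter is standardized before the PCA.
    """
    z = features - features.mean(axis=0)
    scale = z.std(axis=0)
    scale[scale == 0] = 1.0  # leave constant columns unscaled
    z = z / scale
    singular_values = np.linalg.svd(z, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


if __name__ == "__main__":
    # Synthetic stand-in data: rectified Gaussian noise, NOT real textures.
    rng = np.random.default_rng(0)
    n_textures, n_bands, n_samples = 20, 32, 2000
    envelopes = [np.abs(rng.standard_normal((n_bands, n_samples)))
                 for _ in range(n_textures)]

    marginals = np.stack([marginal_stats(e) for e in envelopes])
    correlations = np.stack([cross_band_correlations(e) for e in envelopes])

    print("marginal parameters per texture:", marginals.shape[1])
    print("correlation parameters per texture:", correlations.shape[1])
    print("variance explained by first two marginal PCs:",
          explained_variance_ratio(marginals)[:2].sum())
```

With real recordings, the synthetic arrays would be replaced by envelopes extracted from a cochlear filter bank. The explained-variance figures from random noise carry no meaning and are not comparable to the 64% and 88% reported in the abstract.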