Maruyama Hironori, Okada Kosuke, Motoyoshi Isamu
Department of Life Sciences, The University of Tokyo, Japan.
Iperception. 2023 Feb 22;14(1):20416695231157349. doi: 10.1177/20416695231157349. eCollection 2023 Jan-Feb.
The natural environment is filled with a variety of auditory events such as wind blowing, water flowing, and fire crackling. It has been suggested that the perception of such textural sounds is based on the statistics of the natural auditory events. Inspired by a recent spectral model for visual texture perception, we propose a model that describes the perceived sound texture using only the linear spectrum and the energy spectrum. We tested the validity of the model by using synthetic noise sounds that preserve the two-stage amplitude spectra of the original sound. A psychophysical experiment showed that our synthetic noises were perceived as similar to the original sounds for 120 real-world auditory events. The performance was comparable to that of synthetic sounds produced by the McDermott-Simoncelli model, which considers various classes of auditory statistics. The results support the notion that the perception of natural sound textures is predictable from the two-stage spectral signals.
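As a rough illustration of what "noise that preserves an amplitude spectrum" means, the following minimal sketch randomizes the Fourier phases of a signal while keeping its amplitude spectrum fixed. It covers only the first (linear) spectrum; the abstract does not say how the energy spectrum is computed or how both stages are matched jointly, so this should not be read as the authors' synthesis procedure. The function name `phase_randomized_noise` and the `seed` parameter are hypothetical, introduced here for illustration.

```python
import numpy as np


def phase_randomized_noise(signal, seed=0):
    """Return noise with the same amplitude spectrum as `signal`.

    Illustrative assumption: only the linear (first-stage) amplitude
    spectrum is preserved; the energy (second-stage) spectrum is not.
    """
    signal = np.asarray(signal, dtype=float)
    rng = np.random.default_rng(seed)

    # Amplitude spectrum of the real-valued input.
    amplitude = np.abs(np.fft.rfft(signal))

    # Draw a new phase for every frequency bin.
    phase = rng.uniform(0.0, 2.0 * np.pi, size=amplitude.shape)

    # The DC bin (and the Nyquist bin for even lengths) must stay real,
    # otherwise the inverse transform would alter their amplitudes.
    phase[0] = 0.0
    if signal.size % 2 == 0:
        phase[-1] = 0.0

    # Recombine amplitude and random phase, then go back to the time domain.
    return np.fft.irfft(amplitude * np.exp(1j * phase), n=signal.size)


# Usage: a stand-in "recording" (white noise here) and its phase-scrambled version.
original = np.random.default_rng(0).standard_normal(1024)
synthetic = phase_randomized_noise(original, seed=1)
```

The synthetic waveform differs from the original sample by sample, yet `np.abs(np.fft.rfft(synthetic))` matches `np.abs(np.fft.rfft(original))` up to floating-point error.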