Bordonné Thomas, Kronland-Martinet Richard, Ystad Sølvi, Derrien Olivier, Aramaki Mitsuko
Aix Marseille Univ., CNRS, PRISM (Perception, Representations, Image, Sound, Music), 31 Chemin J. Aiguier, CS 70071, 13402 Marseille Cedex 20, France.
J Acoust Soc Am. 2020 May;147(5):3306. doi: 10.1121/10.0001224.
Understanding how sounds are perceived and interpreted is an important challenge for researchers in auditory perception. The ecological approach to perception suggests that the salient perceptual information that enables a listener to recognize events through sounds is contained in specific structures called invariants. Identifying such invariants is of fundamental interest for understanding auditory perception, and it is also useful for incorporating perceptual considerations into the modeling and control of sounds. Among the different approaches used to identify perceptually relevant sound structures, vocal imitations are believed to bring a fresh perspective to the field. The main goal of this paper is to better understand how invariants are transmitted through vocal imitations. A sound corpus containing different types of known invariants was generated with an existing synthesizer. Participants took part in a test in which they were asked to imitate the sounds of the corpus. A continuous and sparse model adapted to the specificities of vocal imitations was then developed and used to analyze the imitations. Results show that participants were able to highlight salient elements of the sounds that partially correspond to the invariants used in the sound corpus. This study also confirms that vocal imitations reveal how these invariants are transmitted through perception, and it offers promising perspectives for auditory investigations.