Anikin Andrey, Persson Tomas
Division of Cognitive Science, Department of Philosophy, Lund University, Box 192, SE-221 00, Lund, Sweden.
Behav Res Methods. 2017 Apr;49(2):758-771. doi: 10.3758/s13428-016-0736-y.
This study introduces a corpus of 260 naturalistic human nonlinguistic vocalizations representing nine emotions: amusement, anger, disgust, effort, fear, joy, pain, pleasure, and sadness. Recognition accuracy in a rating task varied greatly by emotion, from <40% for joy and pain to >70% for amusement, pleasure, fear, and sadness. In contrast, the raters' linguistic-cultural group had no effect on recognition accuracy: the predominantly English-language corpus was classified with similar accuracy by participants from Brazil, Russia, Sweden, and the UK/USA. Supervised random forest models classified the sounds as accurately as the human raters. The best acoustic predictors of emotion were pitch, harmonicity, and the spacing and regularity of syllables. This corpus of ecologically valid emotional vocalizations can be filtered to include only sounds with high recognition rates, in order to study reactions to emotional stimuli of known perceptual types (reception side), or can be used in its entirety to study the association between affective states and vocal expressions (production side).
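The filtering step mentioned in the abstract (keeping only sounds with high recognition rates) could be sketched roughly as follows. This is a minimal illustration and not the authors' procedure: the record layout, the sound identifiers, the ratings, and the 0.6 threshold are all hypothetical and are not taken from the corpus.

```python
from collections import defaultdict

# Hypothetical rating records: (sound_id, intended_emotion, emotion_chosen_by_rater).
# These values are made up for illustration and are not taken from the corpus.
ratings = [
    ("s01", "amusement", "amusement"),
    ("s01", "amusement", "amusement"),
    ("s01", "amusement", "joy"),
    ("s02", "pain", "fear"),
    ("s02", "pain", "pain"),
    ("s02", "pain", "effort"),
    ("s03", "sadness", "sadness"),
    ("s03", "sadness", "sadness"),
]


def recognition_rates(records):
    """Return {sound_id: proportion of ratings matching the intended emotion}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for sound_id, intended, chosen in records:
        totals[sound_id] += 1
        hits[sound_id] += int(chosen == intended)
    return {sid: hits[sid] / totals[sid] for sid in totals}


def filter_by_rate(rates, threshold):
    """Keep only sounds whose recognition rate is at least `threshold`."""
    return sorted(sid for sid, rate in rates.items() if rate >= threshold)


rates = recognition_rates(ratings)
# The 0.6 cutoff is an arbitrary, assumed value chosen for this example.
well_recognized = filter_by_rate(rates, threshold=0.6)
print(well_recognized)
```

A reception-side study would work with the filtered subset, whereas a production-side study would keep every sound regardless of its rate.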