Department of Cognition, Emotion, and Methods in Psychology, Faculty of Psychology, University of Vienna, Liebiggasse 5, 1010, Vienna, Austria.
Konrad Lorenz Institute of Ethology, University of Veterinary Medicine, Vienna, Austria.
Behav Res Methods. 2024 Apr;56(4):2923-2940. doi: 10.3758/s13428-023-02264-5. Epub 2023 Nov 10.
Social perception relies on different sensory channels, including vision and audition, which are especially important for judgements of appearance. To understand multimodal integration in person perception, it is therefore important to study faces and voices in synchronized form. We introduce the Vienna Talking Faces (ViTaFa) database, a high-quality audiovisual database focused on multimodal research on social perception. ViTaFa includes different stimulus modalities: audiovisual dynamic, visual dynamic, visual static, and auditory dynamic. Stimuli were collected from 40 real individuals and recorded and edited under highly standardized conditions; the sample matches typical student samples in psychological research (young adults aged 18 to 45). Stimuli comprise sequences of various types of spoken content from each person: German sentences, words, reading passages, vowels, and language-unrelated pseudo-words. Recordings were made with different emotional expressions (neutral, happy, angry, sad, and flirtatious). ViTaFa is freely accessible for academic non-profit research after signing a confidentiality agreement via https://osf.io/9jtzx/, and it stands out from other databases through its multimodal format, high quality, and comprehensive quantification of stimulus features and human judgements related to attractiveness. Additionally, more than 200 human raters validated the emotional expression of the stimuli. In summary, ViTaFa provides a valuable resource for investigating audiovisual signals of social perception.