

A Cantonese Audio-Visual Emotional Speech (CAVES) dataset.

Affiliation

The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Locked Bag 1797, Penrith, NSW, 2751, Australia.

Publication information

Behav Res Methods. 2024 Aug;56(5):5264-5278. doi: 10.3758/s13428-023-02270-7. Epub 2023 Nov 28.

Abstract

We present a Cantonese emotional speech dataset that is suitable for use in research investigating the auditory and visual expression of emotion in tonal languages. This unique dataset consists of auditory and visual recordings of ten native speakers of Cantonese uttering 50 sentences each in the six basic emotions (angry, happy, sad, surprise, fear, and disgust) plus neutral. The visual recordings have a full HD resolution of 1920 × 1080 pixels and were recorded at 50 fps. The important features of the dataset are outlined along with the factors considered when compiling the dataset. A validation study of the recorded emotion expressions was conducted in which 15 native Cantonese perceivers completed a forced-choice emotion identification task. The variability of the speakers and the sentences was examined by testing the degree of concordance between the intended and the perceived emotion. We compared these results with those of other emotion perception and evaluation studies that have tested spoken emotions in languages other than Cantonese. The dataset is freely available for research purposes.
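As a rough illustration of what "concordance between the intended and the perceived emotion" can mean in a forced-choice identification task, the following minimal Python sketch computes, for each intended emotion, the proportion of responses that matched it. The trial format, the function names, and the example responses are all hypothetical and made up for illustration; they are not taken from the CAVES dataset or from the authors' analysis, which may use a different measure.

```python
from collections import Counter, defaultdict

# Emotion labels as listed in the abstract (six basic emotions plus neutral).
EMOTIONS = ["angry", "happy", "sad", "surprise", "fear", "disgust", "neutral"]


def confusion_counts(trials):
    """Count how often each intended emotion was given each perceived label.

    `trials` is an iterable of (intended, perceived) pairs; this format is
    an assumption for the sketch, not the dataset's actual response format.
    """
    counts = defaultdict(Counter)
    for intended, perceived in trials:
        counts[intended][perceived] += 1
    return counts


def concordance_by_emotion(trials):
    """Return the share of responses matching the intended emotion.

    Emotions with no trials are left out rather than reported as zero.
    """
    counts = confusion_counts(trials)
    return {
        emotion: counts[emotion][emotion] / sum(counts[emotion].values())
        for emotion in EMOTIONS
        if counts[emotion]
    }


# Made-up responses, purely illustrative (not real validation data).
example_trials = [
    ("angry", "angry"), ("angry", "disgust"),
    ("happy", "happy"), ("happy", "happy"),
    ("sad", "neutral"), ("sad", "sad"), ("sad", "sad"), ("sad", "sad"),
]

scores = concordance_by_emotion(example_trials)
for emotion, score in scores.items():
    print(f"{emotion}: {score:.2f}")
```

Grouping the same trials by speaker or by sentence instead of by emotion would give the kind of per-speaker and per-sentence variability the abstract mentions, under the same assumptions.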


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2593/11289252/535d0145992b/13428_2023_2270_Fig1_HTML.jpg
