
A Cantonese Audio-Visual Emotional Speech (CAVES) dataset.

Affiliation

The MARCS Institute for Brain, Behaviour and Development, Western Sydney University, Locked Bag 1797, Penrith, NSW, 2751, Australia.

Publication information

Behav Res Methods. 2024 Aug;56(5):5264-5278. doi: 10.3758/s13428-023-02270-7. Epub 2023 Nov 28.

DOI:10.3758/s13428-023-02270-7
PMID:38017201
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11289252/
Abstract

We present a Cantonese emotional speech dataset that is suitable for use in research investigating the auditory and visual expression of emotion in tonal languages. This unique dataset consists of auditory and visual recordings of ten native speakers of Cantonese uttering 50 sentences each in the six basic emotions plus neutral (angry, happy, sad, surprise, fear, and disgust). The visual recordings have a full HD resolution of 1920 × 1080 pixels and were recorded at 50 fps. The important features of the dataset are outlined along with the factors considered when compiling the dataset. A validation study of the recorded emotion expressions was conducted in which 15 native Cantonese perceivers completed a forced-choice emotion identification task. The variability of the speakers and the sentences was examined by testing the degree of concordance between the intended and the perceived emotion. We compared these results with those of other emotion perception and evaluation studies that have tested spoken emotions in languages other than Cantonese. The dataset is freely available for research purposes.
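The validation design described in the abstract (a forced-choice identification task scored by the concordance between intended and perceived emotion) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the trial format of `(intended, perceived)` pairs, and the toy responses are hypothetical and are not taken from the paper or the dataset. The item count simply multiplies the figures quoted in the abstract, assuming every speaker produced every sentence in every emotion category.

```python
from collections import Counter, defaultdict

# Emotion categories named in the abstract: six basic emotions plus neutral.
EMOTIONS = ["angry", "happy", "sad", "surprise", "fear", "disgust", "neutral"]


def expected_item_count(n_speakers=10, n_sentences=50, n_emotions=len(EMOTIONS)):
    """Items in a fully crossed design (an assumption, not a reported total)."""
    return n_speakers * n_sentences * n_emotions


def concordance_by_emotion(trials):
    """Proportion of trials where the perceived emotion matches the intended one.

    `trials` is an iterable of (intended, perceived) label pairs
    (a hypothetical format chosen for this sketch).
    """
    hits, totals = Counter(), Counter()
    for intended, perceived in trials:
        totals[intended] += 1
        if intended == perceived:
            hits[intended] += 1
    return {emotion: hits[emotion] / totals[emotion] for emotion in totals}


def confusion_matrix(trials):
    """Count responses per (intended, perceived) pair to show which emotions are confused."""
    matrix = defaultdict(Counter)
    for intended, perceived in trials:
        matrix[intended][perceived] += 1
    return matrix


# Made-up responses, for demonstration only; these are not data from the study.
toy_trials = [
    ("angry", "angry"), ("angry", "disgust"),
    ("happy", "happy"), ("happy", "happy"),
    ("fear", "surprise"), ("fear", "fear"),
    ("neutral", "neutral"),
]

if __name__ == "__main__":
    print(expected_item_count())
    print(concordance_by_emotion(toy_trials))
    print(dict(confusion_matrix(toy_trials)["fear"]))
```

Grouping the same `(intended, perceived)` pairs by speaker or by sentence instead of by emotion would give the kind of per-speaker and per-sentence variability check the abstract mentions.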

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2593/11289252/535d0145992b/13428_2023_2270_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2593/11289252/7bb28d19dc61/13428_2023_2270_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2593/11289252/59e6f87845b7/13428_2023_2270_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2593/11289252/741a25bafea4/13428_2023_2270_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2593/11289252/d7d058678c6e/13428_2023_2270_Fig5_HTML.jpg
