The Emotion Probe: On the Universality of Cross-Linguistic and Cross-Gender Speech Emotion Recognition via Machine Learning.

Affiliations

Department of Electronic Engineering, University of Rome Tor Vergata, 00133 Rome, Italy.

Institute of Computational Perception, Johannes Kepler University, 4040 Linz, Austria.

Publication

Sensors (Basel). 2022 Mar 23;22(7):2461. doi: 10.3390/s22072461.

DOI:10.3390/s22072461
PMID:35408076
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9003467/
Abstract

Machine Learning (ML) algorithms within a human-computer framework are the leading force in speech emotion recognition (SER). However, few studies explore cross-corpora aspects of SER; this work aims to explore the feasibility and characteristics of a cross-linguistic, cross-gender SER. Three ML classifiers (SVM, Naïve Bayes and MLP) are applied to acoustic features, obtained through a procedure based on Kononenko's discretization and correlation-based feature selection. The system encompasses five emotions (disgust, fear, happiness, anger and sadness), using the Emofilm database, comprised of short clips of English movies and the respective Italian and Spanish dubbed versions, for a total of 1115 annotated utterances. The results see MLP as the most effective classifier, with accuracies higher than 90% for single-language approaches, while the cross-language classifier still yields accuracies higher than 80%. The results show cross-gender tasks to be more difficult than those involving two languages, suggesting greater differences between emotions expressed by male versus female subjects than between different languages. Four feature domains, namely, RASTA, F0, MFCC and spectral energy, are algorithmically assessed as the most effective, refining existing literature and approaches based on standard sets. To our knowledge, this is one of the first studies encompassing cross-gender and cross-linguistic assessments on SER.
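The feature-selection step the abstract describes (Kononenko's discretization followed by correlation-based feature selection) can be illustrated with a simplified stand-in: ranking acoustic features by the absolute Pearson correlation between each feature and the emotion label. This is a minimal sketch, not the authors' implementation; the feature names, toy values, and the `rank_features` helper are hypothetical, and true CFS also penalizes inter-feature redundancy, which this sketch omits.

```python
# Illustrative sketch (not the paper's code): rank acoustic features by
# absolute Pearson correlation with the emotion label, the core intuition
# behind correlation-based feature selection. All data below is hypothetical.
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def rank_features(feature_matrix, labels, k):
    """Return indices of the k features most correlated with the labels.

    feature_matrix: one row per utterance, one column per acoustic feature
    labels: one numeric emotion label per utterance
    """
    n_features = len(feature_matrix[0])
    columns = [[row[j] for row in feature_matrix] for j in range(n_features)]
    scores = [abs(pearson(col, labels)) for col in columns]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]

# Toy data: 6 utterances, 3 features (say, F0 mean, an MFCC, spectral energy),
# binary label (0 = sadness, 1 = anger). Feature 0 tracks the label closely.
X = [[0.10, 5.0, 2.0],
     [0.20, 4.0, 9.0],
     [0.15, 6.0, 4.0],
     [0.90, 5.5, 3.0],
     [0.85, 4.5, 8.0],
     [0.95, 5.2, 5.0]]
y = [0, 0, 0, 1, 1, 1]
print(rank_features(X, y, 2))  # feature 0 ranks first
```

In the paper's actual pipeline, the selected features would then feed the SVM, Naïve Bayes, or MLP classifiers; the reported result that RASTA, F0, MFCC, and spectral-energy domains dominate is a statement about which feature columns survive this kind of relevance ranking.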


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9685/9003467/8080240fb8b3/sensors-22-02461-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9685/9003467/87861631e6f6/sensors-22-02461-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9685/9003467/85a314c06284/sensors-22-02461-g003.jpg

Similar articles

1
The Emotion Probe: On the Universality of Cross-Linguistic and Cross-Gender Speech Emotion Recognition via Machine Learning.
Sensors (Basel). 2022 Mar 23;22(7):2461. doi: 10.3390/s22072461.
2
Deep-Net: A Lightweight CNN-Based Speech Emotion Recognition System Using Deep Frequency Features.
Sensors (Basel). 2020 Sep 12;20(18):5212. doi: 10.3390/s20185212.
3
A Comparison of Machine Learning Algorithms and Feature Sets for Automatic Vocal Emotion Recognition in Speech.
Sensors (Basel). 2022 Oct 6;22(19):7561. doi: 10.3390/s22197561.
4
An Urdu speech corpus for emotion recognition.
PeerJ Comput Sci. 2022 May 9;8:e954. doi: 10.7717/peerj-cs.954. eCollection 2022.
5
A comprehensive study on bilingual and multilingual speech emotion recognition using a two-pass classification scheme.
PLoS One. 2019 Aug 15;14(8):e0220386. doi: 10.1371/journal.pone.0220386. eCollection 2019.
6
Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization.
Comput Intell Neurosci. 2023 Oct 11;2023:5765760. doi: 10.1155/2023/5765760. eCollection 2023.
7
Respiration Based Non-Invasive Approach for Emotion Recognition Using Impulse Radio Ultra Wide Band Radar and Machine Learning.
Sensors (Basel). 2021 Dec 13;21(24):8336. doi: 10.3390/s21248336.
8
Feature selection for speech emotion recognition in Spanish and Basque: on the use of machine learning to improve human-computer interaction.
PLoS One. 2014 Oct 3;9(10):e108975. doi: 10.1371/journal.pone.0108975. eCollection 2014.
9
A Cross-Linguistic Validation of the Test for Rating Emotions in Speech: Acoustic Analyses of Emotional Sentences in English, German, and Hebrew.
J Speech Lang Hear Res. 2022 Mar 8;65(3):991-1000. doi: 10.1044/2021_JSLHR-21-00205. Epub 2022 Feb 16.
10
Speech emotion recognition using machine learning techniques: Feature extraction and comparison of convolutional neural network and random forest.
PLoS One. 2023 Nov 21;18(11):e0291500. doi: 10.1371/journal.pone.0291500. eCollection 2023.

Cited by

1
Facial expression recognition (FER) survey: a vision, architectural elements, and future directions.
PeerJ Comput Sci. 2024 Jun 3;10:e2024. doi: 10.7717/peerj-cs.2024. eCollection 2024.
2
Speech emotion classification using attention based network and regularized feature selection.
Sci Rep. 2023 Jul 25;13(1):11990. doi: 10.1038/s41598-023-38868-2.
3
High-Level CNN and Machine Learning Methods for Speaker Recognition.
Sensors (Basel). 2023 Mar 25;23(7):3461. doi: 10.3390/s23073461.
4
Acoustic Analysis of Speech for Screening for Suicide Risk: Machine Learning Classifiers for Between- and Within-Person Evaluation of Suicidality.
J Med Internet Res. 2023 Mar 23;25:e45456. doi: 10.2196/45456.
5
Artificial Intelligence-Based Voice Assessment of Patients with Parkinson's Disease Off and On Treatment: Machine vs. Deep-Learning Comparison.
Sensors (Basel). 2023 Feb 18;23(4):2293. doi: 10.3390/s23042293.

References

1
Machine Learning-based Voice Assessment for the Detection of Positive and Recovered COVID-19 Patients.
J Voice. 2024 May;38(3):796.e1-796.e13. doi: 10.1016/j.jvoice.2021.11.004. Epub 2021 Nov 26.
2
On the Speech Properties and Feature Extraction Methods in Speech Emotion Recognition.
Sensors (Basel). 2021 Mar 8;21(5):1888. doi: 10.3390/s21051888.
3
Voice Analysis with Machine Learning: One Step Closer to an Objective Diagnosis of Essential Tremor.
Mov Disord. 2021 Jun;36(6):1401-1410. doi: 10.1002/mds.28508. Epub 2021 Feb 2.
4
Worldwide Healthy Adult Voice Baseline Parameters: A Comprehensive Review.
J Voice. 2022 Sep;36(5):637-649. doi: 10.1016/j.jvoice.2020.08.028. Epub 2020 Oct 8.
5
Voice analysis in adductor spasmodic dysphonia: Objective diagnosis and response to botulinum toxin.
Parkinsonism Relat Disord. 2020 Apr;73:23-30. doi: 10.1016/j.parkreldis.2020.03.012. Epub 2020 Mar 19.
6
Random Deep Belief Networks for Recognizing Emotions from Speech Signals.
Comput Intell Neurosci. 2017;2017:1945630. doi: 10.1155/2017/1945630. Epub 2017 Mar 5.
7
Acoustic profiles in vocal emotion expression.
J Pers Soc Psychol. 1996 Mar;70(3):614-36. doi: 10.1037//0022-3514.70.3.614.
8
Emotions and speech: some acoustical correlates.
J Acoust Soc Am. 1972 Oct;52(4):1238-50. doi: 10.1121/1.1913238.
9
Measurement of pitch by subharmonic summation.
J Acoust Soc Am. 1988 Jan;83(1):257-64. doi: 10.1121/1.396427.
10
Irrelevant thoughts, emotional mood states, and cognitive task performance.
Mem Cognit. 1991 Sep;19(5):507-13. doi: 10.3758/bf03199574.