Advanced differential evolution for gender-aware English speech emotion recognition.

Affiliations

Fanli Business School, Nanyang Institute of Technology, Nanyang, 473004, China.

School of Computer and Software, Nanyang Institute of Technology, Nanyang, 473004, China.

Publication

Sci Rep. 2024 Jul 31;14(1):17696. doi: 10.1038/s41598-024-68864-z.

DOI: 10.1038/s41598-024-68864-z
PMID: 39085418
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11291894/
Abstract

Speech emotion recognition (SER) technology involves feature extraction and prediction models. However, recognition efficiency tends to decrease because of gender differences and the large number of extracted features. Consequently, this paper introduces a SER system based on gender. First, gender and emotion features are extracted from speech signals to develop gender recognition and emotion classification models. Second, according to gender differences, distinct emotion recognition models are established for male and female speakers. The gender of speakers is determined before executing the corresponding emotion model. Third, the accuracy of these emotion models is enhanced by utilizing an advanced differential evolution algorithm (ADE) to select optimal features. ADE incorporates new difference vectors, mutation operators, and position learning, which effectively balance global and local searches. A new position repairing method is proposed to address gender differences. Finally, experiments on four English datasets demonstrate that ADE is superior to comparison algorithms in recognition accuracy, recall, precision, F1-score, the number of used features and execution time. The findings highlight the significance of gender in refining emotion models, while mel-frequency cepstral coefficients are important factors in gender differences.

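The DE loop the abstract describes — evolving candidate feature subsets via difference vectors, mutation, crossover, and greedy selection — can be sketched as a plain binary DE/rand/1/bin feature selector. This is a minimal illustration under stated assumptions, not the paper's ADE: the mean-separation fitness proxy, the parameter values, and every function name here are illustrative inventions.

```python
import numpy as np

rng = np.random.default_rng(0)

def binarize(v):
    # sigmoid(v) > 0.5 is equivalent to v > 0
    return (v > 0).astype(int)

def fitness(mask, X, y):
    """Toy objective standing in for classifier accuracy: reward
    between-class mean separation on the selected features, and
    penalize feature count (the paper also minimizes features used)."""
    if mask.sum() == 0:
        return -np.inf
    Xs = X[:, mask.astype(bool)]
    sep = np.linalg.norm(Xs[y == 0].mean(axis=0) - Xs[y == 1].mean(axis=0))
    return sep - 0.01 * mask.sum()

def binary_de(X, y, pop_size=20, gens=50, F=0.5, CR=0.9):
    """Classic DE/rand/1/bin over continuous vectors, binarized into
    feature masks -- a baseline sketch of DE-based feature selection."""
    n_feat = X.shape[1]
    pop = rng.uniform(-1.0, 1.0, (pop_size, n_feat))
    masks = binarize(pop)
    fits = np.array([fitness(m, X, y) for m in masks])
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = rng.choice([j for j in range(pop_size) if j != i],
                                 3, replace=False)
            mutant = pop[a] + F * (pop[b] - pop[c])   # difference-vector mutation
            cross = rng.random(n_feat) < CR
            cross[rng.integers(n_feat)] = True        # ensure one gene crosses over
            trial = np.where(cross, mutant, pop[i])
            trial_mask = binarize(trial)
            f = fitness(trial_mask, X, y)
            if f > fits[i]:                           # greedy one-to-one selection
                pop[i], masks[i], fits[i] = trial, trial_mask, f
    best = int(fits.argmax())
    return masks[best], fits[best]
```

The paper's ADE goes beyond this baseline — new difference vectors, mutation operators, position learning for the global/local search balance, and a gender-specific position-repair method — none of which are reproduced here.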

Figures (PMC11291894):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf0b/11291894/d4b120cfdfbb/41598_2024_68864_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf0b/11291894/89b50d2e4cb7/41598_2024_68864_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf0b/11291894/3d16242257b8/41598_2024_68864_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf0b/11291894/a9ed7ac074aa/41598_2024_68864_Figa_HTML.jpg

Similar articles

1. Advanced differential evolution for gender-aware English speech emotion recognition.
   Sci Rep. 2024 Jul 31;14(1):17696. doi: 10.1038/s41598-024-68864-z.
2. Gender-Driven English Speech Emotion Recognition with Genetic Algorithm.
   Biomimetics (Basel). 2024 Jun 14;9(6):360. doi: 10.3390/biomimetics9060360.
3. An enhanced speech emotion recognition using vision transformer.
   Sci Rep. 2024 Jun 7;14(1):13126. doi: 10.1038/s41598-024-63776-4.
4. A Deep Learning Method Using Gender-Specific Features for Emotion Recognition.
   Sensors (Basel). 2023 Jan 25;23(3):1355. doi: 10.3390/s23031355.
5. Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization.
   Comput Intell Neurosci. 2023 Oct 11;2023:5765760. doi: 10.1155/2023/5765760. eCollection 2023.
6. IoT-Enabled WBAN and Machine Learning for Speech Emotion Recognition in Patients.
   Sensors (Basel). 2023 Mar 8;23(6):2948. doi: 10.3390/s23062948.
7. MelTrans: Mel-Spectrogram Relationship-Learning for Speech Emotion Recognition via Transformers.
   Sensors (Basel). 2024 Aug 25;24(17):5506. doi: 10.3390/s24175506.
8. A comprehensive study on bilingual and multilingual speech emotion recognition using a two-pass classification scheme.
   PLoS One. 2019 Aug 15;14(8):e0220386. doi: 10.1371/journal.pone.0220386. eCollection 2019.
9. Emotion recognition for human-computer interaction using high-level descriptors.
   Sci Rep. 2024 May 27;14(1):12122. doi: 10.1038/s41598-024-59294-y.
10. Cross-corpus speech emotion recognition with transformers: Leveraging handcrafted features and data augmentation.
   Comput Biol Med. 2024 Sep;179:108841. doi: 10.1016/j.compbiomed.2024.108841. Epub 2024 Jul 12.

Cited by

1. Forecasting Renewable energy and electricity consumption using evolutionary hyperheuristic algorithm.
   Sci Rep. 2025 Jan 20;15(1):2565. doi: 10.1038/s41598-025-87013-8.

References

1. Speech emotion analysis using convolutional neural network (CNN) and gamma classifier-based error correcting output codes (ECOC).
   Sci Rep. 2023 Nov 21;13(1):20398. doi: 10.1038/s41598-023-47118-4.
2. Speech emotion classification using attention based network and regularized feature selection.
   Sci Rep. 2023 Jul 25;13(1):11990. doi: 10.1038/s41598-023-38868-2.
3. Pupil dilation reflects the dynamic integration of audiovisual emotional speech.
   Sci Rep. 2023 Apr 4;13(1):5507. doi: 10.1038/s41598-023-32133-2.
4. A Deep Learning Method Using Gender-Specific Features for Emotion Recognition.
   Sensors (Basel). 2023 Jan 25;23(3):1355. doi: 10.3390/s23031355.
5. Acoustic speech features in social comparison: how stress impacts the way you sound.
   Sci Rep. 2022 Dec 20;12(1):22022. doi: 10.1038/s41598-022-26375-9.
6. A survey on binary metaheuristic algorithms and their engineering applications.
   Artif Intell Rev. 2023;56(7):6101-6167. doi: 10.1007/s10462-022-10328-9. Epub 2022 Nov 21.
7. Hemodynamic responses to emotional speech in two-month-old infants imaged using diffuse optical tomography.
   Sci Rep. 2019 Mar 18;9(1):4745. doi: 10.1038/s41598-019-39993-7.
8. Gender Differences in the Recognition of Vocal Emotions.
   Front Psychol. 2018 Jun 5;9:882. doi: 10.3389/fpsyg.2018.00882. eCollection 2018.
9. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English.
   PLoS One. 2018 May 16;13(5):e0196391. doi: 10.1371/journal.pone.0196391. eCollection 2018.
10. CREMA-D: Crowd-sourced Emotional Multimodal Actors Dataset.
   IEEE Trans Affect Comput. 2014 Oct-Dec;5(4):377-390. doi: 10.1109/TAFFC.2014.2336244.