Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network.

Affiliations

ICT Ganpat University, Ahmedabad, Gujarat, India.

Computer Science and Engineering, Jagran Lakecity University, Bhopal, India.

Publication information

J Healthc Eng. 2022 Feb 27;2022:8472947. doi: 10.1155/2022/8472947. eCollection 2022.

DOI: 10.1155/2022/8472947
PMID: 35265307
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8898841/
Abstract

Every person has an emotional response to the things that concern them. A customer's emotion can help a customer representative understand that customer's requirements, so speech emotion recognition plays an important role in human interaction. To support this with an intelligent system, we design a convolutional neural network (CNN) based model that classifies emotions into categories such as positive and negative, or into more specific classes. In this paper, we use the audio recordings of the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). Log Mel spectrograms and Mel-frequency cepstral coefficients (MFCCs) are used to extract features from the raw audio files. Features of this kind have been used to classify emotions with techniques such as Long Short-Term Memory (LSTM) networks, CNNs, hidden Markov models (HMMs), and deep neural networks (DNNs). We divide the classification task into three sections, for both male and female speakers. In the first section, we divide the emotions into two classes, positive and negative. In the second section, we divide the emotions into three classes: positive, negative, and neutral. In the third section, we divide the emotions into eight classes: happy, sad, angry, fearful, surprised, disgust, calm, and neutral. For all three sections, we propose a model that contains eight consecutive 2D convolutional layers. The proposed model classifies these categories better than previously reported models, which allows the emotion of a consumer to be identified more reliably.
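The three labelling schemes described in the abstract (two, three, and eight classes, split by speaker gender) can be sketched from RAVDESS file names alone. The sketch below relies on the dataset's published naming convention, in which the third field encodes the emotion and the seventh the actor (odd numbers are male, even numbers female). The grouping of emotions into positive, negative, and neutral, the choice to drop neutral clips in the two-class scheme, and the function names `parse_ravdess_name` and `label_for` are assumptions made for illustration; the abstract does not state the authors' mapping.

```python
from pathlib import PurePath

# Emotion codes from the RAVDESS file-naming convention (third field).
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

# ASSUMPTION: illustrative polarity grouping, not taken from the paper.
POLARITY = {
    "neutral": "neutral",
    "calm": "positive", "happy": "positive", "surprised": "positive",
    "sad": "negative", "angry": "negative",
    "fearful": "negative", "disgust": "negative",
}


def parse_ravdess_name(filename):
    """Return (emotion, gender) decoded from a RAVDESS file name."""
    fields = PurePath(filename).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"not a RAVDESS file name: {filename!r}")
    emotion = EMOTIONS[fields[2]]
    # Odd-numbered actors are male, even-numbered actors are female.
    gender = "male" if int(fields[6]) % 2 == 1 else "female"
    return emotion, gender


def label_for(filename, scheme):
    """Build a gender-specific label for the 2-, 3-, or 8-class scheme.

    Returns None when a clip has no label under the chosen scheme
    (ASSUMPTION: neutral clips are dropped in the two-class scheme).
    """
    emotion, gender = parse_ravdess_name(filename)
    if scheme == 8:
        label = emotion
    elif scheme == 3:
        label = POLARITY[emotion]
    elif scheme == 2:
        label = POLARITY[emotion]
        if label == "neutral":
            return None
    else:
        raise ValueError("scheme must be 2, 3, or 8")
    return f"{gender}_{label}"
```

For example, `label_for("03-01-06-01-02-01-12.wav", 8)` returns `"female_fearful"`, and the same file under the three-class scheme returns `"female_negative"` with the grouping assumed here.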

Cited by

1
An enhanced speech emotion recognition using vision transformer.
Sci Rep. 2024 Jun 7;14(1):13126. doi: 10.1038/s41598-024-63776-4.
2
Retracted: Detection of Emotion of Speech for RAVDESS Audio Using Hybrid Convolution Neural Network.
J Healthc Eng. 2023 Nov 1;2023:9872030. doi: 10.1155/2023/9872030. eCollection 2023.
3
Speech emotion classification using attention based network and regularized feature selection.
