Human-Computer Interaction with Detection of Speaker Emotions Using Convolution Neural Networks.

Affiliations

Department of Computer Science and Engineering, College of Applied Studies and Community Services, King Saud University, P.O. BOX 22459, Riyadh 11495, Saudi Arabia.

College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia.

Publication information

Comput Intell Neurosci. 2022 Mar 31;2022:7463091. doi: 10.1155/2022/7463091. eCollection 2022.

DOI:10.1155/2022/7463091
PMID:35401731
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8989588/
Abstract

Emotions play an essential role in human relationships, and many real-time applications rely on interpreting the speaker's emotion from their words. Speech emotion recognition (SER) modules aid human-computer interface (HCI) applications, but they are challenging to implement because of the lack of balanced data for training and clarity about which features are sufficient for categorization. This research discusses the impact of the classification approach, identifying the most appropriate combination of features and data augmentation on speech emotion detection accuracy. Selection of the correct combination of handcrafted features with the classifier plays an integral part in reducing computation complexity. The suggested classification model, a 1D convolutional neural network (1D CNN), outperforms traditional machine learning approaches in classification. Unlike most earlier studies, which examined emotions primarily through a single language lens, our analysis looks at numerous language data sets. With the most discriminating features and data augmentation, our technique achieves 97.09%, 96.44%, and 83.33% accuracy for the BAVED, ANAD, and SAVEE data sets, respectively.
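The abstract names a 1D CNN applied to handcrafted features with data augmentation, but gives no architecture, feature list, or augmentation method. The following is only a minimal sketch of those two ideas under stated assumptions: the feature vector, the filter weights, the noise level, and all function names are hypothetical and are not taken from the paper.

```python
import random


def add_noise(signal, noise_level=0.005, seed=0):
    """Return a copy of `signal` with small Gaussian noise added.

    Additive noise is one common audio augmentation; the abstract does
    not say which augmentations the authors used (assumption).
    """
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, noise_level) for x in signal]


def conv1d(features, kernel):
    """Valid-mode 1D cross-correlation, the core operation of a 1D CNN layer."""
    k = len(kernel)
    return [
        sum(features[i + j] * kernel[j] for j in range(k))
        for i in range(len(features) - k + 1)
    ]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


# Hypothetical handcrafted feature vector for one utterance (not real data).
features = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
# Hypothetical, untrained filter; a real model would learn these weights.
kernel = [1.0, 0.0, -1.0]

# One conv -> ReLU -> pool stage applied to the feature vector.
activations = max_pool1d(relu(conv1d(features, kernel)))

# An augmented copy that could be added to the training set.
augmented = add_noise(features)
```

In a trained model, several such stages would be stacked and followed by a classifier over the emotion labels; this sketch shows a single stage with fixed weights.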

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fc/8989588/64a7309efc68/CIN2022-7463091.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fc/8989588/e969869fc87f/CIN2022-7463091.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fc/8989588/05852d6af23b/CIN2022-7463091.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fc/8989588/c8e172ee283f/CIN2022-7463091.004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fc/8989588/738afbdcb45c/CIN2022-7463091.005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fc/8989588/35bbc902a9a0/CIN2022-7463091.006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fc/8989588/fdfea3034c87/CIN2022-7463091.007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fc/8989588/bc4f822f016e/CIN2022-7463091.008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fc/8989588/d016360cbfa9/CIN2022-7463091.009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fc/8989588/bc19ac9ff2d1/CIN2022-7463091.010.jpg

Similar articles

1
Human-Computer Interaction with Detection of Speaker Emotions Using Convolution Neural Networks.
Comput Intell Neurosci. 2022 Mar 31;2022:7463091. doi: 10.1155/2022/7463091. eCollection 2022.
2
Human-Computer Interaction with a Real-Time Speech Emotion Recognition with Ensembling Techniques 1D Convolution Neural Network and Attention.
Sensors (Basel). 2023 Jan 26;23(3):1386. doi: 10.3390/s23031386.
3
Impact of Feature Selection Algorithm on Speech Emotion Recognition Using Deep Convolutional Neural Network.
Sensors (Basel). 2020 Oct 23;20(21):6008. doi: 10.3390/s20216008.
4
Cascaded Convolutional Neural Network Architecture for Speech Emotion Recognition in Noisy Conditions.
Sensors (Basel). 2021 Jun 27;21(13):4399. doi: 10.3390/s21134399.
5
Effect on speech emotion classification of a feature selection approach using a convolutional neural network.
PeerJ Comput Sci. 2021 Nov 3;7:e766. doi: 10.7717/peerj-cs.766. eCollection 2021.
6
A CNN-Assisted Enhanced Audio Signal Processing for Speech Emotion Recognition.
Sensors (Basel). 2019 Dec 28;20(1):183. doi: 10.3390/s20010183.
7
Fusing Visual Attention CNN and Bag of Visual Words for Cross-Corpus Speech Emotion Recognition.
Sensors (Basel). 2020 Sep 28;20(19):5559. doi: 10.3390/s20195559.
8
Speech Emotion Recognition Using Convolution Neural Networks and Multi-Head Convolutional Transformer.
Sensors (Basel). 2023 Jul 7;23(13):6212. doi: 10.3390/s23136212.
9
Fusion-ConvBERT: Parallel Convolution and BERT Fusion for Speech Emotion Recognition.
Sensors (Basel). 2020 Nov 23;20(22):6688. doi: 10.3390/s20226688.
10
A Hybrid Time-Distributed Deep Neural Architecture for Speech Emotion Recognition.
Int J Neural Syst. 2022 Jun;32(6):2250024. doi: 10.1142/S0129065722500241. Epub 2022 May 12.

Cited by

1
From Neural Networks to Emotional Networks: A Systematic Review of EEG-Based Emotion Recognition in Cognitive Neuroscience and Real-World Applications.
Brain Sci. 2025 Feb 20;15(3):220. doi: 10.3390/brainsci15030220.
2
Hybrid CNN-LSTM model with efficient hyperparameter tuning for prediction of Parkinson's disease.
Sci Rep. 2023 Sep 5;13(1):14605. doi: 10.1038/s41598-023-41314-y.
3
A Multimodal Feature Fusion Framework for Sleep-Deprived Fatigue Detection to Prevent Accidents.
Sensors (Basel). 2023 Apr 20;23(8):4129. doi: 10.3390/s23084129.
4
A Deep Learning Method Using Gender-Specific Features for Emotion Recognition.
Sensors (Basel). 2023 Jan 25;23(3):1355. doi: 10.3390/s23031355.
5
Region-Based Segmentation and Classification for Ovarian Cancer Detection Using Convolution Neural Network.
Contrast Media Mol Imaging. 2022 Nov 19;2022:5968939. doi: 10.1155/2022/5968939. eCollection 2022.
6
Analysis of Smart Lung Tumour Detector and Stage Classifier Using Deep Learning Techniques with Internet of Things.
Comput Intell Neurosci. 2022 Sep 13;2022:4608145. doi: 10.1155/2022/4608145. eCollection 2022.
7
Text-Based Emotion Recognition Using Deep Learning Approach.
Comput Intell Neurosci. 2022 Aug 23;2022:2645381. doi: 10.1155/2022/2645381. eCollection 2022.

References

1
Combining a parallel 2D CNN with a self-attention Dilated Residual Network for CTC-based discrete speech emotion recognition.
Neural Netw. 2021 Sep;141:52-60. doi: 10.1016/j.neunet.2021.03.013. Epub 2021 Mar 23.
2
On the Speech Properties and Feature Extraction Methods in Speech Emotion Recognition.
Sensors (Basel). 2021 Mar 8;21(5):1888. doi: 10.3390/s21051888.
3
Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition.
Front Physiol. 2021 Mar 2;12:643202. doi: 10.3389/fphys.2021.643202. eCollection 2021.
4
Fusion-ConvBERT: Parallel Convolution and BERT Fusion for Speech Emotion Recognition.
Sensors (Basel). 2020 Nov 23;20(22):6688. doi: 10.3390/s20226688.
5
Impact of Feature Selection Algorithm on Speech Emotion Recognition Using Deep Convolutional Neural Network.
Sensors (Basel). 2020 Oct 23;20(21):6008. doi: 10.3390/s20216008.
6
Fusing Visual Attention CNN and Bag of Visual Words for Cross-Corpus Speech Emotion Recognition.
Sensors (Basel). 2020 Sep 28;20(19):5559. doi: 10.3390/s20195559.
7
An artificial intelligence-based EEG algorithm for detection of epileptiform EEG discharges: Validation against the diagnostic gold standard.
Clin Neurophysiol. 2020 Jun;131(6):1174-1179. doi: 10.1016/j.clinph.2020.02.032. Epub 2020 Apr 2.
8
Optimal Representation of Anuran Call Spectrum in Environmental Monitoring Systems Using Wireless Sensor Networks.
Sensors (Basel). 2018 Jun 3;18(6):1803. doi: 10.3390/s18061803.
9
Evaluating deep learning architectures for Speech Emotion Recognition.
Neural Netw. 2017 Aug;92:60-68. doi: 10.1016/j.neunet.2017.02.013. Epub 2017 Mar 21.
10
A survey of affect recognition methods: audio, visual, and spontaneous expressions.
IEEE Trans Pattern Anal Mach Intell. 2009 Jan;31(1):39-58. doi: 10.1109/TPAMI.2008.52.