Effect on speech emotion classification of a feature selection approach using a convolutional neural network.

Author information

Amjad Ammar, Khan Lal, Chang Hsien-Tsung

Affiliations

Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan, Taiwan.

Department of Physical Medicine and Rehabilitation, Chang Gung Memorial Hospital, Taoyuan, Taiwan.

Publication information

PeerJ Comput Sci. 2021 Nov 3;7:e766. doi: 10.7717/peerj-cs.766. eCollection 2021.

DOI:10.7717/peerj-cs.766

PMID:34805511

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8576551/

Abstract

Speech emotion recognition (SER) is challenging because it is not clear which features are effective for classification. Emotion-related features are extracted from speech signals for emotion classification, and handcrafted features are the ones mainly used to identify emotion from audio signals. However, these features are not sufficient to correctly identify the emotional state of the speaker. This work investigates the advantages of a deep convolutional neural network (DCNN): a pretrained framework is used to extract features from speech emotion databases, and a feature selection (FS) approach is then adopted to find the most discriminative and important features for SER. For emotion classification, we use random forest (RF), decision tree (DT), support vector machine (SVM), multilayer perceptron (MLP), and k-nearest neighbors (KNN) classifiers to classify seven emotions. All experiments are performed on four publicly accessible databases. With feature selection, our method obtains accuracies of 92.02%, 88.77%, 93.61%, and 77.23% on Emo-DB, SAVEE, RAVDESS, and IEMOCAP, respectively, for speaker-dependent (SD) recognition. Furthermore, compared to current handcrafted feature-based SER methods, the proposed method shows the best results for speaker-independent SER. For Emo-DB, all classifiers attain an accuracy of more than 80% with or without the feature selection technique.
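The extract, select, classify pipeline described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption made for illustration: the synthetic vectors stand in for features from a pretrained DCNN, a Fisher-score filter stands in for the feature selection step (the abstract does not name the algorithm), and a small k-nearest-neighbors classifier stands in for the five classifiers. It is not the authors' implementation and does not reproduce their results.

```python
import random
from collections import Counter

def make_data(n_per_class=40, n_classes=3, n_informative=4, n_noise=16, seed=0):
    """Synthetic stand-in for DCNN features: informative dims first, then noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(n_classes):
        for _ in range(n_per_class):
            informative = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
            noise = [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y

def fisher_scores(X, y):
    """Per-feature ratio of between-class to within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        overall = sum(col) / len(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, lab in zip(col, y) if lab == c]
            mean_c = sum(vals) / len(vals)
            between += len(vals) * (mean_c - overall) ** 2
            within += sum((v - mean_c) ** 2 for v in vals)
        scores.append(between / within if within > 0 else 0.0)
    return scores

def select_top_k(scores, k):
    """Indices of the k highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def knn_predict(X_train, y_train, x, k=5):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), lab)
        for row, lab in zip(X_train, y_train)
    )
    return Counter(lab for _, lab in dists[:k]).most_common(1)[0][0]

def accuracy(X_train, y_train, X_test, y_test, cols):
    """KNN test accuracy using only the feature columns listed in cols."""
    Xtr = [[row[j] for j in cols] for row in X_train]
    hits = 0
    for row, target in zip(X_test, y_test):
        hits += knn_predict(Xtr, y_train, [row[j] for j in cols]) == target
    return hits / len(y_test)

# Random 75/25 split; scores are computed on the training part only,
# so the selection step never sees the test labels.
X, y = make_data()
idx = list(range(len(X)))
random.Random(1).shuffle(idx)
train, test = idx[:90], idx[90:]
X_train, y_train = [X[i] for i in train], [y[i] for i in train]
X_test, y_test = [X[i] for i in test], [y[i] for i in test]

selected = select_top_k(fisher_scores(X_train, y_train), k=4)
acc_all = accuracy(X_train, y_train, X_test, y_test, list(range(len(X[0]))))
acc_selected = accuracy(X_train, y_train, X_test, y_test, selected)
print("selected features:", sorted(selected))
print("accuracy without FS:", acc_all)
print("accuracy with FS:", acc_selected)
```

The random split above mixes samples from all sources, which loosely corresponds to the speaker-dependent setting. A speaker-independent evaluation would instead hold out every utterance of the test speakers, which requires speaker labels that this synthetic data does not have.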
